As an internationalized solution, the EchoSystem Server (ESS) relies on its database using the UTF-8 character set. Some default installations of MySQL 5.x, however, use the Latin-1 character set. This article discusses one approach to convert an existing MySQL database from the Latin-1 character set to the UTF-8 character set.
A detailed discussion of character sets and encoding is beyond the scope of this article. For background on these topics, please see the following article: http://en.wikipedia.org/wiki/Character_encoding. For detailed information on MySQL character set support, please see: http://dev.mysql.com/doc/refman/5.5/en/charset.html.
There are many intricacies involved with converting a MySQL database from one character set to another. Unfortunately this article cannot hope to cover all these aspects. Customers using an external MySQL database and a language whose alphabet differs significantly from English (which includes Russian, Greek, Hebrew, Arabic, and almost all Asian languages) should contact Echo360 Technical Support for guidance on converting their database.
The most straightforward method to convert the character set on an existing MySQL database is to export it, search for all occurrences of
"CHARSET=latin1", replace them with
"CHARSET=utf8", drop the existing database, re-create it, and then re-import the modified dump file to re-populate the database. This article will use the command-line MySQL utilities to accomplish these tasks. The MySQL utilities are the preferred method for dealing with character set issues, since graphical MySQL clients may mask or misrepresent character set problems.
- Stop the ESS service.
- Perform a complete backup of the ESS database using the
#> mysqldump -u [username] -p [database] > essdata_latin1.sql
- Search for all occurrences of
"CHARSET=latin1"in the dump file and replace them with
#> replace "CHARSET=latin1" "CHARSET=utf8" "SET NAMES latin1" "SET NAMES utf8" < essdata_latin1.sql > essdata_utf8.sql
- Connect to the MySQL server and drop the existing ESS database:
#> mysql -u [username] -p mysql> DROP DATABASE [database]; Query OK, 0 rows affected (0.00 sec)
- Re-create the database:
mysql> CREATE DATABASE [database]; Query OK, 1 row affected (0.00 sec)
- Explicitly define the default character set for the schema:
mysql> ALTER DATABASE [database] DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci; Query OK, 1 row affected (0.00 sec)
- Connect to the database and populate it with the converted data:
mysql> CONNECT [database]; Connection id: 48 Current database: [database] mysql> SOURCE essdata_utf8.sql Query OK, 0 rows affected (0.00 sec) ...
- Verify that the output of the
SOURCEcommand contains no errors.
- Restart the ESS service.
- The Wikipedia overview of character sets and encoding: http://en.wikipedia.org/wiki/Character_encoding.
- MySQL character set support: http://dev.mysql.com/doc/refman/5.1/en/charset.html.
- The excellent article, "Getting out of MySQL Character Set Hell," written by Stephen Balukoff, CTO, Blue Box Group, LLC. provides a detailed discussion of the issues surrounding the conversion of a MySQL database to the UTF-8 character set: http://www.blueboxgrp.com/news/2009/07/mysql_encoding.