Tracking down a character encoding issue.

I have an issue where in our C5 instance some non-ASCII characters are being translated improperly, yet I don't see this issue in a clean install.

If I insert a ♠ into the HTML of a content block via the TinyMCE HTML editor, it is translated to the corresponding UTF-8 character, i.e. it becomes a 'spades' character when I return to the WYSIWYG editor. The character still renders as expected.

If I save these changes from TinyMCE, the character is translated to a ? (a question mark).

If I do all this on a clean C5 install, I don't get the issue; the character remains the expected UTF-8 character.

If I add the themes, packages and customisations from our production C5 instance to the clean C5 install I still don't get the issue. Both installs are running locally on the same stack.

Is there some setting in the database that could be causing this encoding difference?

Thanks, Marcus

mrowell
 
mrowell replied:
I tracked down the issue to the character encoding of the database.
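
To illustrate why a non-UTF-8 database encoding would turn ♠ into a question mark, here is a rough Python sketch. It only simulates the effect in Python; it is not MySQL or C5 code, the helper name store_as is made up for this example, and it assumes the affected database was using a latin1-style charset, which has no code point for ♠.

```python
def store_as(text: str, charset: str) -> str:
    """Simulate saving text in a column of the given charset and reading it back.

    Hypothetical helper for illustration only; characters the charset
    cannot represent are replaced with '?', similar to what was observed.
    """
    raw = text.encode(charset, errors="replace")
    return raw.decode(charset)


if __name__ == "__main__":
    print(store_as("♠", "latin-1"))  # the spade is lost on the round trip
    print(store_as("♠", "utf-8"))    # the spade survives
```

The same character survives a UTF-8 round trip but not a latin1 one, which would match the difference between the clean install and our instance.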

There is a howto to fix this (see http://www.concrete5.org/documentation/how-tos/editors/fix-characte... ).

My research suggests that in general such a migration should also involve a character translation (see http://ronaldbradford.com/blog/migrating-mysql-latin1-to-utf8-the-p... ), which the howto does not do. Is this necessary for a C5 database?
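
For context, one common reason a translation step is needed is that UTF-8 bytes were written into latin1 columns, so simply relabelling the column as UTF-8 leaves mangled text behind. The Python sketch below shows that situation and one way to reverse it. This is an assumption about the kind of problem involved, not a description of what the linked post or the howto actually does, and the function names are invented for the example. Python's latin-1 codec is used as a stand-in for MySQL's latin1, which differs slightly.

```python
def to_mojibake(text: str) -> str:
    """Show how UTF-8 bytes look when misread as latin1 (hypothetical helper)."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Try to undo the misreading; return the text unchanged if it does not apply."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside latin1 (already proper Unicode)
        # or its bytes are not valid UTF-8 (genuine latin1 text): leave it alone.
        return text


if __name__ == "__main__":
    broken = to_mojibake("♠")
    print(repr(broken))
    print(repair(broken))
```

If the data were genuinely latin1 text rather than mislabelled UTF-8, repair leaves it unchanged, so whether a step like this matters depends on how the rows were originally written.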