C5 encoding

I just can't get the characters to display correctly. I checked the database charsets and collations, and everything seems to be utf8. Here's the interesting thing:
There's a single page: 'single_pages/whatever.php', which prints out some data from the database.
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    </head>
    <body>
    <?php
    // Connection settings (placeholders).
    $host = "localhost";
    $user = "user";
    $dbasename = "name";
    $password = "pass";
    $conn = mysql_connect($host, $user, $password) or die("Unable to connect to SQL server");
    mysql_select_db($dbasename, $conn) or die("Unable to select database");
    // Print every article title, one per line.
    $sql = "SELECT title FROM articles";
    $titles = mysql_query($sql, $conn);
    while ($row = mysql_fetch_assoc($titles)) {
        print $row['title'] . "<br />";
    }
    ?>
    </body>
</html>


Now if I access the page through C5 (example.com/whatever), the characters don't display correctly. But if I go to it directly (example.com/single_pages/whatever.php), everything seems to be fine. Any thoughts?

egnexas
 
jordanlev replied:
This can be very tricky to solve. I would double-check the database: even if you changed the default charset and collation to utf8, tables that were created under a different encoding keep the old one. Changing the defaults only affects newly added tables, so you have to go back and change existing tables manually. Be careful, though, and back up the database first: I'm not sure whether changing a table's encoding actually converts the stored data, or leaves the bytes as they were and then misreads them under the new encoding.
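For what it's worth, MySQL documents `ALTER TABLE ... CONVERT TO CHARACTER SET` as re-encoding the existing column data, whereas changing only a table's default charset does not touch stored values. The manual also warns that if a column is labelled latin1 but already holds UTF-8 bytes, a plain conversion will not give the right result. A minimal sketch that only builds the statements so they can be reviewed before running; the helper name `build_convert_statements` and the `utf8_general_ci` collation are assumptions for illustration, and `articles` is the table from the question:

```php
<?php
/**
 * Build one ALTER TABLE ... CONVERT TO statement per table.
 * Hypothetical helper: nothing is executed against the database here.
 */
function build_convert_statements(array $tables, $charset = 'utf8', $collation = 'utf8_general_ci')
{
    $statements = array();
    foreach ($tables as $table) {
        // Backticks quote the identifier; embedded backticks are doubled.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE $quoted CONVERT TO CHARACTER SET $charset COLLATE $collation;";
    }
    return $statements;
}

// Print the statements for review instead of running them blindly.
foreach (build_convert_statements(array('articles')) as $sql) {
    echo $sql, "\n";
}
```

Take the backup first, then run the printed statements by hand (or through your usual DB tool) and spot-check a few rows with non-ASCII text afterwards.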

Other than that, make sure that your PHP installation is set up to handle Unicode and that the "APP_CHARSET" setting in concrete5 is set properly as well.

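Finally, the encoding itself shouldn't be the limit: UTF-8 can represent every Unicode character, including all Asian scripts, so there's no need to switch to UTF-16. The one catch I know of is that MySQL's utf8 charset stores at most 3 bytes per character, so characters outside the Basic Multilingual Plane (some rare CJK ideographs, emoji) need utf8mb4 instead. This is unlikely to be your problem in my experience; I'm just mentioning it as a possibility.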

Good luck.
surefyre replied:
This is a real pain.

I have charset = utf8 in /application/config/database.php
I have table default charset = utf8
I have db default charset = utf8

Yet SHOW VARIABLES LIKE 'char%'; still shows the db connection as latin1

In the end I also had to execute an explicit SET NAMES 'utf8' at the top of some scripts that were expected to return UTF-8 text fields. Give that a try.
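To see why the connection charset matters even when the tables are utf8, here is a small sketch of what happens to a single character. It assumes the mbstring extension is available, and it only imitates the server-side conversion rather than talking to MySQL: when the connection's result charset is latin1, the server converts utf8 column data to latin1 before sending it, and a page declared as UTF-8 then receives bytes that are not valid UTF-8.

```php
<?php
// "é" as UTF-8 bytes, i.e. what a utf8 column holds.
$stored = "\xC3\xA9";

// Roughly what the server does when the result charset is latin1:
// it converts the text to latin1 before sending it to the client.
$sent = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

echo bin2hex($stored), "\n"; // c3a9
echo bin2hex($sent), "\n";   // e9

// The single byte 0xE9 is not valid UTF-8, so a browser told to expect
// UTF-8 shows a replacement character instead of "é".
var_dump(mb_check_encoding($sent, 'UTF-8')); // bool(false)
```

SET NAMES 'utf8' tells the server to skip that conversion and send the UTF-8 bytes as they are.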

If you're making your own DB connections to MySQL with PDO, your connection statement may also need to look like this:
$db = new PDO('mysql:host=somehost;dbname=db_name;charset=utf8', 'user', 'pass', array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8'"));


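Note how utf8 is specified in two places there. Sigh, it seemed to need it. (PHP versions before 5.3.6 ignore the charset option in the DSN, which probably explains why the SET NAMES init command was needed as well.)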
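For code closer to the original question, which opens its own connection rather than using PDO, the same idea can be sketched with mysqli. The function names `open_utf8_connection` and `connection_charsets` are made up for illustration, and the credentials are whatever your script already uses; treat this as a starting point, not a drop-in fix.

```php
<?php
/**
 * Open a mysqli connection and force it to UTF-8.
 * Hypothetical helper; host, credentials and database are supplied by the caller.
 */
function open_utf8_connection($host, $user, $password, $database)
{
    $conn = new mysqli($host, $user, $password, $database);
    if ($conn->connect_error) {
        throw new RuntimeException('Unable to connect: ' . $conn->connect_error);
    }
    // set_charset() has the same effect as SET NAMES and also tells the
    // client library which charset is in use (this matters for escaping).
    if (!$conn->set_charset('utf8')) {
        throw new RuntimeException('Unable to set charset: ' . $conn->error);
    }
    return $conn;
}

/** Return the character_set_* variables of this connection as name => value. */
function connection_charsets(mysqli $conn)
{
    $charsets = array();
    $result = $conn->query("SHOW VARIABLES LIKE 'character_set_%'");
    while ($row = $result->fetch_assoc()) {
        $charsets[$row['Variable_name']] = $row['Value'];
    }
    $result->free();
    return $charsets;
}
```

Usage would be along the lines of `$conn = open_utf8_connection('localhost', 'user', 'pass', 'name');` followed by `print_r(connection_charsets($conn));`. After the charset is set, character_set_client, character_set_connection and character_set_results should all report utf8 rather than latin1.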