“Imamo Hrvatsku!” – MySQL patch which implements full Croatian ordering in utf8_croatian_ci and ucs2_croatian_ci collations
Great news, great news indeed. A couple of months ago, I started an open initiative to finally add support to MySQL for proper ordering using the Croatian alphabet. We tried doing it on our own, but we needed to rewrite MySQL’s Unicode Collation Algorithm, and for that we really needed help from the MySQL development team. How did we manage to get it? Using the good old “Balkan way” – schnapps, a.k.a. rakija, and black vodka. :)
My mate who was working with me on our initial implementation, Ante ‘Ivoks’ Karamatić (Chief executive at Init), got drunk with Kurt von Finck (Chief Community and Communications Officer for Monty Program Ab) in Dallas last week. Kurt put in a good word with Michael (“Monty”) Widenius (MySQL’s original author and co-founder of MySQL AB), asking him to listen to our cries for help. Monty convinced Alexander Barkov (Lead software developer at Sun Microsystems working on MySQL) to give us a little help with the whole Croatian ordering issue. As a result, the utf8_croatian_ci and ucs2_croatian_ci collations were created and added to MySQL 6.
After a pleasant chat with Monty and Bar, they were good enough to help us with a MySQL 5.1 patch which implements full Croatian ordering in utf8_croatian_ci and ucs2_croatian_ci collations. Woohoo! :)
But the bad news is that it will take a fair amount of time before MySQL 5.6 (or 6.0, for that matter) goes GA, so you will have to wait before you can download a production version of MySQL with “real” Croatian support.
If you really need Croatian support now, you can try patching the MySQL server yourself, as we did.
More details about the patch can be found here:
- Bar’s explanation: http://www.collation-charts.org/articles/croatian.htm
- Bar’s patch for MySQL-5.1: http://www.collation-charts.org/articles/utf8_croatian_ci.diff
Since Alexander Barkov was kind enough to provide a patch for MySQL 5.1, Ante created packages for Ubuntu. He also slightly modified the patch so that it works with MySQL 5.0 (this version needs further testing). If you need this feature, add this PPA to your sources.list: https://edge.launchpad.net/~ivoks/+archive/mysql-hr/.
After you apply the patch, you can try it out using my test database dump. If everything went OK, “use croatian; SET NAMES 'utf8' COLLATE 'utf8_croatian_ci'; select rijec from test_croatian order by rijec;” should produce output like this (switch your browser view to UTF-8).
Any feedback from the Croatian MySQL community is very welcome. Please send your comments to <Alexander.Barkov[at]Sun.COM>. Thanks!
Proof of concept:
mysql> select version();
+-----------------+
| version()       |
+-----------------+
| 5.0.51a-hr1-log |
+-----------------+

mysql> use croatian; SET NAMES 'utf8' COLLATE 'utf8_croatian_ci'; select rijec from test_croatian order by rijec;
+--------------+
| rijec        |
+--------------+
| Aboriđin     |
| Aboriđini    |
| Ante         |
| Branimir     |
| Cipela       |
| Čazma        |
| Ćevapčići    |
| Džak         |
| džak         |
| Džamija      |
| džamija      |
| Đak          |
| đak          |
| Đevđelija    |
| Inat         |
| Init         |
| Inozemstvo   |
| Interes      |
| Injekcija    |
| Ipsilon      |
| Kutina       |
| Livno        |
| Lovor        |
| Ljubav       |
| Ljubljana    |
| Neven        |
| Nivas        |
| Nosorog      |
| Njivice      |
| Onomatopeja  |
| Šišmiš       |
| Zagreb       |
| Žaba         |
+--------------+
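
If you want to sanity-check the ordering above without a patched server, here is a rough Python sketch of the rule the output seems to follow: the digraphs dž, lj and nj count as single letters that sort right after d, l and n, and case is ignored (the "_ci" part). This is only my illustration, not how Bar's patch works internally. The names `CROATIAN_ALPHABET` and `croatian_key` are made up for this example, and letters outside the Croatian alphabet (q, w, x, y) are simply pushed to the end, which is an assumption rather than something the real collation necessarily does.

```python
# Illustrative sketch only: approximates the ordering shown in the
# utf8_croatian_ci output above. It is NOT the MySQL implementation.

# The 30 letters of the Croatian alphabet, in order.
# The digraphs "dž", "lj" and "nj" are treated as single letters.
CROATIAN_ALPHABET = [
    "a", "b", "c", "č", "ć", "d", "dž", "đ", "e", "f",
    "g", "h", "i", "j", "k", "l", "lj", "m", "n", "nj",
    "o", "p", "r", "s", "š", "t", "u", "v", "z", "ž",
]
RANK = {letter: position for position, letter in enumerate(CROATIAN_ALPHABET)}
DIGRAPHS = ("dž", "lj", "nj")


def croatian_key(word):
    """Return a case-insensitive sort key that treats dž, lj, nj as one letter."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            # Two characters, one letter.
            key.append((0, RANK[pair]))
            i += 2
            continue
        char = text[i]
        if char in RANK:
            key.append((0, RANK[char]))
        else:
            # Assumption: anything outside the alphabet sorts after it,
            # ordered by code point.
            key.append((1, ord(char)))
        i += 1
    return key


if __name__ == "__main__":
    words = ["Njivice", "Nosorog", "Ljubav", "Lovor",
             "Džak", "Đak", "Cipela", "Čazma"]
    print(sorted(words, key=croatian_key))
    # ['Cipela', 'Čazma', 'Džak', 'Đak', 'Lovor', 'Ljubav', 'Nosorog', 'Njivice']
```

Note that the sketch treats every "nj", "lj" and "dž" pair as a digraph, which is why "Injekcija" lands after "Interes", just like in the MySQL output. A plain code point sort would instead put "Lovor" after "Ljubav" and push "Čazma" and "Žaba" to the very end.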