utf8mb4 ✅🤜🏽 – Fun with MySQL
This is a tech post. I recently came across a data storage problem that I’d like to share.
As you might know, I earn money by running a web development agency. Years ago, I even used to work as a web programmer. Not so much anymore, but I still like to keep up a bit and learn about interesting things. One of those things is utf8mb4 in MySQL databases.
A few weeks ago I was debugging a weird Emoji problem with my colleague Maxim. He is a great programmer; I used to be an average one at best. Still, it took the two of us some time to find the solution.
Why would the character “✅” get saved into our utf8 enabled MySQL database without a problem, but “🤜🏽” would not?
After a while of changing code and seeing what would happen (you know, standard web developer stuff), I remembered that a good friend of mine, Andreas Reich, had sent over a random “hey, check out this weird MySQL problem” article a while ago. I had skimmed it but had no practical use for this new knowledge at the time.
Now I did. Here’s the solution to MySQL’s encoding problems.
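It turns out that these new skin-color Emojis (and many others as well) need 4 bytes per character in UTF-8, one more than older symbols like “✅”, which fit into 3. MySQL’s “utf8” character set is a bit broken here: it stores at most 3 bytes per character, so it cannot hold the 4-byte ones.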
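To see the difference for yourself, here is a small Python sketch that counts how many bytes each code point needs in UTF-8. The function names are made up for this post; nothing here talks to MySQL, it only illustrates why one character fits into a 3-byte-per-character column and the other does not.

```python
def utf8_byte_lengths(text):
    """Return a list of (code point label, UTF-8 byte count) pairs for text."""
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]


def fits_in_three_byte_utf8(text):
    """True if every code point in text needs at most 3 bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


check_mark = "\u2705"           # ✅
fist = "\U0001F91C\U0001F3FD"   # 🤜🏽 = right-facing fist + skin tone modifier

print(utf8_byte_lengths(check_mark))  # [('U+2705', 3)]
print(utf8_byte_lengths(fist))        # [('U+1F91C', 4), ('U+1F3FD', 4)]
print(fits_in_three_byte_utf8(check_mark), fits_in_three_byte_utf8(fist))  # True False
```

Note that “🤜🏽” is really two code points, the fist and the skin tone modifier, and each of them needs 4 bytes.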
The TL;DR of this post would be:
If you use a MySQL database (and would like to store special Emojis), always use “utf8mb4” as the character set, not the broken “utf8”. Also, it is best to convert your existing databases.
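For existing databases, the conversion could look roughly like the sketch below. It only builds the SQL statements as strings so you can review them before running anything. The helper name, the database name “blog” and the table names are hypothetical, and utf8mb4_unicode_ci is just one possible collation, so treat it as a starting point rather than a recipe.

```python
def utf8mb4_migration_sql(database, tables):
    """Build ALTER statements that switch a database and its tables to utf8mb4.

    Names are interpolated as-is, so only pass identifiers you trust.
    """
    # Assumption: utf8mb4_unicode_ci as the collation; pick the one that fits your data.
    charset = "CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci"
    # Changes the default for tables created in this database from now on.
    statements = [f"ALTER DATABASE `{database}` {charset};"]
    for table in tables:
        # CONVERT TO also re-encodes the existing text columns of the table.
        statements.append(f"ALTER TABLE `{table}` CONVERT TO {charset};")
    return statements


# Hypothetical database and table names, for illustration only.
for statement in utf8mb4_migration_sql("blog", ["posts", "comments"]):
    print(statement)
```

The connection between your application and MySQL has to use utf8mb4 as well, otherwise the 4-byte characters can still get lost on the way in.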