Database/table character set and collation
We need to decide whether the very strong, and likely very inefficient, character set utf8 with collation utf8_general_ci is needed for all tables or just for particular tables. I am quite sure that only a few tables need such a strong character set and collation, but I am not sure that using different character sets and collations will help. I am also not sure whether we need such a strong collation at all.
 CREATE DATABASE db_name DEFAULT CHARACTER SET = utf8 COLLATE = utf8_general_ci;
Updated by Evgeny Novikov almost 4 years ago
- Assignee changed from Vladimir Gratinskiy to Evgeny Novikov
- Priority changed from High to Urgent
It turns out that collation does matter, e.g. for report identifiers and values of report attributes: Linux 3.14 has modules net/netfilter/xt_dscp.ko and net/netfilter/xt_DSCP.ko, whose names differ only by case, so a case-insensitive collation treats them as the same value.
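The effect can be checked directly in a MySQL client. This is only a minimal sketch: the `_utf8` introducers are my addition so that the literals are in character set utf8 regardless of the connection settings, and on recent MySQL versions utf8 is reported as an alias of utf8mb3.

```sql
-- Case-insensitive collation: the two module names compare as equal.
SELECT _utf8'net/netfilter/xt_dscp.ko' = _utf8'net/netfilter/xt_DSCP.ko' COLLATE utf8_general_ci AS same_ci;
-- same_ci = 1

-- Binary collation: the two names are distinct.
SELECT _utf8'net/netfilter/xt_dscp.ko' = _utf8'net/netfilter/xt_DSCP.ko' COLLATE utf8_bin AS same_bin;
-- same_bin = 0
```

With a case-insensitive collation, a unique key on such a column would reject the second module as a duplicate, and lookups by identifier could return the wrong report.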
Here it is quite reasonably explained why collation utf8_unicode_ci is better than utf8_general_ci, but both of them are case insensitive. The case-sensitive collation is utf8_bin. That is why our first step will be to specify this inefficient collation for all tables and columns (I am going to point this out in the documentation). Then we will need to investigate how we can relax this strong condition.
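A rough sketch of what the first step could look like in MySQL. The table name `report`, the column `identifier` and its type are hypothetical and serve only as an illustration; `db_name` is the placeholder from the statement in the description.

```sql
-- Make utf8_bin the default for tables created in the database from now on.
ALTER DATABASE db_name DEFAULT CHARACTER SET = utf8 COLLATE = utf8_bin;

-- Convert an existing table (hypothetical name) together with all of its
-- character columns.
ALTER TABLE report CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;

-- A possible relaxation to investigate later: keep a cheaper default and set
-- the binary collation only on columns that must be case sensitive
-- (hypothetical column and type).
ALTER TABLE report
  MODIFY identifier VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;
```

Note that changing the database default does not touch existing tables, which is why the explicit conversion is shown separately.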