Feature #6720
open
Database/table character set and collation
Added by Evgeny Novikov almost 9 years ago.
Updated over 7 years ago.
Description
We need to understand whether we need the very strong and likely very inefficient character set utf8 and collation utf8_general_ci for all tables [1], or just for particular tables. I am absolutely sure that only a few tables really need such a strong character set and collation, but I am not sure that using different character sets and collations will help. Likewise, I am not sure whether we need such a strong collation at all.
[1] CREATE DATABASE db_name DEFAULT CHARACTER SET = utf8 COLLATE = utf8_general_ci;
BTW, there are 44 occurrences of the string utf8. I am sure that most of them can easily be replaced with ascii, especially assuming implementation of #6643.
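For illustration, a minimal sketch (assuming MySQL; the table and column names are hypothetical) of how per-column character sets could relax the database-wide default:
-- Hypothetical schema: identifiers are ASCII-only (cf. #6643), so only
-- free-form text columns keep the expensive utf8 character set.
CREATE TABLE report (
  identifier VARCHAR(255) CHARACTER SET ascii NOT NULL,
  description TEXT CHARACTER SET utf8
) DEFAULT CHARACTER SET = ascii;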
- Assignee changed from Vladimir Gratinskiy to Evgeny Novikov
- Priority changed from High to Urgent
It turns out that collation does matter, e.g. for report identifiers and for values of report attributes: Linux 3.14 contains both the net/netfilter/xt_dscp.ko and net/netfilter/xt_DSCP.ko modules, which differ only in case.
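A minimal check (assuming MySQL) showing how a case-insensitive collation collapses these two module names:
SELECT _utf8'net/netfilter/xt_dscp.ko' = _utf8'net/netfilter/xt_DSCP.ko' COLLATE utf8_general_ci; -- 1: treated as equal
SELECT _utf8'net/netfilter/xt_dscp.ko' = _utf8'net/netfilter/xt_DSCP.ko' COLLATE utf8_bin;        -- 0: correctly distinguished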
Here it is quite reasonably explained why collation utf8_unicode_ci is better than utf8_general_ci, but both of them are case-insensitive. The case-sensitive collation is utf8_bin. That is why our first step will be to specify this inefficient collation for all tables and columns (I am going to point this out in the documentation). Then we will need to investigate how we can relax this strong condition; a sketch of the first step follows.
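A sketch of that first step (assuming MySQL; db_name is the placeholder from [1] and report is a hypothetical table):
ALTER DATABASE db_name DEFAULT CHARACTER SET = utf8 COLLATE = utf8_bin;
-- CONVERT TO also rewrites the character set and collation of every existing string column:
ALTER TABLE report CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;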
- Assignee changed from Evgeny Novikov to Vladimir Gratinskiy
- Priority changed from Urgent to High
The first step was done in 6ec4ca4.
- Priority changed from High to Normal
I don't think that this is really important anymore, especially since we switched to PostgreSQL.
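For reference, assuming a default (deterministic) collation, ordinary text comparison in PostgreSQL is already case-sensitive, so nothing like utf8_bin is needed there:
SELECT 'net/netfilter/xt_dscp.ko' = 'net/netfilter/xt_DSCP.ko'; -- false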