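How to fix garbled gbk supplementary Chinese characters when MySQL converts between gbk and other encodings (e.g. gbk <-> utf-8)
[account deleted] 2017-06-09 04:05:59 Summary:
When SQL encoded in gbk writes data into a utf8 table, some of the gbk supplementary Chinese characters have no corresponding code in MySQL's internal mapping table, so the Chinese text stored in the database becomes 0x3F ("?"). I would therefore like to ask how to change the code page MySQL uses so that this can be fixed.
Looking forward to your answers!
Confirmed affected versions: mysql 5.5.56, 5.6.36, 5.7.18; mariadb-10.2.6.
Detailed description of the problem:
Install MySQL 5.6 and configure it to use the utf8 character set.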
mysql> show variables like '%char%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| character_set_client     | utf8                                 |
| character_set_connection | utf8                                 |
| character_set_database   | utf8                                 |
| character_set_filesystem | binary                               |
| character_set_results    | utf8                                 |
| character_set_server     | utf8                                 |
| character_set_system     | utf8                                 |
Create the database tmp, then create table a:
create table a (name varchar(32) not null);
mysql> SELECT character_set_name, collation_name
-> FROM information_schema.columns
-> WHERE table_schema = 'tmp'
-> AND table_name = 'a'
-> AND column_name = 'name';
+--------------------+-----------------+
| character_set_name | collation_name  |
+--------------------+-----------------+
| utf8               | utf8_general_ci |
+--------------------+-----------------+
Create the file /tmp/gbk.sql with the following content (the SQL statements are encoded in gbk):
set names gbk;
SELECT '眼' AS `眼`;
SELECT '' AS ``;
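Many editors and web pages drop or replace supplementary gbk characters, so one way to produce such a file reliably is to write the raw bytes directly. The following is only a sketch in Python, outside MySQL itself. The helper names `build_gbk_sql` and `write_gbk_sql` are hypothetical, and the byte pair 0xFE50 is an assumption: it is picked only because it lies in the range reported as missing from CP936.TXT, not because it is known to be the character used in the original file.

```python
import os
import tempfile

# Assumption: 0xFE50 is an illustrative supplementary code point taken from
# the range the post reports as unmapped; the original character is unknown.
SUPPLEMENTARY_GBK_BYTES = bytes([0xFE, 0x50])


def build_gbk_sql(extra: bytes = SUPPLEMENTARY_GBK_BYTES) -> bytes:
    """Return the three-statement test script as raw gbk bytes."""
    ordinary = "眼".encode("gbk")  # an ordinary character that gbk can encode
    lines = [
        b"set names gbk;",
        b"SELECT '" + ordinary + b"' AS `" + ordinary + b"`;",
        b"SELECT '" + extra + b"' AS `" + extra + b"`;",
    ]
    return b"\n".join(lines) + b"\n"


def write_gbk_sql(path: str) -> None:
    """Write the script in binary mode so no text codec alters the bytes."""
    with open(path, "wb") as handle:
        handle.write(build_gbk_sql())


if __name__ == "__main__":
    target = os.path.join(tempfile.gettempdir(), "gbk.sql")
    write_gbk_sql(target)
    print(target)
```

Writing in binary mode avoids depending on whether a given text codec happens to know the supplementary character.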
The result is:
mysql> source /tmp/gbk.sql;
Query OK, 0 rows affected (0.00 sec)
+----+
| 眼 |
+----+
| 眼 |
+----+
1 row in set (0.00 sec)
+----+
| ? |
+----+
| |
+----+
1 row in set (0.00 sec)
The MySQL 5.6 manual says:
• MySQL's gbk is in reality “Microsoft code page 936”. This differs from the official gbk for characters A1A4 (middle dot), A1AA (em dash), A6E0-A6F5, and A8BB-A8C0.
• For a listing of gbk/Unicode mappings, see http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT.
CP936.TXT ends at 0xFE4F. The code points after it, 0xFE50~0xFE7E and 0xFE80~0xFEA0 (80 characters in total), are missing entirely. The end of the file reads:
0xFD9A 0x9FA4 #CJK UNIFIED IDEOGRAPH
0xFD9B 0x9FA5 #CJK UNIFIED IDEOGRAPH
0xFD9C 0xF92C #CJK COMPATIBILITY IDEOGRAPH
0xFD9D 0xF979 #CJK COMPATIBILITY IDEOGRAPH
0xFD9E 0xF995 #CJK COMPATIBILITY IDEOGRAPH
0xFD9F 0xF9E7 #CJK COMPATIBILITY IDEOGRAPH
0xFDA0 0xF9F1 #CJK COMPATIBILITY IDEOGRAPH
0xFE40 0xFA0C #CJK COMPATIBILITY IDEOGRAPH
0xFE41 0xFA0D #CJK COMPATIBILITY IDEOGRAPH
0xFE42 0xFA0E #CJK COMPATIBILITY IDEOGRAPH
0xFE43 0xFA0F #CJK COMPATIBILITY IDEOGRAPH
0xFE44 0xFA11 #CJK COMPATIBILITY IDEOGRAPH
0xFE45 0xFA13 #CJK COMPATIBILITY IDEOGRAPH
0xFE46 0xFA14 #CJK COMPATIBILITY IDEOGRAPH
0xFE47 0xFA18 #CJK COMPATIBILITY IDEOGRAPH
0xFE48 0xFA1F #CJK COMPATIBILITY IDEOGRAPH
0xFE49 0xFA20 #CJK COMPATIBILITY IDEOGRAPH
0xFE4A 0xFA21 #CJK COMPATIBILITY IDEOGRAPH
0xFE4B 0xFA23 #CJK COMPATIBILITY IDEOGRAPH
0xFE4C 0xFA24 #CJK COMPATIBILITY IDEOGRAPH
0xFE4D 0xFA27 #CJK COMPATIBILITY IDEOGRAPH
0xFE4E 0xFA28 #CJK COMPATIBILITY IDEOGRAPH
0xFE4F 0xFA29 #CJK COMPATIBILITY IDEOGRAPH
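The gap described above can be checked with a small script. This is a sketch in Python under stated assumptions: it parses a few of the rows quoted above (not the full CP936.TXT), enumerates the two ranges named in the post, and models the observed symptom with a table lookup that falls back to 0x3F. The function names are hypothetical, and the fallback only imitates the "?" result; it is not MySQL's actual conversion code.

```python
# A subset of the final CP936.TXT rows quoted in the post.
CP936_TAIL = """\
0xFDA0 0xF9F1 #CJK COMPATIBILITY IDEOGRAPH
0xFE40 0xFA0C #CJK COMPATIBILITY IDEOGRAPH
0xFE4E 0xFA28 #CJK COMPATIBILITY IDEOGRAPH
0xFE4F 0xFA29 #CJK COMPATIBILITY IDEOGRAPH
"""

REPLACEMENT = 0x3F  # "?", the value the post sees stored in the table


def parse_mapping(text):
    """Parse 'gbk_code unicode_code #comment' rows into a dict."""
    table = {}
    for row in text.splitlines():
        fields = row.split("#", 1)[0].split()
        if len(fields) == 2:
            table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def missing_codes():
    """Code points the post lists as absent: 0xFE50-0xFE7E and 0xFE80-0xFEA0."""
    # 0x7F is skipped because it is not a valid gbk trail byte.
    return [0xFE00 | trail for trail in range(0x50, 0xA1) if trail != 0x7F]


def to_unicode(table, gbk_code):
    """Look up a gbk code; fall back to '?' when the table has no entry."""
    return table.get(gbk_code, REPLACEMENT)


if __name__ == "__main__":
    table = parse_mapping(CP936_TAIL)
    print(hex(max(table)))                 # highest mapped code in the sample
    print(len(missing_codes()))            # size of the gap
    print(hex(to_unicode(table, 0xFE50)))  # unmapped code falls back to "?"
```

Run against the full CP936.TXT instead of the sample string, the same parser would show whether any code in the two ranges has a mapping.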