21,886
社区成员
发帖
与我相关
我的任务
分享
function utf8_bytes($cp) {
if ($cp > 0x10000){
# 4 bytes
return chr(0xF0 | (($cp & 0x1C0000) >> 18)).
chr(0x80 | (($cp & 0x3F000) >> 12)).
chr(0x80 | (($cp & 0xFC0) >> 6)).
chr(0x80 | ($cp & 0x3F));
}else if ($cp > 0x800){
# 3 bytes
return chr(0xE0 | (($cp & 0xF000) >> 12)).
chr(0x80 | (($cp & 0xFC0) >> 6)).
chr(0x80 | ($cp & 0x3F));
}else if ($cp > 0x80){
# 2 bytes
return chr(0xC0 | (($cp & 0x7C0) >> 6)).
chr(0x80 | ($cp & 0x3F));
}else{
# 1 byte
return chr($cp);
}
}
echo utf8_bytes(0x4e2d); //中
echo utf8_bytes(0x6587); //文
函数通过一系列的位移操作,将数值拆分按utf-8 编码规则拼装成字符串
utf-8 编码规则
U+007F 0xxxxxxx
U+07FF 110xxxxx 10xxxxxx
U+FFFF 1110xxxx 10xxxxxx 10xxxxxx
U+1FFFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx