mac下的utf8，unicode的头尾识别！

ocwind 2010-09-30 09:59:01

请问下在mac下，怎么识别utf8和unicode编码。。。。

...全文

132 2 打赏收藏转发到动态举报

写回复

用AI写文章

2 条回复

切换为时间正序

请发表友善的回复…

发表回复

陌尘笑 2010-10-07

打赏
举报

[Quote=引用 1 楼 cloudhsu 的回复:]
Unicode vs UTF-8

The development of Unicode was aimed at creating a new standard for mapping the characters in a great majority of languages that are being used today, along with other characters t……
[/Quote]
都是英文的。有没有obj-c常用几种编码格式详解及他们的转换比如utf8是 1－4字节＝＝之类

云瑀 2010-09-30

打赏
举报

Unicode vs UTF-8

The development of Unicode was aimed at creating a new standard for mapping the characters in a great majority of languages that are being used today, along with other characters that are not that essential but might be necessary for creating the text. UTF-8 is only one of the many ways that you can encode the files because there are many ways you can encode the characters inside a file into Unicode.

UTF-8 was developed with compatibility in mind. ASCII was a very prominent standard and people who already had their files in the ASCII standard might hesitate in adopting Unicode because it would break their current systems. UTF-8 eliminated this problem as any file encoded that only has characters in the ASCII character set would result in an identical file, as if it was encoded with ASCII. This allowed people to adopt Unicode without needing to convert their files or even changing their current legacy software that was unaware of the Unicode standard. Any of the other mapping methods for Unicode breaks compatibility with ASCII and would force people to convert their system.

The observance of compatibility to ASCII of UTF-8 produces a side-effect that makes it ideal for word processing where most of the time, all the characters being used are included in the ASCII character set. UTF-8 only uses a byte to represent every code point resulting in a file size that is half to the same file encoded in UT-16 which uses 2 bytes, and a quarter to the same file encoded in UTF-32 which uses 4.

UTF-8 has been adopted in the World Wide Web because it is both space efficient and byte oriented. Web pages are often simple text files that usually do not contain any character that is outside the ASCII character set. Using other encoding methods would only increase the network load without any benefit. Even in email transport systems, UTF-8 is slowly but surely being adopted as a replacement for the older encoding systems that are still being used.

Summary:
1. Unicode is the standard for computers to display and manipulate text while UTF-8 is one of the many mapping methods for Unicode
2. UTF-8 is a mapping method the retains compatibility with the older ASCII
3. UTF-8 is the most space efficient mapping method for Unicode compared to other encoding methods
4. UTF-8 is the most used Unicode standard for the web

Read more: Difference Between Unicode and UTF-8 | Difference Between | Unicode vs UTF-8 http://www.differencebetween.net/technology/difference-between-unicode-and-utf-8/#ixzz10yaUmBgm
请用

//js decodeURIComponent 貌似对GB2312编码的格式不识别，必须转为utf-8才可以，然后，如果字符串中有空格的就转为 + 号了 var ds = '$str;?>'; var dddd= decodeURIComponent (ds); alert(dddd); 1 2 3 4...

getprotobyname()获取下各个协议对应的数字 /etc/protocols 在这也能看到对应的数字 ip:0 icmp:1 ggp:3 tcp:6 egp:8 pup:12 udp:17 hmp:20 xns-idp:22 rdp:27 3.vim在当前行迅速插入到下一行,普通模式下按o 4....

另外可能你会发现虽然可以输出中文字符串，但并不能用char c=’中’来声明一个中文字符，如果用strlen(“C语言”)来求字符串长度，给出的值为5，因此默认情况下C并不能很好的支持中文。如何让C语言很好的支持中文呢...

HTML编码规范 1 （一）命名规则： 2 3 头：header 4 内容：content/container 5 尾：footer 6 导航：nav 7 侧栏：sidebar 8 栏目：column 9...

#### 四安装Cpython解释器 Python解释器目前已支持所有主流操作系统，在Linux,Unix,Mac系统上自带Python解释器，在Windows系统上需要安装一下，具体步骤如下。 #### 4.1、下载python解释器 ...