Does Java's String use UTF-8 or UTF-16 under the hood these days?

simon78 2005-12-02 02:56:09
thanks
Stored as two-byte code units (UTF-16), Unicode is a relatively inefficient encoding when most of
your text consists of ASCII characters. Every character in the Basic Multilingual Plane takes the
same number of bytes (two), even though some characters are used much more frequently than
others. A more efficient encoding would use fewer bits for the more common characters. This is
what UTF-8 does.
In UTF-8 the ASCII alphabet is encoded using a single byte, just as in ASCII. The next 1,920
characters (U+0080 through U+07FF) are encoded in two bytes. The remaining characters of the
Basic Multilingual Plane are encoded in three bytes, and supplementary characters take four.
However, since these longer sequences are relatively uncommon, especially in English text, the
savings achieved by encoding ASCII in a single byte more than make up for it.
Java's .class files use a slightly modified form of UTF-8 internally to store string literals.
Data input streams and data output streams also read and write strings in this modified UTF-8
(through readUTF and writeUTF). However, this is all hidden from direct view of the programmer,
unless perhaps you're trying to write a Java compiler or parse the output of a data stream
without using the DataInputStream class.
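The byte counts described in this reply can be checked with a small sketch. The class name Utf8Lengths and its two helper methods are made up for illustration; only String.getBytes, StandardCharsets, and DataOutputStream.writeUTF come from the JDK. It assumes Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Utf8Lengths {
    // Number of bytes the string occupies when encoded as standard UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes DataOutputStream.writeUTF emits for the string:
    // a 2-byte length prefix followed by the text in modified UTF-8.
    static int writeUtfLength(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));           // 1 (ASCII)
        System.out.println(utf8Length("\u00e9"));      // 2 (U+00E9, two-byte range)
        System.out.println(utf8Length("\u4e2d"));      // 3 (U+4E2D, three-byte range)
        System.out.println(writeUtfLength("A\u4e2d")); // 6 (2-byte prefix + 1 + 3)
    }
}
```

Note that this only measures what comes out when a String is encoded. It says nothing about how the String is held in memory.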
crazycy 2005-12-03
Heh, Java uses Unicode internally.
wzh0439 2005-12-03
A String is a char array, so it should be UTF-8.
shuai002 2005-12-03
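It should be stored as UTF-8.
This favors English and similar languages. First, Java was created by English speakers, who would naturally consider their own needs before anyone else's. Second, most documents on the web are written in English, so this saves space.
For Chinese, Japanese, and Korean, however, it wastes space, because CJK characters stored as UTF-8 take on average 1.5 times the space they take as UTF-16.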
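The 1.5x figure for CJK text can be reproduced with a minimal sketch. The class name CjkEncodingSize, its helper methods, and the sample strings are chosen here purely for illustration. UTF_16BE is used instead of UTF_16 so that no byte-order mark is added to the count.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class CjkEncodingSize {
    // Encoded size in bytes as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encoded size in bytes as UTF-16 (big-endian, no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String cjk = "\u4e2d\u6587\u5b57\u7b26"; // four common Chinese characters
        String ascii = "Java";

        System.out.println(utf8Bytes(cjk));    // 12 (3 bytes per character)
        System.out.println(utf16Bytes(cjk));   // 8  (2 bytes per character)
        System.out.println(utf8Bytes(ascii));  // 4  (1 byte per character)
        System.out.println(utf16Bytes(ascii)); // 8  (2 bytes per character)
    }
}
```

So the size trade-off in this reply holds for encoded bytes: 12 versus 8 for the CJK sample, and 4 versus 8 for the ASCII one. Whether a String is actually stored that way in memory is a separate question.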
zsjin0208 2005-12-03
Unicode. What else could it be?
slh002 2005-12-03
At run time a string's encoding is the operating system's default encoding, but in the end it must be stored as UTF-8.
snowmansh 2005-12-03
It should be UTF-16. If it were UTF-8, why would char need to be 2 bytes?
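The point about char being 2 bytes can be seen through the public String API, which exposes text as a sequence of UTF-16 code units. Below is a minimal sketch: the class name Utf16CodeUnits and its helper methods are invented for illustration, and Character.BYTES requires Java 8 or later. How a JVM lays out a String internally is an implementation detail. Since JDK 9, for instance, HotSpot can hold Latin-1-only strings in one byte per character, but the API still behaves as shown.

```java
// Hypothetical helper class, for illustration only.
class Utf16CodeUnits {
    // String.length() counts UTF-16 code units (chars), not characters.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "\u4e2d"; // U+4E2D, inside the Basic Multilingual Plane
        String supplementary = new String(Character.toChars(0x1F600)); // outside the BMP

        System.out.println(Character.BYTES);           // 2
        System.out.println(codeUnits(bmp));            // 1
        System.out.println(codeUnits(supplementary));  // 2 (a surrogate pair)
        System.out.println(codePoints(supplementary)); // 1
        System.out.println(Character.isHighSurrogate(supplementary.charAt(0))); // true
    }
}
```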
zhaidafan 2005-12-03
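String uses the system default encoding; on my machine (Chinese Windows XP), for example, that is GBK.
You can call java.nio.charset.Charset.defaultCharset() to see the default encoding, and if you want the bytes in another encoding you can call the String method byte[] getBytes(String charsetName).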
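The two calls mentioned in this reply can be tried with a short sketch. The class name DefaultCharsetDemo and its encode and roundTrips helpers are hypothetical. Charset.defaultCharset(), String.getBytes(String), and the String(byte[], String) constructor are the JDK calls. As far as this sketch can show, the charset name only decides which bytes come out of the conversion, and the same String decodes back unchanged from either encoding.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
class DefaultCharsetDemo {
    // Encode a string with the named charset, as getBytes(String charsetName) does.
    static byte[] encode(String s, String charsetName) {
        try {
            return s.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    // True if encoding and then decoding with the same charset returns the original string.
    static boolean roundTrips(String s, String charsetName) {
        try {
            return new String(encode(s, charsetName), charsetName).equals(s);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Platform dependent, so no expected output is shown here.
        System.out.println(Charset.defaultCharset());

        String s = "\u4e2d"; // U+4E2D
        System.out.println(encode(s, "UTF-8").length);    // 3
        System.out.println(encode(s, "UTF-16BE").length); // 2
        System.out.println(roundTrips(s, "UTF-8"));       // true
        System.out.println(roundTrips(s, "UTF-16BE"));    // true
    }
}
```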
cenlmmx 2005-12-03
Could it depend on the character set of the installed operating system whether UTF-8 or UTF-16 is used?
greenteanet 2005-12-03
Unicode
simon78 2005-12-02
What does that have to do with anything? Heh.
infowain 2005-12-02
Probably UTF-8. I've been using JDOM lately, and the XML files it produces are UTF-8 by default.
