Extracting the Chinese characters from a string that contains Chinese

guodong66 2009-07-16 09:33:31

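Given the following string: String str = "123abc这个中文cde123abc也要提取123ab";

I came across this problem today: extract the Chinese part of this string. How would you solve it? With a regular expression?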

6 replies

guodong66 2009-07-16

Closing the thread and awarding the points.

shibenjie 2009-07-16

Output:
这个中文
也要提取

shibenjie 2009-07-16

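import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractChinese {
    public static void main(String[] args) {
        String str = "123abc这个中文cde123abc也要提取123ab";

        // Match each run of characters in U+4E00..U+9FA5 (the common CJK ideographs).
        Pattern p = Pattern.compile("[\u4e00-\u9fa5]+");
        Matcher m = p.matcher(str);

        // Print every matched run on its own line.
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}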

sd5816690 2009-07-16

String str = "123abc这个中文cde123abc也要提取123ab";

// Delete every character outside U+4E00..U+9FA5; prints 这个中文也要提取
System.out.println(str.replaceAll("[^\u4e00-\u9fa5]", ""));

tenderuser 2009-07-16

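You could first convert the string into a char array and then check each character's code value. Chinese characters fall within a specific range in Unicode, so you can tell them apart that way. As for doing it with a regex, I'm not really good with those.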
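
For what it's worth, the char-array idea above might look roughly like this. It is only a sketch: the class and method names (ChineseCharFilter, extractChinese) are made up for illustration, and the range U+4E00..U+9FA5 is assumed to be "Chinese" simply because the regex replies in this thread use it; characters outside that range (rare ideographs, Chinese punctuation) would be dropped.

```java
class ChineseCharFilter {
    // Hypothetical helper: keeps only chars in U+4E00..U+9FA5,
    // the same range assumed by the regex replies in this thread.
    public static String extractChinese(String str) {
        StringBuilder out = new StringBuilder();
        for (char c : str.toCharArray()) {
            if (c >= 0x4E00 && c <= 0x9FA5) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String str = "123abc这个中文cde123abc也要提取123ab";
        System.out.println(extractChinese(str)); // prints 这个中文也要提取
    }
}
```

Note that this joins all the Chinese characters into one string, like the replaceAll answer, rather than printing each run separately as the Matcher loop does.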

lioushuei 2009-07-16

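import java.nio.charset.StandardCharsets;

public static String getChineseCharacter(String str) {
    StringBuilder outStr = new StringBuilder();
    // UTF-16LE pins the byte order and writes no BOM:
    // each char becomes two bytes, low byte first, high byte second.
    byte[] bytes = str.getBytes(StandardCharsets.UTF_16LE);
    for (int i = 0; i + 1 < bytes.length; i += 2) {
        // A non-zero high byte means the char is above U+00FF, so it is not ASCII/Latin-1.
        // Note: this keeps every such char, not only Chinese ones.
        if (bytes[i + 1] != 0) {
            outStr.append(new String(bytes, i, 2, StandardCharsets.UTF_16LE));
        }
    }
    return outStr.toString();
}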

This tells the characters apart by their encoding.