Java prints Chinese text as all question marks

goodblackpen2 2011-05-30 10:17:33
package jed;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class jk {
    public static String sendGet(String url, String param) throws IOException {
        String result = "";
        BufferedReader in = null;
        String urlName = url + param;
        URL realUrl = new URL(urlName);
        URLConnection conn = realUrl.openConnection();
        conn.setRequestProperty("accept", "*/*");
        conn.setRequestProperty("connection", "keep-Alive");
        conn.setRequestProperty("user-agent", "Mozilla/4.0(compatible;MSIE 6.0;Window NT 5.1;SV1)");
        conn.connect();
        in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line;
        while ((line = in.readLine()) != null) {
            result += "\n" + line;
        }
        return result;
    }

    public static void main(String args[]) throws IOException {
        String s = jk.sendGet("http://dict-co.iciba.com/api/dictionary.php?w=", "word");
        byte[] b = s.getBytes("ISO-8859-1");
        s = new String(b, "GBK");
        System.out.print(s);
    }
}
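This code uses the API provided by Kingsoft PowerWord (iciba) to look up a word. The response is XML, but everywhere there should be Chinese text I get question marks.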
java爱好者 2011-05-31


import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class jk {
    public static String sendGet(String url, String param) throws IOException {
        String result = "";
        BufferedReader in = null;
        String urlName = url + param;
        URL realUrl = new URL(urlName);
        URLConnection conn = realUrl.openConnection();
        conn.setRequestProperty("accept", "*/*");
        conn.setRequestProperty("connection", "keep-Alive");
        conn.setRequestProperty("user-agent",
                "Mozilla/4.0(compatible;MSIE 6.0;Window NT 5.1;SV1)");
        conn.connect();
        // Decode the response explicitly as UTF-8 instead of the default charset.
        in = new BufferedReader(new InputStreamReader(conn.getInputStream(),
                "UTF-8"));
        String line;
        while ((line = in.readLine()) != null) {
            result += "\n" + line;
        }
        return result;
    }

    public static void main(String args[]) throws IOException {
        String s = jk.sendGet("http://dict-co.iciba.com/api/dictionary.php?w=",
                "word");
        System.out.println(s);
        System.out.print(wordMeanning(s));
    }

    // Collects the text of every <acceptation> element, separated by ", ".
    public static String wordMeanning(String word) {
        Pattern p = Pattern.compile("<acceptation>(.*?)</acceptation>");
        Matcher m = p.matcher(word);
        StringBuilder wordMeanning = new StringBuilder();
        while (m.find()) {
            wordMeanning.append(m.group(1));
            wordMeanning.append(", ");
        }
        // Drop the trailing ", " separator, if any match was appended.
        if (wordMeanning.length() >= 2) {
            wordMeanning.setLength(wordMeanning.length() - 2);
        }
        return wordMeanning.toString();
    }
}
lh_fengyuzhe 2011-05-30

String s1 = "<acceptation>";
String s2 = "</acceptation>";
String s3 = s.substring(s.indexOf(s1) + s1.length(), s.indexOf(s2));

If the tag appears only once, this is the simple way. If it appears several times, parse the XML or try a regular expression.
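For the case where the tag appears more than once, here is a minimal sketch of the XML-parsing route using the DOM parser that ships with the JDK. The class name AcceptationParser and the method meanings are made up for illustration, and the sample XML is a shortened stand-in for the real response.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original thread.
class AcceptationParser {

    // Returns the text of every <acceptation> element, in document order.
    static List<String> meanings(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // trim() matters: sendGet() prepends a newline, and the XML
            // declaration must be the very first thing the parser sees.
            Document doc = builder.parse(
                    new InputSource(new StringReader(xml.trim())));
            NodeList nodes = doc.getElementsByTagName("acceptation");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            // Parser setup, I/O, and malformed-XML errors all end up here.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String xml = "\n<?xml version=\"1.0\" encoding=\"UTF-8\"?><dict><key>word</key>"
                + "<pos>n.</pos><acceptation>\u5B57, \u8BCD</acceptation>"
                + "<pos>vt.</pos><acceptation>\u8BCD</acceptation></dict>";
        System.out.println(meanings(xml));
    }
}
```

Unlike substring or a regular expression, the parser also copes with attributes on the tag and with escaped characters such as &amp;amp; inside the text.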
ZZZ5512536 2011-05-30
Garbled text is almost always an encoding problem. In general, decoding the input stream and encoding the output with the same charset avoids it.
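To see where the question marks can come from, here is a minimal sketch, assuming the server sends UTF-8 bytes. The class name EncodingDemo, the method latin1RoundTrip, and the sample string are made up for illustration. It contrasts a matching decode with the getBytes("ISO-8859-1") step from the original post.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
class EncodingDemo {

    // Encodes to ISO-8859-1 and decodes again. A character outside Latin-1
    // has no byte in that charset, so getBytes() substitutes '?'.
    static String latin1RoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as Unicode escapes so the example
        // does not depend on the encoding of the source file.
        String original = "\u5B57, \u8BCD";

        // What travels over the network: UTF-8 bytes.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the same charset recovers the text.
        String decoded = new String(wire, StandardCharsets.UTF_8);
        System.out.println(decoded.equals(original)); // true

        // Pushing the decoded text through ISO-8859-1 loses it for good.
        System.out.println(latin1RoundTrip(decoded)); // ?, ?
    }
}
```

Once a character has been replaced by '?', no later new String(bytes, charset) call can bring it back, whichever charset is chosen.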
goodblackpen2 2011-05-30
Could one of you experts help me write some code to extract the word meanings between <acceptation> and </acceptation>?
goodblackpen2 2011-05-30
Reply #6 is really impressive! Sure enough, everything printed to the console is correct now. What an expert!
lh_fengyuzhe 2011-05-30
in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
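Set the encoding to UTF-8 when reading the stream, and what comes back is already Chinese.
Then, after String s=jk.sendGet("http://dict-co.iciba.com/api/dictionary.php?w=", "word");
just print s directly. No encoding conversion is needed.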
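The same idea can be written a little more defensively. This is a sketch only: the class name Utf8Reader and the method readAll are made up for illustration, and a ByteArrayInputStream stands in for conn.getInputStream() so the example runs without a network.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original thread.
class Utf8Reader {

    // Reads the whole stream as UTF-8 text, one '\n' after every line.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if readLine() fails.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for conn.getInputStream(): UTF-8 bytes of a tiny response.
        InputStream fake = new ByteArrayInputStream(
                "<acceptation>\u751F\u8BCD</acceptation>"
                        .getBytes(StandardCharsets.UTF_8));
        System.out.print(readAll(fake));
    }
}
```

Passing StandardCharsets.UTF_8 rather than the string "UTF-8" avoids a checked UnsupportedEncodingException, and StringBuilder avoids the repeated copying that result += line causes in a loop.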
goodblackpen2 2011-05-30
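Thank you so much, I'll try it. The question now is how to read the meaning part out of the string. With a regular expression? That's hard for a beginner like me!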
java爱好者 2011-05-30
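Set your character encoding to UTF-8. The XML that comes back is UTF-8 encoded by default.

Change s =new String(b,"GBK"); to s =new String(b,"UTF-8");

If it still is not Chinese, then the console's encoding is not UTF-8. You can try writing the output to a file instead.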

import java.io.BufferedReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class jk {
    public static String sendGet(String url, String param) throws IOException {
        String result = "";
        BufferedReader in = null;
        String urlName = url + param;
        URL realUrl = new URL(urlName);
        URLConnection conn = realUrl.openConnection();
        conn.setRequestProperty("accept", "*/*");
        conn.setRequestProperty("connection", "keep-Alive");
        conn.setRequestProperty("user-agent",
                "Mozilla/4.0(compatible;MSIE 6.0;Window NT 5.1;SV1)");
        conn.connect();
        // No charset is given, so the JVM's default charset decodes the response.
        in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line;
        // FileWriter also encodes with the JVM's default charset.
        FileWriter fw = new FileWriter("src/a.txt");
        while ((line = in.readLine()) != null) {
            result += "\n" + line;
            fw.write(line);
        }
        fw.close();
        return result;
    }

    public static void main(String args[]) throws IOException {
        String s = jk.sendGet("http://dict-co.iciba.com/api/dictionary.php?w=",
                "word");
        // getBytes("ISO-8859-1") turns every character outside Latin-1 into '?'.
        byte[] b = s.getBytes("ISO-8859-1");
        s = new String(b, "UTF-8");
        System.out.print(s);
    }
}

Console output:

<?xml version="1.0" encoding="UTF-8"?><dict num="219" id="219" name="219"><key>word</key><ps>w?:d</ps><pron>http://res.iciba.com/resource/amp3/c/4/c47d187067c6cf953245f128b5fde62a.mp3</pron><pos>n.</pos><acceptation>?, ?, ?, ??, ??, ??</acceptation><pos>vt.</pos><acceptation>?...??</acceptation><sent><orig>smear word</orig><pron>http://res.iciba.com/resource/phrase_mp3/9/2/92aa6d061c7a42c045dfe255e54b68a5.mp3</pron><trans>??????</trans></sent><sent><orig>new word</orig><pron>http://res.iciba.com/resource/phrase_mp3/d/0/d018ff527432e94d60b959410aa4523c.mp3</pron><trans>??</trans></sent><sent><orig>word picture</orig><pron>http://res.iciba.com/resource/phrase_mp3/4/9/499ee15da204af581a41c26728702c3a.mp3</pron><trans>???????</trans></sent><sent><orig>unfamiliar word</orig><pron>http://res.iciba.com/resource/phrase_mp3/5/5/55f564a2ef3f2a27e44c3cfffe7fef24.mp3</pron><trans>??</trans></sent><sent><orig>word stress</orig><pron>http://res.iciba.com/resource/phrase_mp3/5/d/5d61b1e8dc6c4bc322e49d78ddc63479.mp3</pron><trans>????</trans></sent></dict>


Contents of the file:

<?xml version="1.0" encoding="UTF-8"?><dict num="219" id="219" name="219"><key>word</key><ps>wə:d</ps><pron>http://res.iciba.com/resource/amp3/c/4/c47d187067c6cf953245f128b5fde62a.mp3</pron><pos>n.</pos><acceptation>字, 词, 话, 消息, 诺言, 命令</acceptation><pos>vt.</pos><acceptation>为...措辞</acceptation><sent><orig>smear word</orig><pron>http://res.iciba.com/resource/phrase_mp3/9/2/92aa6d061c7a42c045dfe255e54b68a5.mp3</pron><trans>诬蔑性的字眼</trans></sent><sent><orig>new word</orig><pron>http://res.iciba.com/resource/phrase_mp3/d/0/d018ff527432e94d60b959410aa4523c.mp3</pron><trans>生词</trans></sent><sent><orig>word picture</orig><pron>http://res.iciba.com/resource/phrase_mp3/4/9/499ee15da204af581a41c26728702c3a.mp3</pron><trans>生动的文字描述</trans></sent><sent><orig>unfamiliar word</orig><pron>http://res.iciba.com/resource/phrase_mp3/5/5/55f564a2ef3f2a27e44c3cfffe7fef24.mp3</pron><trans>冷字</trans></sent><sent><orig>word stress</orig><pron>http://res.iciba.com/resource/phrase_mp3/5/d/5d61b1e8dc6c4bc322e49d78ddc63479.mp3</pron><trans>单词重音</trans></sent></dict>
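If the console charset is in doubt, one way to take the JVM default out of the picture is to wrap System.out in a PrintStream with an explicit encoding. This is a sketch under assumptions: the class name OutputEncodingDemo and the method utf8PrintStream are made up for illustration. Whether the bytes then display correctly still depends on the terminal or IDE console being set to UTF-8.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the original thread.
class OutputEncodingDemo {

    // Wraps any OutputStream so that text written to it is encoded as UTF-8,
    // regardless of the JVM's default charset.
    static PrintStream utf8PrintStream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shows which charset an InputStreamReader or FileWriter created
        // without an explicit charset would use on this machine.
        System.out.println("default charset: " + Charset.defaultCharset());

        PrintStream out = utf8PrintStream(System.out);
        // Two Chinese characters, written as Unicode escapes.
        out.println("\u5B57, \u8BCD");
    }
}
```

The same wrapper works around a FileOutputStream, which gives a file whose encoding does not change from one machine to the next.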



sly_1 2011-05-30
Too hard, I don't understand.
goodblackpen2 2011-05-30
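Right! How should I change it? I tried switching to UTF-8. The question marks are gone, but now everything shows up as traditional Chinese characters (not one of which I recognize).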
rmouse_2005 2011-05-30
The returned page is UTF-8, isn't it?
<?xml version="1.0" encoding="UTF-8"?>
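Since the response declares its own encoding, another option is to hand the raw bytes to an XML parser and let it read that declaration itself. This is a minimal sketch, assuming the JDK's built-in DOM parser. The class name DeclaredEncodingDemo and the method firstMeaning are made up for illustration, and a byte array stands in for the network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class, not part of the original thread.
class DeclaredEncodingDemo {

    // Parses raw bytes; the parser picks the charset from the XML declaration,
    // so no InputStreamReader and no manual charset choice is involved.
    static String firstMeaning(InputStream rawBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(rawBytes);
            NodeList nodes = doc.getElementsByTagName("acceptation");
            return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
        } catch (Exception e) {
            // Parser setup, I/O, and malformed-XML errors all end up here.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for conn.getInputStream(): UTF-8 bytes, Chinese as escapes.
        byte[] response = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<dict><acceptation>\u751F\u8BCD</acceptation></dict>")
                .getBytes(StandardCharsets.UTF_8);
        String meaning = firstMeaning(new ByteArrayInputStream(response));
        System.out.println(meaning.equals("\u751F\u8BCD"));
    }
}
```

With this approach the declaration and the actual bytes have to agree. If a server declares one encoding and sends another, the parser will either fail or return wrong characters.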
