I ran into this problem when parsing XML with Java!

Tosan 2002-02-01 09:32:25

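Hello everyone, I'm a beginner in Java and XML. Yesterday, while parsing an XML DOM tree with Java, I ran into a problem that has been really frustrating. Could the experts here please help me find where I went wrong?
My development environment is JBuilder 4.0, JDK 1.3, and JAXP 1.0, downloaded from Sun. I want to parse the DOM tree, and before parsing I want whitespace to be ignored, so I set the DocumentBuilderFactory whitespace option to true, i.e. setIgnoringElementContentWhitespace(true); but when the program runs, the whitespace is still treated as a Node. Then I tried setIgnoringElementContentWhitespace(false), and the result was the same; in other words, setIgnoringElementContentWhitespace(boolean) has no effect. I also checked setValidating(boolean): whether it is true or false, the whitespace is still not removed. Does anyone know what is going on? I really don't want whitespace treated as Nodes. Microsoft's msxml doesn't have this problem; you only need to set preserveWhiteSpace. My source code and XML file are as follows.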

--------------------------------------------------------------------------
import java.io.*;
import java.util.jar.*;
import org.xml.sax.*;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
/**
 * An example that loads an XML file.<BR>
 * You can get the JAXP interface from http://java.sun.com/xml/
 * @author Tosan
 * @version 1.0 written on 2002.01.22
 */
public class HelloWorld {
    public static void main(String[] args) {
        try {
            int chlNum;
            String nName;
            String nValue;
            Node nNode;
            NodeList nList;

            // get the file name and path
            String userDir = System.getProperty("user.dir");
            String sFileName = userDir + "\\TestLoad.xml";
            // create a DocumentBuilderFactory instance
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // don't validate against a DTD
            dbf.setValidating(false);
            // don't include comments in the tree
            dbf.setIgnoringComments(true);

            // ask the parser to drop whitespace in element content ???????????????
            dbf.setIgnoringElementContentWhitespace(true);

            DocumentBuilder builder = dbf.newDocumentBuilder();
            // parse the file into a Document
            Document doc = builder.parse(sFileName);
            // walk the DOM tree
            nList = doc.getChildNodes();
            nNode = nList.item(0); // root node
            nName = nNode.getNodeName();
            nValue = nNode.getNodeValue();
            nList = nNode.getChildNodes();
            chlNum = nList.getLength(); // length is 3, which I think is wrong. I want it to be 1!!!!
            for (int i = 0; i < chlNum; i++) {
                nNode = nList.item(i);
                System.out.println(nNode.getNodeName());
                System.out.println(nNode.getNodeValue());
            }
            ........
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (DOMException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
__________________________________________________________________________
<<TestLoad.xml>>
--------------------------------------------------------------------------
<?xml version="1.0" encoding="Shift_JIS"?>
<root>
<title attr0="0" attr1="1" attr2="2">
<content date="2002/01/18" />
</title>
</root>
--------------------------------------------------------------------------


13 replies (newest first)


flytsu 2002-02-01

Haha, you're welcome. Good luck!

Tosan 2002-02-01

Thank you, my friend flytsu(卡休)! I'll go there and download it now!

flytsu 2002-02-01

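Sorry, I gave the wrong address.
It should be: http://xml.apache.org/dist/xerces-j/Xerces-J-tools.2.0.0.zip

Once the download finishes, open the archive and find xerces.jar.
Then copy it into the lib directory under your JDK path.
Next, run the command jar xvf xerces.jar
After that you can use it.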

flytsu 2002-02-01

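It is probably a problem with the parser.
Whenever I parse XML in Java I use xerces-j. Its latest version at the moment is xerces-j2.0beta4.
Download it and give it a try.
Download address: http://xml.apache.org/dist/xerces-j/Xerces-J-bin.2.0.0.zip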

Tosan 2002-02-01

skyyoung(路人甲), could you be more specific? Also, how do I award points to other people?

Arzu 2002-02-01

You could try a different parser;
I always handle it by filtering.

Arzu 2002-02-01

Try adding a condition:
nNode.getNodeType() == Node.TEXT_NODE
&& nNode.getNodeValue().trim().length() == 0
(I haven't tried this, I just wrote it off the top of my head)
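
A whitespace check of this kind can be turned into a small helper that removes whitespace-only text nodes from the tree after parsing, so the rest of the program sees only the nodes it cares about. The following is a minimal sketch, not a definitive implementation: the class name StripWhitespaceDemo and the helper names parse and stripWhitespace are made up for illustration, and the XML string mirrors TestLoad.xml without the encoding declaration. Note that it also removes whitespace-only text inside mixed content, which may not be wanted for every document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class StripWhitespaceDemo {
    // Same structure as TestLoad.xml from the question (illustrative copy).
    static final String XML =
          "<root>\n"
        + "<title attr0=\"0\" attr1=\"1\" attr2=\"2\">\n"
        + "<content date=\"2002/01/18\" />\n"
        + "</title>\n"
        + "</root>\n";

    // Hypothetical helper: parse an XML string with the default, non-validating parser.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Recursively remove text nodes that contain only whitespace.
    static void stripWhitespace(Node node) {
        NodeList children = node.getChildNodes();
        // Iterate backwards because the NodeList is live and shrinks on removal.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().length() == 0) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripWhitespace(child);
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Node root = doc.getDocumentElement();
        System.out.println("before: " + root.getChildNodes().getLength());
        stripWhitespace(root);
        System.out.println("after: " + root.getChildNodes().getLength());
    }
}
```

The string comparison uses trim().length() == 0 rather than == "\n", because == on strings compares object identity in Java and would also miss whitespace made of spaces or tabs.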

skyyoung 2002-02-01

Switch to the Apache parser.

Tosan 2002-02-01

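To Arzu(大米):
I have considered the approach you describe too, but is there really no other way to ignore whitespace at the moment the DOM is created? Microsoft's msxml has this feature!!!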
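
For reference, the JAXP documentation for setIgnoringElementContentWhitespace notes that the setting relies on the element content model and therefore requires the parser to be in validating mode. TestLoad.xml declares no DTD, so the parser has no way of knowing which whitespace is ignorable. The sketch below is one way to check this, assuming an internal DTD is added to the document; the DTD and the class name IgnoreWhitespaceDemo are illustrative additions, not part of the original post. It uses getDocumentElement() because, once a DOCTYPE is present, the first child of the Document is the DocumentType node rather than the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class IgnoreWhitespaceDemo {
    // TestLoad.xml content plus an internal DTD (the DTD is an assumption added here).
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE root [\n"
        + "<!ELEMENT root (title)>\n"
        + "<!ELEMENT title (content)>\n"
        + "<!ATTLIST title attr0 CDATA #IMPLIED attr1 CDATA #IMPLIED attr2 CDATA #IMPLIED>\n"
        + "<!ELEMENT content EMPTY>\n"
        + "<!ATTLIST content date CDATA #IMPLIED>\n"
        + "]>\n"
        + "<root>\n"
        + "<title attr0=\"0\" attr1=\"1\" attr2=\"2\">\n"
        + "<content date=\"2002/01/18\" />\n"
        + "</title>\n"
        + "</root>\n";

    // Parse with validation on and return the number of children of <root>.
    static int countRootChildren() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Validation lets the parser use the DTD's content model.
            dbf.setValidating(true);
            dbf.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = dbf.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countRootChildren());
    }
}
```

With a parser that honours the setting, the count should come out as 1 instead of 3; whether a particular older parser behaves this way is something to verify rather than assume.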

Arzu 2002-02-01

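Generally speaking, the parser faithfully reproduces your file exactly as it is written.
So you always need to add some conditions of your own to filter the result.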

Tosan 2002-02-01

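The whitespace is not something I put there on purpose; the DOM simply has the notion that even a line break is treated as a child node.
Can anyone help me out?!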

Arzu 2002-02-01

Try this. Take your loop:
for (int i = 0; i < chlNum; i++) {
    nNode = nList.item(i);
    System.out.println(nNode.getNodeName());
    System.out.println(nNode.getNodeValue());
}
Add one check so that it becomes:
for (int i = 0; i < chlNum; i++) {
    nNode = nList.item(i);
    if (nNode.getNodeType() != Node.TEXT_NODE) {
        System.out.println(nNode.getNodeName());
        System.out.println(nNode.getNodeValue());
    }
}
During parsing, line breaks are treated as TEXT_NODE nodes, so some nodes have to be filtered out.
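
The same filtering idea can be wrapped in a reusable helper. The sketch below is a variation rather than the exact code above: instead of skipping TEXT_NODE children it keeps only ELEMENT_NODE children, which also skips comments and processing instructions. The names ElementChildrenDemo, parse, and elementChildren are hypothetical, and the XML string is an illustrative copy of TestLoad.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementChildrenDemo {
    // Same structure as TestLoad.xml from the question (illustrative copy).
    static final String XML =
          "<root>\n"
        + "<title attr0=\"0\" attr1=\"1\" attr2=\"2\">\n"
        + "<content date=\"2002/01/18\" />\n"
        + "</title>\n"
        + "</root>\n";

    // Hypothetical helper: parse an XML string with the default parser settings.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Return only the element children of a node, leaving the tree itself unchanged.
    static List<Element> elementChildren(Node parent) {
        List<Element> result = new ArrayList<Element>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Element root = parse(XML).getDocumentElement();
        for (Element child : elementChildren(root)) {
            System.out.println(child.getNodeName());
        }
    }
}
```

Because the helper does not modify the document, the whitespace text nodes are still present if the tree is later serialized; it only changes what the calling loop sees.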

skyyoung 2002-02-01