Page scraping: local files can be fetched fine, but fetching anything else throws an error

tinyn 2014-02-28 11:30:44

The code is as follows:

import java.io.IOException;
import java.net.MalformedURLException;
import java.util.Arrays;

import org.xml.sax.SAXException;

import com.meterware.httpunit.GetMethodWebRequest;
import com.meterware.httpunit.PostMethodWebRequest;
import com.meterware.httpunit.WebConversation;
import com.meterware.httpunit.WebForm;
import com.meterware.httpunit.WebLink;
import com.meterware.httpunit.WebRequest;
import com.meterware.httpunit.WebResponse;
import com.meterware.httpunit.WebTable;

public class Test {

    public static void testGetHtmlContent() throws MalformedURLException,
            IOException, SAXException {
        System.out.println("Fetching page content directly:");
        WebConversation wc = new WebConversation();
        // ClientProperties client = wc.getClientProperties();
        // client.setUserAgent("Mozilla;");
        WebResponse wr = wc.getResponse("http://www.baidu.com/");
        System.out.println(wr.getText());
    }

    /*
     * Fetch page content with the GET method
     */
    public static void testGetMethod() throws MalformedURLException,
            IOException, SAXException {
        System.out.println("Sending data to the server, then fetching page content:");
        WebConversation wc = new WebConversation();
        WebRequest req = new GetMethodWebRequest("http://localhost:8080/test.html");
        req.setParameter("123", "aaa");
        WebResponse resp = wc.getResponse(req);
        System.out.println(resp.getText());
    }

    /*
     * Fetch page content with the POST method
     */
    public static void testPostMethod() throws MalformedURLException,
            IOException, SAXException {
        System.out.println("Sending data to the server via POST, then fetching page content:");
        WebConversation wc = new WebConversation();
        WebRequest req = new PostMethodWebRequest(
                "http://localhost:8080/test.html");
        req.setParameter("hsyj", "test");
        // req.setParameter("password", "111111");
        WebResponse resp = wc.getResponse(req);
        System.out.println(resp.getText());
    }

    /*
     * Simulate clicking a link
     */
    public static void testClickLink() throws MalformedURLException,
            IOException, SAXException {
        System.out.println("Fetching the content of the page a link points to:");
        WebConversation wc = new WebConversation();
        WebResponse resp = wc.getResponse("http://localhost:8080/test.html");
        // "阅读" ("Read") is the link text on the test page, so it stays as-is
        WebLink link = resp.getLinkWith("阅读");
        link.click();
        WebResponse nextLink = wc.getCurrentPage();
        System.out.println(nextLink.getText());
    }

    /*
     * Read the contents of a table on the page
     */
    public static void testTableContent() throws MalformedURLException,
            IOException, SAXException {
        System.out.println("Fetching the contents of a table in the page:");
        WebConversation wc = new WebConversation();
        WebResponse resp = wc.getResponse("http://localhost:8080/table.html");
        System.out.println(resp.getText());
        WebTable webTable = resp.getTables()[0];
        // Copy the table contents into a string array
        String[][] datas = webTable.asText();
        // Loop over and print the table contents
        int i = 0, j = 0;
        int m = datas[0].length;
        int n = datas.length;
        while (i < n) {
            j = 0;
            while (j < m) {
                System.out.println("Table row " + (i + 1) + ", column " + (j + 1) + " contains: "
                        + datas[i][j]);
                ++j;
            }
            ++i;
        }
    }

    /*
     * Read the contents of the form controls on the page
     */
    public static void testHtmlContentForm() throws MalformedURLException,
            IOException, SAXException {
        System.out.println("Fetching the contents of a form in the page:");
        WebConversation wc = new WebConversation();
        WebResponse resp = wc.getResponse("http://localhost:8080/test.html");
        System.out.println(resp.getText());
        // Get the form object
        WebForm webForm = resp.getForms()[0];
        // Get the names of all controls in the form
        String[] pNames = webForm.getParameterNames();
        int i = 0;
        int m = pNames.length;
        // Loop over and print the contents of every control in the form
        while (i < m) {
            // getParameterValues returns a String[], so format it with Arrays.toString
            System.out.println("Control " + (i + 1) + " is named " + pNames[i] + " and contains "
                    + Arrays.toString(webForm.getParameterValues(pNames[i])));
            ++i;
        }
    }

    public static void main(String[] args) throws MalformedURLException,
            IOException, SAXException {
        testGetHtmlContent();
        // testGetMethod();
        // testPostMethod();
        // testClickLink();
        // testTableContent();
        // testHtmlContentForm();
    }

}

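I have only just started with Java and don't understand it well yet. When I wrote this earlier with the WebClient class, I never ran into a problem like this.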
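A side note on testHtmlContentForm in the code above: getParameterValues returns a String[], and concatenating an array into a string prints its type name and hash code rather than its contents. A minimal sketch of the difference, using only the JDK; the class name ArrayPrintSketch and the sample values are made up for illustration and are not part of the original post.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original post or of HttpUnit.
final class ArrayPrintSketch {

    // Formats one form control as "name = [value1, value2, ...]".
    static String describe(String name, String[] values) {
        return name + " = " + Arrays.toString(values);
    }

    public static void main(String[] args) {
        String[] values = {"test"};
        // Direct concatenation prints something like [Ljava.lang.String;@1b6d3586
        System.out.println("direct: " + values);
        // Arrays.toString prints the actual elements
        System.out.println(describe("hsyj", values));
    }
}
```

The same formatting call can be applied to whatever array the form lookup returns.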


14 replies


yjafl2008 2014-09-26


The code above fails with the following error:

org.mozilla.javascript.EcmaError: TypeError: Cannot call method "match" of undefined
    at org.mozilla.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3229)
    at org.mozilla.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3219)
    at org.mozilla.javascript.ScriptRuntime.typeError(ScriptRuntime.java:3235)
    at org.mozilla.javascript.ScriptRuntime.typeError2(ScriptRuntime.java:3254)
    at org.mozilla.javascript.ScriptRuntime.undefCallError(ScriptRuntime.java:3273)
    at org.mozilla.javascript.ScriptRuntime.getPropFunctionAndThis(ScriptRuntime.java:1969)
    at org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:2932)
    at script(httpunit)
    at org.mozilla.javascript.Interpreter.interpret(Interpreter.java:2251)
    at org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:161)
    at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:340)
    at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:2758)
    at org.mozilla.javascript.InterpretedFunction.exec(InterpretedFunction.java:172)
    at org.mozilla.javascript.Context.evaluateString(Context.java:1132)
    at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:92)
    at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
    at com.meterware.httpunit.parsing.NekoDOMParser.runScript(NekoDOMParser.java:151)
    at com.meterware.httpunit.parsing.ScriptFilter.getTranslatedScript(ScriptFilter.java:150)
    at com.meterware.httpunit.parsing.ScriptFilter.endElement(ScriptFilter.java:131)
    at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:249)
    at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:367)
    at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1015)
    at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:888)
    at org.cyberneko.html.HTMLScanner$SpecialScanner.scan(HTMLScanner.java:2831)
    at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:809)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:478)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:431)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at com.meterware.httpunit.parsing.NekoHTMLParser.parse(NekoHTMLParser.java:48)
    at com.meterware.httpunit.HTMLPage.parse(HTMLPage.java:271)
    at com.meterware.httpunit.WebResponse.getReceivedPage(WebResponse.java:1301)
    at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
    at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
    at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
    at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
    at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
    at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
    at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
    at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
    at comparedb.HttpUnitSample.testGetHtmlContent(HttpUnitSample.java:28)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
com.meterware.httpunit.ScriptException: Script 'if(!location.hash.match(/[^a-zA-Z0-9]wd=/)){document.getElementById("ftCon").style.display='block';document.getElementById("u1").style.display='block';document.getElementById("content").style.display='block';document.getElementById("wrapper").style.display='block';setTimeout(function(){try{document.getElementById("kw1").focus();document.getElementById("kw1").parentNode.className += ' iptfocus';}catch(e){}},0);}' failed: org.mozilla.javascript.EcmaError: TypeError: Cannot call method "match" of undefined
    at com.meterware.httpunit.javascript.ScriptingEngineImpl.handleScriptException(ScriptingEngineImpl.java:64)
    at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:95)
    at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
    at com.meterware.httpunit.parsing.NekoDOMParser.runScript(NekoDOMParser.java:151)
    at com.meterware.httpunit.parsing.ScriptFilter.getTranslatedScript(ScriptFilter.java:150)
    at com.meterware.httpunit.parsing.ScriptFilter.endElement(ScriptFilter.java:131)
    at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:249)
    at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:367)
    at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1015)
    at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:888)
    at org.cyberneko.html.HTMLScanner$SpecialScanner.scan(HTMLScanner.java:2831)
    at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:809)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:478)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:431)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at com.meterware.httpunit.parsing.NekoHTMLParser.parse(NekoHTMLParser.java:48)
    at com.meterware.httpunit.HTMLPage.parse(HTMLPage.java:271)
    at com.meterware.httpunit.WebResponse.getReceivedPage(WebResponse.java:1301)
    at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
    at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
    at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
    at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
    at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
    at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
    at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
    at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
    at comparedb.HttpUnitSample.testGetHtmlContent(HttpUnitSample.java:28)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

yjafl2008 2014-09-26


public class HttpUnitSample {

    @Test
    public void testGetHtmlContent() throws MalformedURLException, IOException, SAXException {
        System.out.println("Fetching page content directly:");
        // Create a WebConversation instance
        WebConversation wc = new WebConversation();
        // Send a request to the given URL and get the response
        WebResponse wr = wc.getResponse("http://www.baidu.com");
        // Use getText to obtain the full content of the response
        // and print it to the console with System.out.println
        System.out.println(wr.getText());
    }

    public static void main(String args[]) throws MalformedURLException, IOException, SAXException {
        HttpUnitOptions.setScriptingEnabled(false);
        new HttpUnitSample().testGetHtmlContent();
        // HttpUnitSample.testGetHtmlContent();
    }
}

yjafl2008 2014-09-26


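@shunulaa Could you help me with this? I set this option and it still fails. It used to run fine, but when I ran it again some time later it started throwing this error.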
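One possible reason the option seems to have no effect here, judging only from the org.junit and RemoteTestRunner frames in the trace posted above: the test was started by the JUnit runner, which invokes the @Test method reflectively and never calls main, so the HttpUnitOptions.setScriptingEnabled(false) line inside main would not have run. This is a guess from the posted trace, not something confirmed in the thread. The sketch below imitates the situation with JDK classes only; GlobalOptions and SampleTest are hypothetical stand-ins for HttpUnitOptions and the test class, and no HttpUnit or JUnit code is involved.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins: GlobalOptions plays the role of a static switch
// such as HttpUnitOptions, SampleTest the role of a JUnit test class.
final class RunnerSketch {

    static final class GlobalOptions {
        static boolean scriptingEnabled = true; // default: scripts are run
    }

    public static final class SampleTest {
        public boolean sawScriptingEnabled;

        public void testGetHtmlContent() {
            // A real test would fetch the page here; this one only records the flag.
            sawScriptingEnabled = GlobalOptions.scriptingEnabled;
        }

        public static void main(String[] args) {
            GlobalOptions.scriptingEnabled = false; // only happens when main runs
            new SampleTest().testGetHtmlContent();
        }
    }

    // Roughly what a test runner does: create the object and invoke the
    // test method reflectively. main is never involved.
    static boolean runLikeATestRunner() {
        try {
            GlobalOptions.scriptingEnabled = true; // simulate a fresh JVM
            SampleTest test = new SampleTest();
            Method m = SampleTest.class.getMethod("testGetHtmlContent");
            m.invoke(test);
            return test.sawScriptingEnabled;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // What running the class as a plain Java application does: go through main.
    static boolean runViaMain() {
        GlobalOptions.scriptingEnabled = true; // simulate a fresh JVM
        SampleTest.main(new String[0]);
        return GlobalOptions.scriptingEnabled;
    }

    public static void main(String[] args) {
        System.out.println("runner sees scripting enabled: " + runLikeATestRunner());
        System.out.println("main leaves scripting enabled: " + runViaMain());
    }
}
```

If that is what happened, calling HttpUnitOptions.setScriptingEnabled(false) at the start of the test method itself, or in a JUnit 4 @BeforeClass method, should make the setting apply whichever way the class is run.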

tinyn 2014-02-28


The thing is, I never wrote any match method, and everything works fine when I test against a local address.

花谢尊前不敢香 2014-02-28


Cannot call method "match" of undefined: the match method is not defined.

tinyn 2014-02-28


The error is as follows:

org.mozilla.javascript.EcmaError: TypeError: Cannot call method "match" of undefined
    at org.mozilla.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3654)
    at org.mozilla.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3632)
    at org.mozilla.javascript.ScriptRuntime.typeError(ScriptRuntime.java:3660)
    at org.mozilla.javascript.ScriptRuntime.typeError2(ScriptRuntime.java:3679)
    at org.mozilla.javascript.ScriptRuntime.undefCallError(ScriptRuntime.java:3698)
    at org.mozilla.javascript.ScriptRuntime.getPropFunctionAndThisHelper(ScriptRuntime.java:2221)
    at org.mozilla.javascript.ScriptRuntime.getPropFunctionAndThis(ScriptRuntime.java:2214)
    at org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:3143)
    at script(httpunit)
    at org.mozilla.javascript.Interpreter.interpret(Interpreter.java:2487)
    at org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:164)
    at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:398)
    at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3065)
    at org.mozilla.javascript.InterpretedFunction.exec(InterpretedFunction.java:175)
    at org.mozilla.javascript.Context.evaluateString(Context.java:1104)
    at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:92)
    at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
    at com.meterware.httpunit.parsing.NekoDOMParser.runScript(NekoDOMParser.java:151)
    at com.meterware.httpunit.parsing.ScriptFilter.getTranslatedScript(ScriptFilter.java:150)
    at com.meterware.httpunit.parsing.ScriptFilter.endElement(ScriptFilter.java:131)
    at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1169)
    at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1071)
    at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
    at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
    at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3074)
    at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2041)
    at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:918)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at com.meterware.httpunit.parsing.NekoHTMLParser.parse(NekoHTMLParser.java:48)
    at com.meterware.httpunit.HTMLPage.parse(HTMLPage.java:271)
    at com.meterware.httpunit.WebResponse.getReceivedPage(WebResponse.java:1301)
    at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
    at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
    at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
    at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
    at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
    at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
    at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
    at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
    at Test.testGetHtmlContent(Test.java:24)
    at Test.main(Test.java:125)
Exception in thread "main" com.meterware.httpunit.ScriptException: Script 'if(!location.hash.match(/[^a-zA-Z0-9]wd=/)){document.getElementById("ftCon").style.display='block';document.getElementById("u1").style.display='block';document.getElementById("content").style.display='block';document.getElementById("wrapper").style.display='block';setTimeout(function(){try{document.getElementById("kw1").focus();}catch(e){}},0);}' failed: org.mozilla.javascript.EcmaError: TypeError: Cannot call method "match" of undefined
    at com.meterware.httpunit.javascript.ScriptingEngineImpl.handleScriptException(ScriptingEngineImpl.java:64)
    at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:95)
    at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
    at com.meterware.httpunit.parsing.NekoDOMParser.runScript(NekoDOMParser.java:151)
    at com.meterware.httpunit.parsing.ScriptFilter.getTranslatedScript(ScriptFilter.java:150)
    at com.meterware.httpunit.parsing.ScriptFilter.endElement(ScriptFilter.java:131)
    at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1169)
    at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1071)
    at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
    at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
    at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3074)
    at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2041)
    at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:918)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at com.meterware.httpunit.parsing.NekoHTMLParser.parse(NekoHTMLParser.java:48)
    at com.meterware.httpunit.HTMLPage.parse(HTMLPage.java:271)
    at com.meterware.httpunit.WebResponse.getReceivedPage(WebResponse.java:1301)
    at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
    at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
    at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
    at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
    at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
    at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
    at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
    at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
    at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
    at Test.testGetHtmlContent(Test.java:24)
    at Test.main(Test.java:125)

tinyn 2014-02-28


Quoting shnulaa's reply (#9):

HttpUnitOptions.setScriptingEnabled(false)

Problem solved! Thank you so much!

晓风吹雾 2014-02-28



HttpUnitOptions.setScriptingEnabled(false)

tinyn 2014-02-28


Quoting shnulaa's reply (#7):

The error occurs while parsing the JS. The cause is
'if(!location.hash.match(/[^a-zA-Z0-9]wd=/)){document.getElementById("ftCon").style.display='block';document.getElementById("u1")
location.hash is undefined. If you don't need the JS parsed, maybe you can simply skip JS parsing.

I think that's the problem too. I used to use the WebClient class, written like this:

final WebClient webClient = new WebClient();
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setJavaScriptEnabled(false);
final HtmlPage page = webClient.getPage("http://www.baidu.com");
final String pageAsXml = page.asXml();

That turned off both CSS and JS. But after switching to the WebConversation class, I just can't find where to set this.

晓风吹雾 2014-02-28


The error occurs while parsing the JS. The cause is

'if(!location.hash.match(/[^a-zA-Z0-9]wd=/)){document.getElementById("ftCon").style.display='block';document.getElementById("u1")

location.hash is undefined. If you don't need the JS parsed, maybe you can simply skip JS parsing.
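If only the raw HTML text is needed, as in testGetHtmlContent, another way to avoid script parsing altogether is to download the body with the JDK's own classes, which treat inline scripts as plain text and never execute them. This is an alternative sketch, not the fix given in this thread (that is HttpUnitOptions.setScriptingEnabled(false)). The class name RawFetchSketch, the 5-second timeouts and the UTF-8 charset are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpUnit: downloads a page as plain text.
final class RawFetchSketch {

    // Reads the whole stream and decodes it as UTF-8 (an assumed charset).
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens the URL and returns the body untouched; any <script> in it is just text.
    static String fetchText(String url) {
        try {
            URLConnection conn = URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(5000); // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);
            try (InputStream in = conn.getInputStream()) {
                return readAll(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String url = args.length > 0 ? args[0] : "http://www.baidu.com/";
        System.out.println(fetchText(url));
    }
}
```

The trade-off is that nothing HttpUnit offers on top of the text (forms, links, tables) is available this way, so it only suits the plain "print the page source" case.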

tinyn 2014-02-28


So is this because httpunit can't handle JS?
How do I fix it?

晓风吹雾 2014-02-28


location.hash == undefined

Defonds 2014-02-28


Check where the match method is being called.
