HttpClient reports "login timed out" after logging in with a CAPTCHA

lucky811 2016-05-26 05:51:58
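I have been studying web crawlers lately. I use HttpClient to log in to a site anonymously, entering the CAPTCHA by hand. The first page after login loads normally, but when I follow the next link the site reports a login timeout. I have set the cookies as well and have run out of ideas, so any pointers would be much appreciated. Part of the code is below: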
// The cookie policy is already set at the top of the file:
// httpClient.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
// CAPTCHA check succeeded; save the cookies
if (g1.getStatusCode() == 200 && sg31.indexOf("{\"msg\": \"success\", \"status\": 1}") != -1) {
    String cookiesimage3 = "";
    Cookie[] cookiesChildimage3 = httpClient.getState().getCookies();
    for (Cookie c : cookiesChildimage3) {
        cookiesimage3 += c.toString() + ";";
    }

    System.out.println("3:" + cookiesimage3);

    // Request the next page; the login still works at this point
    gmy = new GetMethod("http://xyq.cbg.163.com/static_file/558/buy_equip_list/equip_list1.html");
    gmy.setRequestHeader("Accept", "text/html, application/xhtml+xml, */*");
    gmy.setRequestHeader("Accept-Language", "zh-CN");
    gmy.setRequestHeader("Connection", "Keep-Alive");
    gmy.setRequestHeader("Host", "res.xyq.cbg.163.com"); // Proxy-Connection: keep-alive
    gmy.setRequestHeader("Referer", "http://xyq.cbg.163.com/");
    gmy.setRequestHeader("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)");
    // gmy.setRequestHeader("Cookie", cookiesimage3);
    gmy.setRequestHeader("Content-Type", "application/x-www-form-urlencoded;charset=GBK");

    httpClient.executeMethod(gmy);

    // Re-decode the body as GBK
    String sgzf1 = gmy.getResponseBodyAsString();
    sgzf1 = new String(sgzf1.getBytes("ISO-8859-1"), "GBK");

    if (gmy.getStatusCode() == 200) {
        String cookiesimage4 = "";
        Cookie[] cookiesChildimage4 = httpClient.getState().getCookies();
        for (Cookie c : cookiesChildimage4) {
            cookiesimage4 += c.toString() + ";";
        }
        // Verified: cookiesimage3 and cookiesimage4 are identical
        int a = cookiesimage4.compareTo(cookiesimage3);
        System.out.println("4:" + cookiesimage4 + a + "");

        // Now request the second page; this is where the login timeout appears
        gmy2 = new GetMethod("http://xyq.cbg.163.com/cgi-bin/query.py?act=query&server_id=558&areaid=58&server_name=%C7%E0%BB%A8%B4%C9&page=1&query_order=&kindid=23&kind_depth=2");

        gmy2.setRequestHeader("Accept", "text/html, application/xhtml+xml, */*");
        gmy2.setRequestHeader("Accept-Language", "zh-CN");
        gmy2.setRequestHeader("Connection", "Keep-Alive");
        gmy2.setRequestHeader("Host", "res.xyq.cbg.163.com"); // Proxy-Connection: keep-alive
        gmy2.setRequestHeader("Referer", "http://xyq.cbg.163.com/");
        gmy2.setRequestHeader("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)");
        gmy2.setRequestHeader("Cookie", cookiesimage3);

        httpClient.executeMethod(gmy2);
        // The response reports a login timeout. Where is the problem?
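One detail in the code above that may be worth checking: both requests go to xyq.cbg.163.com, yet they send a Host header of res.xyq.cbg.163.com. Nothing in this thread confirms that this causes the timeout, so treat it only as a candidate. The sketch below uses just the JDK's java.net.URI to derive the host a request URL implies, so it can be compared with a hand-set header. HostHeaderCheck, expectedHost and hostHeaderMatches are hypothetical names chosen for illustration.

```java
import java.net.URI;

class HostHeaderCheck {
    /** Returns the host of the URL, plus ":port" when the URL names a port explicitly. */
    public static String expectedHost(String url) {
        URI uri = URI.create(url);
        int port = uri.getPort(); // -1 when the URL has no explicit port
        return port == -1 ? uri.getHost() : uri.getHost() + ":" + port;
    }

    /** True when a hand-set Host header agrees with the host in the URL. */
    public static boolean hostHeaderMatches(String url, String hostHeader) {
        return expectedHost(url).equalsIgnoreCase(hostHeader.trim());
    }

    public static void main(String[] args) {
        String url = "http://xyq.cbg.163.com/cgi-bin/query.py?act=query&server_id=558";
        System.out.println("Host implied by URL: " + expectedHost(url));
        System.out.println("Header matches: " + hostHeaderMatches(url, "res.xyq.cbg.163.com"));
    }
}
```

If the two disagree, the simplest experiment is to drop the manual Host header and let the HTTP client fill it in from the URL.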
6 replies
keygod1 2016-06-03
The sid may be generated in JavaScript. Take a look at how the page generates it.
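If the sid really is produced by script rather than set as a cookie, one way to start is to search the fetched HTML and script text for the place where it is assigned. The sketch below uses only java.util.regex. The pattern (the name sid followed by = or : and an optionally quoted value) and the sample inputs are assumptions, because the thread does not show how the site actually emits the value. SidFinder and findSid are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SidFinder {
    // Assumed shapes: sid = "value", "sid": "value", or sid=value inside a URL.
    private static final Pattern SID =
            Pattern.compile("\\bsid\\b[\"']?\\s*[=:]\\s*[\"']?([A-Za-z0-9_\\-]+)");

    /** Returns the first sid-like value found in the text, if any. */
    public static Optional<String> findSid(String text) {
        Matcher m = SID.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up inputs, only to exercise the pattern.
        System.out.println(findSid("var sid = \"abc123\";"));
        System.out.println(findSid("query.py?act=query&server_id=558"));
    }
}
```

Running this over each response body and each downloaded .js file should narrow down which response first mentions the value. If nothing matches, the value is probably computed at run time, and the script itself has to be read.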
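lucky811 2016-06-01
It seems I still do not understand web pages very well. I spent the last few days studying this and found that navigating the site involves more than following the visible "next" link: the client and server exchange many requests in between. My new problem is:
As soon as I click the "梦幻币" (in-game currency) link, the HTTP request headers contain a sid value, yet I have checked every earlier link and cannot tell where this value comes from.
These are all the cookies I can get from the first page: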
lucky811 2016-05-27
Are there any experts around?
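lucky811 2016-05-27
If the CAPTCHA page is not passed, requesting the first page directly with HttpClient does not return a normal result either. But I can fetch the first page normally, which suggests the cookies are correct. So why does the error only appear when I request the second page with httpClient.executeMethod(gmy2)?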
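lucky811 2016-05-27
Thanks for the reply. The key point is that I have already passed the CAPTCHA page and successfully accessed one page. The problem only appears when I follow the next link after that.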
lucky811 2016-05-26
The HTML page I get back is below (the message 登录超时,请重新登录! means "Login timed out, please log in again!"):
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>梦幻西游藏宝阁</title>
<link href="http://res.xyq.cbg.163.com/rcc9d06907b1e19227c588/css/global.css?v=0111" rel="stylesheet" type="text/css" media="all" />
<link href="http://res.xyq.cbg.163.com/rcc9d06907b1e19227c588/css/pages.css?v=0620" rel="stylesheet" type="text/css" media="all" />
<script type="text/javascript" src="http://res.xyq.cbg.163.com/rcc9d06907b1e19227c588/js/mootools-core-1.3.2.js"></script>
<script type="text/javascript" src="http://res.xyq.cbg.163.com/rcc9d06907b1e19227c588/js/mootools-jsonp.js"></script>
<script type="text/javascript" src="http://res.xyq.cbg.163.com/rcc9d06907b1e19227c588/js/autocomplete.js"></script>
<script type="text/javascript" src="http://res.xyq.cbg.163.com/advertise/ad_data/cbg_bottom_ad.js"></script>
<script type="text/javascript" src="http://res.xyq.cbg.163.com/rcc9d06907b1e19227c588/js/util.js?v=0617"></script>
<script type="text/javascript" src="http://res.xyq.cbg.163.com/js/sprite.js"></script>
<script type="text/javascript" src="http://res.xyq.cbg.163.com/js/sprite-hotq.js"></script>
<script type="text/javascript">
var CgiRootUrl = 'http://xyq.cbg.163.com/cgi-bin';
var HttpsCgiRootUrl = 'https://xyq.cbg.163.com/cgi-bin';
var ResUrl = 'http://res.xyq.cbg.163.com';
</script>
</head>
<body>
<div class="areaTop"></div>
<div class="header area hasLayout">
  <div class="logo">
    <h1><a href="http://xyq.cbg.163.com" title="梦幻西游藏宝阁首页">梦幻西游藏宝阁</a></h1>
  </div>
  <a href="http://xyq.cbg.163.com/app/?from=cbgtoplink" class="mobileLink" target="_blank">手机版下载</a>
</div>
<div class="areaBtm"></div>
<div class="areaTop"></div>
<div class="subArea area">
  <!-- 其他各种提示类型的框开始 -->
  <div class="blank12"></div>
  <div class="win">
    <div class="blockTitle">
      <h3 class="f14px fB">操作提示</h3>
    </div>
    <div class="blockCont">
      <p class="cDYellow tips">登录超时,请重新登录!</p>
      <div class="blank12"></div>
      <p class="textCenter">
        <input type="button" value="重新登录" id="login_btn" class="btn1" />
      </p>
      <script type="text/javascript">
      var server_info = {"serverid": 558, "areaid": 58};
      if(server_info["serverid"]){
        $('login_btn').addEvent('click', function(){
          var url = 'http://xyq.cbg.163.com/cgi-bin/show_login.py?act=show_login&server_id='+server_info["serverid"];
          var server_name = Cookie.read('cur_servername');
          if(server_name) url += '&server_name=' + server_name;
          window.location = url;
          return;
        });
      } else{
        $('login_btn').addEvent('click', function(){
          window.location = '/';
          return;
        });
      }
      </script>
    </div>
  </div>
  <!-- 其他各种提示类型的框结束 -->
</div>
<div class="areaBtm"></div>
<!--页脚-->
<script>adjust_table_row_style();</script>
<div class="areaTop"></div>
<div class="footer area">
  手机藏宝阁地址:m.cbg.163.com<br />
  <script>
  __NIE_copyRight_siteName="xyq";
  __NIE_copyRight_whiteStyle=true;
  </script>
  <script language="JavaScript" type="text/javascript" src="http://res.nie.netease.com/comm/NIE_copyRight/NIE_copyRight.js"></script>
</div>
<div class="areaBtm"></div>
<div class="blank12"></div>
<script src="http://analytics.163.com/ntes.js" type="text/javascript"></script>
<script type="text/javascript">
_ntes_nacc = "cbgxyq"; //站点ID。
neteaseTracker();
</script>
</body>
</html>
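Because the failure arrives as an ordinary HTML page, checking the status code alone may not reveal it. A crawler can instead look for the timeout message in each decoded response body to learn which request lost the session. The sketch below is a minimal illustration: the marker string is copied from the page above, while LoginTimeoutDetector and isLoginTimeoutPage are hypothetical names, and other error pages on the site may be worded differently.

```java
class LoginTimeoutDetector {
    // Marker copied from the response above; "登录超时" means "login timed out".
    private static final String MARKER = "登录超时";

    /** True when the decoded HTML body contains the login-timeout message. */
    public static boolean isLoginTimeoutPage(String html) {
        return html != null && html.contains(MARKER);
    }

    public static void main(String[] args) {
        String body = "<p class=\"cDYellow tips\">登录超时,请重新登录!</p>";
        System.out.println(isLoginTimeoutPage(body));
    }
}
```

The check only works on a correctly decoded string, so the body has to be converted from the page's Chinese encoding (gb2312 according to its meta tag) before it is passed in.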
