HttpClient 4.2 cannot fetch a PHP site served over HTTPS; looking for a solution

阿飞wpf 2015-08-06 01:26:33
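The site is written in PHP. HttpClient cannot fetch its pages: the returned status is 200 and the Content-Type header is text/html;charset=utf-8, but the body is empty and I don't know why. A packet-capture tool and Firebug both show the response page, and I have tried setting all sorts of request header parameters, still without success.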



Code:

public static String getData(DefaultHttpClient httpClient, HttpGet httpGet) throws Exception {
    String reValue = "";
    try {
        httpClient.getParams().setParameter("http.socket.timeout", 1000);
        httpClient.getParams().setParameter("http.connection.timeout", 1000);
        // HttpClient 4.x reads the connection-manager timeout from "http.conn-manager.timeout" (a Long)
        httpClient.getParams().setParameter("http.conn-manager.timeout", 60L * 60L * 1000L);
        // httpClient.getParams().setParameter("http.tcp.nodelay",false);
        // Note: client parameters are not request headers. The attempts below would not
        // send these headers; a header is set on the request, e.g. httpGet.setHeader(name, value).
        // httpClient.getParams().setParameter("Accept-Encoding", "gzip, deflate");
        // httpClient.getParams().setParameter("Accept","text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        // httpClient.getParams().setParameter("Accept-Language","zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3");
        // httpClient.getParams().setParameter("Host", "192.168.97.225");
        // httpClient.getParams().setParameter("User-Agent","Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0");

        HttpResponse response = httpClient.execute(httpGet);
        int state = response.getStatusLine().getStatusCode();
        if (state == HttpStatus.SC_OK) { // the request succeeded
            reValue = EntityUtils.toString(response.getEntity());

            System.out.println("state:::" + state);
            System.out.println("reValue:::" + reValue);
        }
    } catch (Exception e) {
        httpGet.abort();
        throw e;
    } finally {
        // httpClient.getConnectionManager().shutdown(); // not shut down for now, because data is fetched repeatedly
    }
    return reValue;
}




public static void main(String[] args) {
    DefaultHttpClient httpClient = (DefaultHttpClient) getHttpClient(new DefaultHttpClient());

    try {
        HttpGet getPage = new HttpGet("https://192.168.97.225/");
        getData(httpClient, getPage);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

public static HttpClient getHttpClient(HttpClient base) {
    try {
        SSLContext ctx = SSLContext.getInstance("TLS");
        // Trust manager that accepts every certificate (test environments only)
        X509TrustManager tm = new X509TrustManager() {
            public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException {
            }
            public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {
            }
            public X509Certificate[] getAcceptedIssuers() {
                // the interface contract asks for a non-null (possibly empty) array
                return new X509Certificate[0];
            }
        };

        ctx.init(null, new TrustManager[] { tm }, null);
        SSLSocketFactory ssf = new SSLSocketFactory(ctx, SSLSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER);
        ClientConnectionManager ccm = base.getConnectionManager();
        SchemeRegistry sr = ccm.getSchemeRegistry();
        sr.register(new Scheme("https", 443, ssf));
        return new DefaultHttpClient(ccm, base.getParams());
    } catch (Exception ex) {
        ex.printStackTrace();
        return null;
    }
}
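
For anyone who wants to check the trust-all part separately from HttpClient, here is a minimal JDK-only sketch of the same idea. The class name TrustAllSketch and its methods are made up for illustration; only the javax.net.ssl types are real. It accepts every certificate, so it is only suitable for a test server like the one in this post.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper class, not part of the original post.
final class TrustAllSketch {

    /** Trust manager that accepts every certificate chain (test use only). */
    static X509TrustManager trustAllManager() {
        return new X509TrustManager() {
            @Override
            public void checkClientTrusted(X509Certificate[] chain, String authType) {
                // accept everything
            }

            @Override
            public void checkServerTrusted(X509Certificate[] chain, String authType) {
                // accept everything
            }

            @Override
            public X509Certificate[] getAcceptedIssuers() {
                // return an empty array rather than null
                return new X509Certificate[0];
            }
        };
    }

    /** Builds a TLS context backed by the trust-all manager. */
    static SSLContext trustAllContext() {
        try {
            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(null, new TrustManager[] { trustAllManager() }, null);
            return ctx;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not build SSL context", e);
        }
    }

    public static void main(String[] args) {
        SSLContext ctx = trustAllContext();
        System.out.println("protocol: " + ctx.getProtocol());
    }
}
```

The resulting context can be passed to new SSLSocketFactory(ctx, ...) as in the code above, or to HttpsURLConnection.setSSLSocketFactory(ctx.getSocketFactory()) to cross-check the server without HttpClient.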

Packet capture:
Response headers
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Connection: Keep-Alive
Content-Length: 5985
Content-Type: text/html;charset=utf-8
Date: Wed, 05 Aug 2015 06:35:11 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Keep-Alive: timeout=5, max=100
Pragma: no-cache
Server: Apache/2.4.10 (Unix) OpenSSL/1.0.0o mod_fcgid/2.3.9 PHP/5.5.18
Set-Cookie: PHPSESSID=29vd0u3nusbpjqne0n4ff1md70
X-Powered-By: PHP/5.5.18
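
The capture shows Content-Length: 5985, so one way to narrow the problem down is to count the bytes that actually arrive and decode them with the charset from Content-Type. Below is a rough JDK-only sketch; the class BodyCheckSketch and its method names are hypothetical. With HttpClient the stream and declared length would come from entity.getContent() and entity.getContentLength().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original post.
final class BodyCheckSketch {

    /** Reads the stream to the end and returns every byte that arrived. */
    static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the charset named in a Content-Type value, or the fallback if none is usable. */
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // unknown or malformed charset name
                }
            }
        }
        return fallback;
    }

    /** True when the number of bytes received equals the declared Content-Length. */
    static boolean lengthMatches(long declaredLength, byte[] body) {
        return declaredLength == body.length;
    }

    public static void main(String[] args) {
        // Stand-in body; a real run would read the response stream instead.
        byte[] body = readFully(new ByteArrayInputStream("<html></html>".getBytes(StandardCharsets.UTF_8)));
        Charset cs = charsetOf("text/html;charset=utf-8", StandardCharsets.ISO_8859_1);
        System.out.println("received " + body.length + " bytes, charset " + cs);
        System.out.println("matches declared 5985: " + lengthMatches(5985, body));
    }
}
```

If the byte count is 0 while the server declares 5985, the body is being lost before decoding; if the count matches, the problem is more likely in how the text is decoded or printed.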

Request headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3
Connection: keep-alive
Cookie: PHPSESSID=29vd0u3nusbpjqne0n4ff1md70
Host: 192.168.97.225
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0
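
To replay the browser's request headers outside HttpClient, something like the JDK-only sketch below could serve as a cross-check. ReplayHeadersSketch and its methods are invented names. Accept-Encoding is deliberately left out, since HttpURLConnection does not decompress gzip on its own, and Host and Connection are left to the connection itself. In HttpClient the equivalent call would be httpGet.setHeader(name, value).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original post.
final class ReplayHeadersSketch {

    /** Browser headers copied from the capture (Accept-Encoding, Host, Connection omitted). */
    static Map<String, String> capturedHeaders() {
        Map<String, String> h = new LinkedHashMap<>();
        h.put("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        h.put("Accept-Language", "zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3");
        h.put("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0");
        return h;
    }

    /** Creates a connection object with the headers applied; nothing is sent yet. */
    static HttpURLConnection prepare(String url, Map<String, String> headers) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            for (Map.Entry<String, String> e : headers.entrySet()) {
                conn.setRequestProperty(e.getKey(), e.getValue());
            }
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Only builds the request; calling conn.getInputStream() would actually send it.
        HttpURLConnection conn = prepare("https://192.168.97.225/", capturedHeaders());
        System.out.println("User-Agent to be sent: " + conn.getRequestProperty("User-Agent"));
    }
}
```

For the self-signed HTTPS server in this post, the connection would also need a permissive SSL socket factory and hostname verifier before getInputStream() is called.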

The response is the login page.
Cookies
PHPSESSID 29vd0u3nusbpjqne0n4ff1md70 (content) 29vd0u3nusbpjqne0n4ff1md70 (raw content)
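
One visible difference between the browser capture and the posted code is that the browser sends a Cookie header carrying PHPSESSID, while the first request from a fresh client would not. Whether that explains the empty body is not established here. A small JDK-only sketch for turning a Set-Cookie value into a Cookie header value is below; SessionCookieSketch is a made-up name, java.net.HttpCookie is the real JDK class.

```java
import java.net.HttpCookie;
import java.util.List;

// Hypothetical helper class, not part of the original post.
final class SessionCookieSketch {

    /** Converts a Set-Cookie header value into "name=value" pairs for a Cookie request header. */
    static String cookieHeaderFrom(String setCookieValue) {
        List<HttpCookie> cookies = HttpCookie.parse(setCookieValue);
        StringBuilder sb = new StringBuilder();
        for (HttpCookie c : cookies) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(c.getName()).append('=').append(c.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Value taken from the Set-Cookie line in the capture.
        System.out.println(cookieHeaderFrom("PHPSESSID=29vd0u3nusbpjqne0n4ff1md70"));
    }
}
```

The returned string could then be sent on a follow-up request, for example with httpGet.setHeader("Cookie", value) in HttpClient.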