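Java web crawler: HTTPS connection/certificate problem with HttpClient 4. Any help would be appreciated, this is urgent!
Full code:
1. The entry point login1 takes a URL and performs the identity check. The first URL is http://****.com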
public static void login1(String url, int ***) {
    HttpClient httpClient = new DefaultHttpClient(new ThreadSafeClientConnManager());
    HttpClientParams.setCookiePolicy(httpClient.getParams(), CookiePolicy.BROWSER_COMPATIBILITY);
    HttpHost httpHost = new HttpHost("***.com");
    HttpGet httpGet = new HttpGet(url);
    HttpResponse response = null;
    try {
        response = httpClient.execute(httpHost, httpGet);
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    if (HttpStatus.SC_OK == response.getStatusLine().getStatusCode()) {
        // Request succeeded
        // Get the response content
        HttpEntity entity = response.getEntity();
        // Show the content
        if (entity != null) {
            // Show the result
            try {
                // EntityUtils.toString(entity, "utf-8");
                // System.out.println("login1::::::"+html);
                // httpConsume(response,httpGet,httpClient);
            } catch (ParseException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
        getTxlinfo(httpClient);
    }
}

2. Once the first URL has loaded, the current HttpClient is passed on to the next processing step
private static void getTxlinfo(HttpClient httpClient) {
    // HttpClient httpClient = new DefaultHttpClient();
    // Pass the current httpClient to wrapClient, which switches the
    // request mode to https and loads the certificate automatically
    httpClient = wrapClient(httpClient);
    HttpClientParams.setCookiePolicy(httpClient.getParams(), CookiePolicy.BROWSER_COMPATIBILITY);
    HttpHost httpHost = new HttpHost("***.com");
    HttpGet httpGet = new HttpGet("https://***.com/**/contact.jsp");
    HttpResponse response = null;
    try {
        response = httpClient.execute(httpHost, httpGet);
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    if (HttpStatus.SC_OK == response.getStatusLine().getStatusCode()) {
        // Request succeeded
        // Get the response content
        HttpEntity entity = response.getEntity();
        // Show the content
        if (entity != null) {
            // Show the result
            try {
                String html;
                try {
                    html = EntityUtils.toString(entity, "utf-8");
                    System.out.println("login1::::::" + html);
                } catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
                httpConsume(response, httpGet, httpClient);
            } catch (ParseException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}

private static void httpConsume(HttpResponse response, HttpGet httpGet, HttpClient httpClient) {
    try {
        EntityUtils.consume(response.getEntity());
    } catch (IOException e) {
        e.printStackTrace();
    }
    httpGet.abort();
    httpGet = null;
    httpClient = null;
}
/**
 * Takes the current HttpClient and returns an HttpClient set up for https
 */
public static HttpClient wrapClient(HttpClient base) {
    try {
        SSLContext ctx = SSLContext.getInstance("TLS");
        X509TrustManager tm = new X509TrustManager() {
            public X509Certificate[] getAcceptedIssuers() {
                return null;
            }

            @Override
            public void checkClientTrusted(X509Certificate[] arg0, String arg1) throws CertificateException {}

            @Override
            public void checkServerTrusted(X509Certificate[] arg0, String arg1) throws CertificateException {}
        };
        ctx.init(null, new TrustManager[] { tm }, null);
        SSLSocketFactory ssf = new SSLSocketFactory(ctx, SSLSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER);
        SchemeRegistry registry = new SchemeRegistry();
        registry.register(new Scheme("https", 443, ssf));
        ThreadSafeClientConnManager mgr = new ThreadSafeClientConnManager(registry);
        return new DefaultHttpClient(mgr, base.getParams());
    } catch (Exception ex) {
        ex.printStackTrace();
        return null;
    }
}
Running the program throws the following exception:
org.apache.http.client.ClientProtocolException
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:822)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:776)
    at https.getTxl.getTxlinfo(getTxl.java:156)
    at https.getTxl.login1(getTxl.java:140)
    at https.getTxl.getDBdata(getTxl.java:77)
    at https.getTxl.main(getTxl.java:39)
Caused by: org.apache.http.HttpException: Scheme 'http' not registered.
    at org.apache.http.impl.conn.DefaultHttpRoutePlanner.determineRoute(DefaultHttpRoutePlanner.java:115)
    at org.apache.http.impl.client.DefaultRequestDirector.determineRoute(DefaultRequestDirector.java:721)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:358)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
    ... 5 more
Exception in thread "main" java.lang.NullPointerException
    at https.getTxl.getTxlinfo(getTxl.java:162)
    at https.getTxl.login1(getTxl.java:140)
    at https.getTxl.getDBdata(getTxl.java:77)
    at https.getTxl.main(getTxl.java:39)
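
One reading of the trace, offered as a guess rather than a confirmed fix: the SchemeRegistry built in wrapClient registers only "https", while new HttpHost("***.com") defaults to the "http" scheme, so the route planner looks up a scheme that was never registered. The NullPointerException would then just be a follow-on, since response is still null after the first exception is caught. The sketch below assumes HttpClient 4.1 or later. The class SchemeRegistrySketch and the helpers buildRegistry and httpsHost are hypothetical names for illustration, and example.com is a placeholder host.

```java
import org.apache.http.HttpHost;
import org.apache.http.conn.scheme.PlainSocketFactory;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.conn.ssl.SSLSocketFactory;

public class SchemeRegistrySketch {

    // Hypothetical helper: builds a registry that knows both schemes,
    // so a target host with the default "http" scheme can still be routed.
    public static SchemeRegistry buildRegistry(SSLSocketFactory ssf) {
        SchemeRegistry registry = new SchemeRegistry();
        registry.register(new Scheme("http", 80, PlainSocketFactory.getSocketFactory()));
        registry.register(new Scheme("https", 443, ssf));
        return registry;
    }

    // Hypothetical helper: names the https scheme explicitly, because
    // new HttpHost(hostname) alone falls back to "http".
    public static HttpHost httpsHost(String hostname) {
        return new HttpHost(hostname, 443, "https");
    }

    public static void main(String[] args) {
        // The default SSL socket factory is used here only to keep the sketch
        // self-contained; in wrapClient the custom ssf would be passed instead.
        SchemeRegistry registry = buildRegistry(SSLSocketFactory.getSocketFactory());
        System.out.println(registry.getSchemeNames());
        System.out.println(httpsHost("example.com").toURI());
    }
}
```

If this reading is right, wrapClient would pass a registry like this to ThreadSafeClientConnManager, and getTxlinfo would call execute with a host built like httpsHost. Checking response for null before calling getStatusLine() would also keep the first exception from being masked by the second.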
Any guidance from the experts would be much appreciated.