About scraping web page content
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.InetAddress;
import java.net.Socket;

public class FirstSocket
{
    public static void main(String[] args)
    {
        String strServer = "s5.warlord.cn";
        String strPage = "/index.jsp";
        try
        {
            String hostname = strServer;
            int port = 80;
            InetAddress addr = InetAddress.getByName(hostname);
            Socket socket = new Socket(addr, port); // open a Socket to the server
            // send the request
            BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(
                    socket.getOutputStream(), "UTF-8"));
            wr.write("GET /module/map.jsp?lid=1&aid=276&tcid=undefined&session=ec4e42f7e7af HTTP/1.1");
            wr.write("Accept: */*");
            wr.write("Accept-Language: zh-cn");
            wr.write("Referer: http://s5.warlord.cn/main/index.jsp?session=ec4e42f7e7af#");
            wr.write("Accept-Encoding: gzip, deflate");
            wr.write("User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
            wr.write("Host: s5.warlord.cn");
            wr.write("Connection: Keep-Alive");
            wr.write("Cookie: newbieFlag=1; warlord2v3.identify=f13bd0e357e2d324; JSESSIONID=6030B5B0E7918C25D37ED5C08A733176");
            wr.flush();
            // read the response that comes back
            BufferedReader rd = new BufferedReader(new InputStreamReader(
                    socket.getInputStream()));
            String line;
            while ((line = rd.readLine()) != null)
            {
                System.out.println(line);
            }
            wr.close();
            rd.close();
        } catch (Exception e)
        {
            System.out.println(e.toString());
        }
    }
}
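What I want to capture is data from a browser-based game.
Entering this address in IE returns the content correctly,
but the code above, written this way, does not capture any content.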
The request lines in the middle of the code (the wr.write(...) calls from the GET line through the Cookie header) were copied from a packet capture of IE making the same request.
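
For comparison, here is a minimal sketch of how the same request is usually framed over a raw socket. It rests on one fact about HTTP/1.1 and a few assumptions. The fact: every request line and header line must end with CRLF, and the header block must end with an empty line; `wr.write(...)` adds no line terminator, so the code above sends all the headers as one unterminated line. The assumptions: `RawHttpGet`, `buildRequest`, and `fetch` are hypothetical names made up for this sketch; the response is assumed to be UTF-8 text; `Accept-Encoding: gzip, deflate` is left out so the body comes back uncompressed; and `Connection: close` replaces `Keep-Alive` so that the `readLine()` loop ends when the server closes the connection. Whether the session and cookie values are still accepted by the server is not something this sketch can confirm.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the original post.
class RawHttpGet {

    // Builds the request text. Each line ends with CRLF, and a final
    // empty line marks the end of the header block.
    static String buildRequest(String host, String path, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        // Ask the server to close the connection after the response,
        // so that reading until end-of-stream terminates.
        sb.append("Connection: close\r\n");
        sb.append("\r\n");
        return sb.toString();
    }

    // Sends the request and prints the raw response (status line, headers, body).
    // Assumes the response is UTF-8 text; adjust the charset if the server differs.
    static void fetch(String host, int port, String path, Map<String, String> headers)
            throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path, headers).getBytes(StandardCharsets.ISO_8859_1));
            out.flush();
            BufferedReader rd = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            String line;
            while ((line = rd.readLine()) != null) {
                System.out.println(line);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept", "*/*");
        headers.put("Accept-Language", "zh-cn");
        headers.put("Referer", "http://s5.warlord.cn/main/index.jsp?session=ec4e42f7e7af#");
        headers.put("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
        headers.put("Cookie", "newbieFlag=1; warlord2v3.identify=f13bd0e357e2d324; "
                + "JSESSIONID=6030B5B0E7918C25D37ED5C08A733176");
        // Only prints the request text here; no network access happens in main.
        System.out.print(buildRequest("s5.warlord.cn",
                "/module/map.jsp?lid=1&aid=276&tcid=undefined&session=ec4e42f7e7af", headers));
    }
}
```

To actually send the request, call `RawHttpGet.fetch("s5.warlord.cn", 80, path, headers)` with the same path and headers. An HTTP/1.1 server may still answer with chunked transfer encoding, in which case chunk-size lines show up between pieces of the body; `java.net.HttpURLConnection` handles request framing and chunked responses itself and may be the simpler route.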