How can I scrape the table content from this web page? It is tricky: ordinary scraper scripts do not work.

好记忆不如烂笔头abc 2005-09-22 01:47:09
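The URL to scrape is: http://www.whois.sc/221.216.169.120

Question: how can I fetch the information in the table in the middle of that page and display it?

Ordinary page-scraping scripts do not seem to work. Please test that your code actually retrieves the data before posting a reply. Thanks.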
20 replies
Shewontloveme 2005-11-20
Following along to learn.
晓疯馋曰 2005-09-23
I tried it, and what I get back is the same as what IE shows when it opens the page.
Is that not right?

To the previous poster: did you actually try it? It does not work.
The result I get is:
To see the Whois Record for 221.216.169.120 you will need to sign-up for a free account. We restrict how many whois records we give out to anonymous users per day. Sorry for the precaution but we need to limit wandering robots for the protection of everyone.

That is not the correct information for the IP address 221.216.169.120.
晓疯馋曰 2005-09-23
<%
' Convert a raw byte array (e.g. responseBody) into a VBScript string.
' Bytes below &H80 are treated as ASCII; any other byte is paired with the
' byte that follows it and passed to Chr() as one double-byte character.
Function bytes2BSTR(vIn)
    Dim strReturn, i, ThisCharCode, NextCharCode
    strReturn = ""
    For i = 1 To LenB(vIn)
        ThisCharCode = AscB(MidB(vIn, i, 1))
        If ThisCharCode < &H80 Then
            strReturn = strReturn & Chr(ThisCharCode)
        Else
            NextCharCode = AscB(MidB(vIn, i + 1, 1))
            strReturn = strReturn & Chr(CLng(ThisCharCode) * &H100 + CInt(NextCharCode))
            i = i + 1
        End If
    Next
    bytes2BSTR = strReturn
End Function

' Fetch strurl with a GET request and return the raw response bytes.
Function PostData(strurl)
    Dim mXmlHttp
    Set mXmlHttp = Server.CreateObject("Msxml2.ServerXMLHTTP")
    mXmlHttp.open "GET", strurl, False
    ' Send the same headers a browser would, including a session cookie.
    mXmlHttp.setRequestHeader "Accept", "*/*"
    mXmlHttp.setRequestHeader "Accept-Language", "zh-cn"
    mXmlHttp.setRequestHeader "UA-CPU", "x86"
    ' If the output comes back garbled, try dropping this header:
    ' the response body may then arrive compressed.
    mXmlHttp.setRequestHeader "Accept-Encoding", "gzip, deflate"
    mXmlHttp.setRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)"
    mXmlHttp.setRequestHeader "Host", "www.whois.sc"
    mXmlHttp.setRequestHeader "Connection", "Keep-Alive"
    mXmlHttp.setRequestHeader "Cookie", "ss=732834423dc939d2d1db1b01d524abc5"
    mXmlHttp.send
    PostData = mXmlHttp.responseBody
    Set mXmlHttp = Nothing
End Function

Dim str
str = PostData("http://www.whois.sc/221.216.169.120")
Response.Write bytes2BSTR(str)
%>
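
The request above differs from a bare fetch mainly in that it sends browser-like headers and a session cookie. A minimal sketch of the same idea in Python, using only the standard library, might look like this. The helper names `build_request` and `fetch` are hypothetical, the cookie value is simply the one quoted in the post (a session value that has presumably long expired), and nothing here is confirmed to get past the site's anonymous-user limit.

```python
import urllib.request

# Header values copied from the ASP post above. The cookie is a session
# value from 2005 and is shown only to illustrate the mechanism.
BROWSER_HEADERS = {
    "Accept": "*/*",
    "Accept-Language": "zh-cn",
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)",
    "Cookie": "ss=732834423dc939d2d1db1b01d524abc5",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Return a GET request carrying browser-like headers (hypothetical helper)."""
    req = urllib.request.Request(url, method="GET")
    for name, value in headers.items():
        req.add_header(name, value)
    return req


def fetch(url, timeout=10):
    """Send the request and return the raw response bytes (hypothetical helper)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    # Only builds the request; call fetch() to actually contact the site.
    req = build_request("http://www.whois.sc/221.216.169.120")
    print(req.get_method(), req.full_url)
```

The `Accept-Encoding` header is left out of this sketch on purpose, because `urllib` does not decompress gzip responses on its own.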

Really? Is this the result you get?
To see the Whois Record for 221.216.169.120 you will need to sign-up for a free account. We restrict how many whois records we give out to anonymous users per day. Sorry for the precaution but we need to limit wandering robots for the protection of everyone.
That output is wrong.

The correct output should be:
221.216.169.120

Blacklist Status: Listed - Cached Today (details)
Cached Whois: Cached today
Whois History: 2 records stored
Oldest: 2005-09-21
Newest: 2005-09-22
Record Type: IP Address
IP Location: China - Beijing - Beijing - Cncgroup Beijing Province Network
Reverse IP: No websites hosted using this IP address
Reverse DNS: not set


--------------------------------------------------------------------------------
% [whois.apnic.net node-2]
% Whois data copyright terms http://www.apnic.net/db/dbcopyright.html

inetnum: 221.216.0.0 - 221.223.255.255
netname: CNCGROUP-BJ
descr: CNCGROUP Beijing province network
descr: China Network Communications Group Corporation
descr: No.156,Fu-Xing-Men-Nei Street,
descr: Beijing 100031
country: CN
admin-c: CH455-AP
tech-c: SY21-AP
mnt-by: APNIC-HM
mnt-lower: MAINT-CNCGROUP-BJ
changed: 20031119
status: ALLOCATED PORTABLE
source: APNIC

role: CNCGroup Hostmaster
e-mail:
address: No.156,Fu-Xing-Men-Nei Street,
address: Beijing,100031,P.R.China
nic-hdl: CH455-AP
phone: +86-10-82993155
fax-no: +86-10-82993102
country: CN
admin-c: CH444-AP
tech-c: CH444-AP
changed: 20041119
mnt-by: MAINT-CNCGROUP
source: APNIC

person: sun ying
address: Beijing Telecommunication Administration
address: TaiPingHu DongLi 18, Xicheng District
address: Beijing 100031
country: CN
phone: +86-10-66198941
fax-no: +86-10-68511003
e-mail:
nic-hdl: SY21-AP
mnt-by: MAINT-CHINANET-BJ
changed: 19980824
source: APNIC
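
The two outputs quoted above can be told apart mechanically, which helps when testing a scraper before posting it. A rough Python sketch follows. The function name `classify_response` and the two marker strings are assumptions taken from the text quoted in this thread, not from any documented behavior of the site.

```python
# Phrase from the anonymous-user notice quoted in the thread.
BLOCK_NOTICE = "you will need to sign-up for a free account"
# Field that appears in the real APNIC record quoted in the thread.
RECORD_FIELD = "inetnum:"


def classify_response(text):
    """Return 'blocked', 'record', or 'unknown' for a fetched page (hypothetical helper)."""
    lowered = text.lower()
    if BLOCK_NOTICE in lowered:
        return "blocked"
    if RECORD_FIELD in lowered:
        return "record"
    return "unknown"


if __name__ == "__main__":
    notice = ("To see the Whois Record for 221.216.169.120 you will need "
              "to sign-up for a free account.")
    record = "inetnum: 221.216.0.0 - 221.223.255.255\nnetname: CNCGROUP-BJ"
    print(classify_response(notice))   # blocked
    print(classify_response(record))   # record
```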
2599 2005-09-23
The whole page, or just part of it?
fantiny 2005-09-23
I just compared them: what I scraped is exactly the same as the original page.

Bump.

The page you are seeing is the one without the IP address information in the table. That is not the right one.
fantiny 2005-09-22
Then why am I able to fetch it here?
HHH3000 2005-09-22
Indeed. I also tried fetching with both xmlhttp and cdo.message, and neither worked...
yyy502 2005-09-22
<%@LANGUAGE="VBSCRIPT" CODEPAGE="CP_ACP"%>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Shift-JIS">
<%
' Fetch url and return the response decoded as Shift-JIS text.
Function getHTTPPage(url)
    Dim Http
    Set Http = Server.CreateObject("MSXML2.XMLHTTP")
    'Http.open "GET", url, False
    Http.open "POST", url, False
    Http.send
    If Http.readyState <> 4 Then
        Exit Function
    End If
    'getHTTPPage = BytesToBstr(Http.responseBody, "GB2312")
    getHTTPPage = BytesToBstr(Http.responseBody, "Shift-JIS")
    Set Http = Nothing
    If Err.Number <> 0 Then
        Err.Clear
    End If
End Function

' Decode a byte array into a string using the charset named in Cset.
Function BytesToBstr(body, Cset)
    Dim objstream
    Set objstream = Server.CreateObject("adodb.stream")
    objstream.Type = 1      ' adTypeBinary
    objstream.Mode = 3      ' adModeReadWrite
    objstream.Open
    objstream.Write body
    objstream.Position = 0
    objstream.Type = 2      ' adTypeText
    objstream.Charset = Cset
    BytesToBstr = objstream.ReadText
    objstream.Close
    Set objstream = Nothing
End Function

' Return the position of strng in wstr (case-insensitive),
' or Len(wstr) when strng is not found.
Function Newstring(wstr, strng)
    Newstring = InStr(LCase(wstr), LCase(strng))
    If Newstring <= 0 Then Newstring = Len(wstr)
End Function
%>
<title></title>
</head>
<body>
<%
Dim Url, Html, start, over, bodytext
Url = "http://www.whois.sc/221.216.169.120"
Html = getHTTPPage(Url)
' Cut out the text from the opening table tag up to the closing markers.
start = Newstring(Html, "<table width=700 border=")
over = Newstring(Html, "</td></tr></table></td></tr></table>")
' Guard against an empty page or markers in the wrong order,
' either of which would make Mid() raise an error.
If start > 0 And over > start Then
    bodytext = Mid(Html, start, over - start)
End If
Response.Write bodytext
%>
</body>
</html>
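
The extraction step above takes the text between a start marker and an end marker. A minimal Python sketch of that technique follows. `extract_between` is a hypothetical helper, and it differs from the ASP version in two ways that are choices made for this sketch: it returns `None` when either marker is missing instead of falling back to the end of the page, and it keeps the end marker in the result.

```python
import re


def extract_between(html, start_marker, end_marker):
    """Return the text from start_marker through the first end_marker after it.

    Matching is case-insensitive, like the InStr(LCase(...)) calls above.
    Returns None when the markers are not found in that order.
    """
    pattern = re.escape(start_marker) + r".*?" + re.escape(end_marker)
    match = re.search(pattern, html, flags=re.IGNORECASE | re.DOTALL)
    return match.group(0) if match else None


if __name__ == "__main__":
    # A made-up page fragment, not the real markup of the site.
    page = ("<p>x</p><TABLE width=700 border=0><tr><td>221.216.169.120"
            "</td></tr></table><p>y</p>")
    print(extract_between(page, "<table width=700 border=", "</table>"))
```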

To HHH3000(蓝色爱琴海 阿信fans 001号):
I tried it, and it does not work!
HHH3000 2005-09-22
Try this function:

' Fetch a page through CDO.Message and return its HTML.
' strPageUrl: the URL to fetch
Function getPageFromUrl(strPageUrl)
    Dim strStrem, objGetPage
    Set objGetPage = Server.CreateObject("CDO.Message")
    ' 31 = cdoSuppressAll: skip images, frames, stylesheets and other linked resources
    objGetPage.CreateMHTMLBody strPageUrl, 31
    strStrem = objGetPage.HTMLBody
    Set objGetPage = Nothing
    getPageFromUrl = strStrem
End Function
tigerwen01 2005-09-22
There is a component called AspHttp that can do this, but it is a commercial product you have to pay for. http://www.newasp.net/Article/asp/other/2005/2005090918375.html

Newbie bro, you posted that without even testing it. It does not work. Still waiting for an expert to show up!
victor888 2005-09-22
None of the answers so far is correct. I am looking into it too and will post the solution when I find it.
fantiny 2005-09-22
<%@LANGUAGE="VBSCRIPT" CODEPAGE="CP_ACP"%>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Shift-JIS">
<SCRIPT LANGUAGE="JavaScript">
<!--
// Show the visible text of the page in an alert box (IE-only TextRange API).
function SelText(){
    var oRangeRef = document.body.createTextRange();
    alert(oRangeRef.text);
}
//-->
</SCRIPT>
<%
' Fetch url and return the response decoded as Shift-JIS text.
Function getHTTPPage(url)
    Dim Http
    Set Http = Server.CreateObject("MSXML2.XMLHTTP")
    'Http.open "GET", url, False
    Http.open "POST", url, False
    Http.send
    If Http.readyState <> 4 Then
        Exit Function
    End If
    'getHTTPPage = BytesToBstr(Http.responseBody, "GB2312")
    getHTTPPage = BytesToBstr(Http.responseBody, "Shift-JIS")
    Set Http = Nothing
    If Err.Number <> 0 Then
        Err.Clear
    End If
End Function

' Decode a byte array into a string using the charset named in Cset.
Function BytesToBstr(body, Cset)
    Dim objstream
    Set objstream = Server.CreateObject("adodb.stream")
    objstream.Type = 1      ' adTypeBinary
    objstream.Mode = 3      ' adModeReadWrite
    objstream.Open
    objstream.Write body
    objstream.Position = 0
    objstream.Type = 2      ' adTypeText
    objstream.Charset = Cset
    BytesToBstr = objstream.ReadText
    objstream.Close
    Set objstream = Nothing
End Function
%>

<title></title>
</head>

<body onload="SelText()">
<%
Dim Url, Html
Url = "http://www.whois.sc/221.216.169.120"
Html = getHTTPPage(Url)
Response.Write Html
%>
</body>
</html>
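
`BytesToBstr` decodes the raw response bytes with a fixed charset, here Shift-JIS. A hedged Python sketch of the same decoding step follows. `decode_body` is a hypothetical helper, and the fallback to UTF-8 with replacement characters is an assumption made for illustration, not something the code in this thread does.

```python
def decode_body(body, charset="shift_jis", fallback="utf-8"):
    """Decode raw response bytes, falling back instead of raising (hypothetical helper)."""
    try:
        return body.decode(charset)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers a charset name Python does not know.
        return body.decode(fallback, errors="replace")


if __name__ == "__main__":
    raw = "whois テスト".encode("shift_jis")
    print(decode_body(raw))
    print(decode_body(b"plain ascii", charset="no-such-charset"))
```

Passing the charset the server actually declares, rather than a hard-coded one, is usually the safer choice when the page is not known to be Japanese.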

Brother Tiger, the method above does not work. Could you try it yourself?
tigerwen01 2005-09-22
http://www.zahui.com/html/4/8351.htm