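How can I scrape the table content from this web page?
The URL is: http://hk.job1998.com/cn/company_detail-Company_Id=3539.html
Question: how do I extract the company name, phone, fax, and other details from the page?
I tried the approach below, but it throws an error. Any help would be appreciated!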
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Shift-JIS">
<%
function getHTTPPage(url)
dim Http
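' ServerXMLHTTP is the MSXML variant intended for server-side callers such as ASP
set Http=server.createobject("MSXML2.ServerXMLHTTP")
' Fetching a page is a plain GET; a POST with an empty body may be rejected by the server
Http.open "GET",url,false
Http.send
' Give up unless the request completed with HTTP 200
if Http.readystate<>4 or Http.status<>200 then
set Http=nothing
exit function
end if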
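' Decode with the charset the target page itself declares, not the charset of this page.
' GB2312 is an assumption for a Simplified Chinese page; confirm it against the target's <meta> tag or Content-Type header.
getHTTPPage=bytesToBSTR(Http.responseBody,"GB2312")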
set http=nothing
if err.number<>0 then
err.Clear
end if
end function
Function BytesToBstr(body,Cset)
dim objstream
set objstream = Server.CreateObject("adodb.stream")
objstream.Type = 1
objstream.Mode =3
objstream.Open
objstream.Write body
objstream.Position = 0
objstream.Type = 2
objstream.Charset = Cset
BytesToBstr = objstream.ReadText
objstream.Close
set objstream = nothing
End Function
Function Newstring(wstr,strng)
Newstring=Instr(lcase(wstr),lcase(strng))
if Newstring<=0 then Newstring=Len(wstr)
End Function
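' A minimal sketch, not part of the original attempt: ExtractBetween is a
' hypothetical helper that returns the text between two markers, or "" when
' either marker is missing. It could be reused for each field (name, phone, fax).
Function ExtractBetween(src, startMarker, endMarker)
Dim p1, p2
ExtractBetween = ""
p1 = InStr(1, src, startMarker, vbTextCompare)
If p1 = 0 Then Exit Function
' Skip past the opening marker so it is not part of the result
p1 = p1 + Len(startMarker)
' Look for the closing marker only after the opening one
p2 = InStr(p1, src, endMarker, vbTextCompare)
If p2 = 0 Then Exit Function
ExtractBetween = Trim(Mid(src, p1, p2 - p1))
End Function
' Possible usage. The marker strings below are assumptions about the page layout;
' copy the real ones from the target page's source:
' phone = ExtractBetween(Html, "电话:</td><td colspan=""2"">", "</td>")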
%>
<title></title>
</head>
<body>
<%
Dim Url,Html
Url= "http://hk.job1998.com/cn/company_detail-Company_Id=3539.html"
Html = getHTTPPage(Url)
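Dim marker, startPos, endPos, bodytext
marker = "公司名称:</td><td colspan=""2""><b>"
' Locate the label, then move past it so the label itself is not included
startPos = InStr(1, Html, marker, vbTextCompare)
if startPos > 0 then
startPos = startPos + Len(marker)
' Search for the closing tags only after the label: the first "</b></td>" on the
' page can come before it, which gives Mid a negative length and raises an error
endPos = InStr(startPos, Html, "</b></td>", vbTextCompare)
if endPos > 0 then bodytext = Mid(Html, startPos, endPos - startPos)
end if
if bodytext = "" then bodytext = "Company name not found"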
Response.write bodytext
%>
</body>
</html>