Can anyone get past the anti-scraping restrictions on Baidu web search or Baidu Video?

xhunanpp 2007-08-08 06:05:00

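Web scraping is usually done with MSXML2 or HttpWebRequest,
but that does not work for Baidu web search or Baidu Video. Baidu's static HTML pages can be scraped, but its dynamic pages cannot. For example, a scraper cannot fetch http://video.baidu.com/v?word=%C3%D8%C3%DC&ct=301989888&rn=20&pn=0&db=0&s=0 .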

Here is my scraping code:

// url and charset are supplied by the enclosing method (not shown in the post).
System.Net.HttpWebRequest webRequest = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(url);
string content;
// The using blocks release the response, stream, and reader even if ReadToEnd throws.
using (System.Net.HttpWebResponse webResponse = (System.Net.HttpWebResponse)webRequest.GetResponse())
using (System.IO.Stream stream = webResponse.GetResponseStream())
using (System.IO.StreamReader streamReader = new System.IO.StreamReader(stream, System.Text.Encoding.GetEncoding(charset)))
{
    content = streamReader.ReadToEnd();
}

return content;

But it cannot scrape Baidu Video.

Does anyone have a solution for this?
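
The snippet above relies on a `charset` value whose origin the post does not show. As a rough sketch only, one way to obtain it is to look for a `charset=` token in the response's Content-Type header or in the page's `<meta>` tag. The class name `CharsetSniffer`, the method `FindCharsetName`, and the fallback argument are hypothetical names introduced for illustration; they are not from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharsetSniffer
{
    // Matches charset=NAME in a Content-Type header value or an HTML <meta> tag,
    // with or without quotes around the name.
    private static readonly Regex CharsetPattern =
        new Regex(@"charset\s*=\s*[""']?\s*([A-Za-z0-9_\-]+)", RegexOptions.IgnoreCase);

    // Returns the first charset name found in text, or fallback when none is present.
    // (Hypothetical helper; the post does not say how charset was obtained.)
    public static string FindCharsetName(string text, string fallback)
    {
        if (string.IsNullOrEmpty(text))
        {
            return fallback;
        }
        Match match = CharsetPattern.Match(text);
        return match.Success ? match.Groups[1].Value : fallback;
    }
}
```

With this helper, the value could be taken from the response header, for example `string charset = CharsetSniffer.FindCharsetName(webResponse.ContentType, "gb2312");`, where the `"gb2312"` fallback is itself an assumption about the target site.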


qqalex 2012-03-14

Anti-scraping measures really are a pain. They can block your IP at any time.

BaiJiaZi 2007-08-13

rfdg

milo4210 2007-08-10

Haha, no need to thank me. Glad the problem is solved.

xhunanpp 2007-08-10

Many thanks to milo4210 (米罗) for the help. It turned out to be a minor problem, and it is solved now, haha.

Thanks, everyone.

levin9 2007-08-09

woaimary 2007-08-09

wuhuabucai 2007-08-09

Baidu just got scorned.

wuhuabucai 2007-08-09

..............

timess 2007-08-09

I'm no help on this one, but bumping.

xwk789xwk 2007-08-09

Here to learn.
up

BaiJiaZi 2007-08-09

greenery 2007-08-09

I'll give it a try too.

windily 2007-08-09

zhoucaifu 2007-08-09

I don't know this one, but bumping.

qi_ting 2007-08-09

This problem is fairly hard to solve, so I'll just bump it!

tl0352118 2007-08-09

Learning.

fanruinet 2007-08-09

No problem here, it works for me:
<html><head>
<meta http-equiv="content-type" content="text/html;charset=gb2312">
<title>百度视频搜索_秘密 </title>
<link href="/bdVideo.css" rel="stylesheet" type="text/css"/>
<script src="/bdVideo.js" language="javascript"></script>

Code:
// Requires: using System; using System.IO; using System.Net; using System.Text;
string localDir = "D:\\";

if (!Directory.Exists(localDir))
{
    Directory.CreateDirectory(localDir);
    Console.WriteLine("Directory {0} Created!!!", localDir);
}

//Uncomment the next 2 lines to save cookies from the site
//CookieContainer cookieContainer = new CookieContainer();
//Uri bbsRoot = new Uri("your website");

string url = "http://video.baidu.com/v?word=%C3%D8%C3%DC&ct=301989888&rn=20&pn=0&db=0&s=0";
Uri target = new Uri(url);
// The page declares charset=gb2312 in its <meta> tag, so read and write with that encoding.
Encoding pageEncoding = Encoding.GetEncoding("gb2312");
string outputPath = Path.Combine(localDir, "Page.htm");
try
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(target);
    request.AllowAutoRedirect = false;
    request.Accept = "*/*";
    request.Headers.Add("Accept-Language", "zh-cn");
    // Sends Accept-Encoding: gzip, deflate and decompresses the response transparently.
    request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
    // Browser-like User-Agent string.
    request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)";

    string content = null;
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader sr = new StreamReader(response.GetResponseStream(), pageEncoding, true))
    {
        content = sr.ReadToEnd();
    }

    // The using block flushes and closes the file even if Write throws.
    using (StreamWriter sw = new StreamWriter(outputPath, false, pageEncoding))
    {
        sw.Write(content);
    }
    Console.WriteLine("oh ye~~ : {0} download!!!", outputPath);
}
catch (WebException e)
{
    Console.WriteLine(e.Message);
}
catch (IOException e)
{
    Console.WriteLine(e.Message);
}
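
The `word` parameter in the URL above is percent-encoded. Judging from the page title in the response (百度视频搜索_秘密), it appears to be the search keyword encoded as GB2312 bytes rather than UTF-8, although the thread does not state this. A minimal sketch of producing such a value for other keywords follows; `QueryEncoder` and `PercentEncode` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text;

public static class QueryEncoder
{
    // Percent-encodes a byte sequence for use in a URL query value:
    // unreserved ASCII characters pass through, every other byte becomes %XX.
    // (Hypothetical helper, not part of the original post.)
    public static string PercentEncode(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.Length * 3);
        foreach (byte b in bytes)
        {
            char c = (char)b;
            bool unreserved =
                (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                (c >= '0' && c <= '9') ||
                c == '-' || c == '_' || c == '.' || c == '~';
            if (unreserved)
            {
                sb.Append(c);
            }
            else
            {
                sb.Append('%').Append(b.ToString("X2"));
            }
        }
        return sb.ToString();
    }
}
```

A keyword would then be encoded with something like `QueryEncoder.PercentEncode(Encoding.GetEncoding("gb2312").GetBytes(keyword))`. On .NET Framework the gb2312 encoding is available out of the box; on newer .NET runtimes it typically has to be enabled first by registering `CodePagesEncodingProvider.Instance` with `Encoding.RegisterProvider`.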

gui0605 2007-08-09

I don't know about this one...

xhunanpp 2007-08-09

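Has anyone done this with the axwebbrowser control? The axwebbrowser control can open the Baidu video search page,
but when it is called from ASP.NET it is just an OBJECT, so there is no way to get the HTML content.

In WinForms I can get the content, but how would ASP.NET read data from WinForms?

I tried a WebService as well, and it could not get the axwebbrowser script content either.

Could an expert help solve this?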

goodluckalong 2007-08-09