How can I use a loop in C# to scrape data from an HTML page? Waiting online for replies

nettt 2015-02-06 04:16:36

<ul>

<li class=59Feb5906590759><div>[<a href="http://PRDWEB/article/List_2.html" target="_blank" title="资材价格">资材价格</a>]<A class=592005Feb0607 href="http://PRDWEB/article/200502/461708.html" title="2005年1月6日PADW1厂消耗曲线（不完整）" target="_blank">2005年1月6日PADW1厂消耗曲线（不完整）</A></div><span><font color='red'>2005-01-06</font></span></li>

<li class=16Feb1606160816><div>[<a href="http://PRDWEB/article/List_2.html" target="_blank" title="资材价格">资材价格</a>]<A class=162005Feb0608 href="http://PRDWEB/article/200502/461743.html" title="2005年1月6日PADW1厂消耗曲线" target="_blank">2005年1月6日PADW1厂消耗曲线</A></div><span><font color='red'>2005-01-06</font></span></li>

<li class=16Feb1606160816><div>[<a href="http://PRDWEB/article/List_2.html" target="_blank" title="资材价格">资材价格</a>]<A class=162005Feb0608 href="http://PRDWEB/article/200502/461742.html" title="2005年1月6日PADW2厂消耗曲线" target="_blank">2005年1月6日PADW2厂消耗曲线</A></div><span><font color='red'>2005-01-06</font></span></li>

<li class=15Feb1506150815><div>[<a href="http://PRDWEB/article/List_2.html" target="_blank" title="资材价格">资材价格</a>]<A class=152005Feb0608 href="http://PRDWEB/article/200502/461741.html" title="2005年1月6日PADW6厂消耗曲线" target="_blank">2005年1月6日PADW6厂消耗曲线</A></div><span><font color='red'>2005-01-06</font></span></li>

<li class=15Feb1506150815><div>[<a href="http://PRDWEB/article/List_2.html" target="_blank" title="资材价格">资材价格</a>]<A class=152005Feb0608 href="http://PRDWEB/article/200502/461740.html" title="2005年1月6日PADW8厂消耗曲线" target="_blank">2005年1月6日PADW8厂消耗曲线</A></div><span><font color='red'>2005-01-06</font></span></li>

</ul>

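This is a static page from a website that I saved many years ago, and I would like to load the data on the page into a database.
I want to extract the href, the title, and the date at the end of each item, ideally in a loop, because this HTML page is really long. For the first item that would be: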
href="http://PRDWEB/article/200502/461708.html"
title="2005年1月6日PADW1厂消耗曲线（不完整）"
2005-01-06
------
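The program I wrote below only extracts one item, and the result is not clean.
I'm stuck, so could an expert please advise? The most important thing is that it extracts every item in a loop and that the data comes out clean.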

private void button1_Click(object sender, EventArgs e)
{
	HTMLReader.url = "http://PRDWEB/Mat.html";
	HTMLReader.MidKey = "";
	HTMLReader.headLength = 0;

	HTMLReader.startKey = "<div>[<a href=\"http://PRDWEB/article/List_2.html\" target=\"_blank\" title=\"资材价格\">资材价格</a>]";
	HTMLReader.endKey = " title=\"2005年";

	string html = HTMLReader.GetHtml();
	string mgsy = HTMLReader.GetValue();

	// \d{6}\D{3}\d{4}
}



public static string GetValue()
{
    try
    {
        string html;
        string HtmlText = strHtml;
        if (HtmlText == null) return null;

        int start, end;
        start = HtmlText.IndexOf(startKey);
        end = HtmlText.IndexOf(endKey);
        html = HtmlText.Substring(start + startKey.Length, end - start - startKey.Length);
        return html;
    }
    catch
    {
        return "";
    }
}
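
One likely reason only a single item comes back is that `IndexOf(startKey)` and `IndexOf(endKey)` always search from the beginning of the string, so `GetValue()` can only ever find the first occurrence. A possible way to loop over every `<li>` and get clean values is `Regex.Matches` with named groups. The sketch below is only an illustration under assumptions: `ArticleLink` and `ArticleListParser` are hypothetical names (not part of `HTMLReader`), and the pattern assumes every item follows the exact layout of the sample above (category link, `]`, article link with `href` before `title`, then the date inside `<font>`).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container for one extracted row (not in the original code).
public class ArticleLink
{
    public string Href { get; set; }
    public string Title { get; set; }
    public string Date { get; set; }
}

// Hypothetical helper; assumes the <li> layout shown in the sample HTML.
public static class ArticleListParser
{
    // Skips the category link (everything up to the first ']'), then captures
    // the article link's href and title, and the yyyy-MM-dd date in <font>.
    private static readonly Regex ItemPattern = new Regex(
        @"<li\b[^>]*>.*?\]\s*<a\b[^>]*?\bhref=""(?<href>[^""]*)""" +
        @"[^>]*?\btitle=""(?<title>[^""]*)""[^>]*>" +
        @".*?<font\b[^>]*>(?<date>\d{4}-\d{2}-\d{2})</font>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static List<ArticleLink> Parse(string html)
    {
        var rows = new List<ArticleLink>();
        if (string.IsNullOrEmpty(html)) return rows;

        // Matches() returns every non-overlapping match, so the loop
        // visits each <li> item in document order.
        foreach (Match m in ItemPattern.Matches(html))
        {
            rows.Add(new ArticleLink
            {
                Href = m.Groups["href"].Value,
                Title = m.Groups["title"].Value,
                Date = m.Groups["date"].Value
            });
        }
        return rows;
    }
}
```

In `button1_Click`, the string returned by `HTMLReader.GetHtml()` could be passed to `ArticleListParser.Parse(html)`, and each returned row written to the database with a parameterized INSERT. If some items deviate from the sample layout, the lazy `.*?` could run across item boundaries, so a real HTML parser would probably be more robust than a regex for messy pages.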