Newbie question: how do I get all the links on a page (regular expression)?
string html = "<html><a href=first.htm>first tag text</a><a href=next.htm>next tag text</a></html>";
string p = @"<A[^>]*?HREF\s*=\s*[""']?([^'"" >]+?)[ '""]?>";
MatchCollection mc = Regex.Matches(html, p, RegexOptions.IgnoreCase);
IEnumerator ienum = mc.GetEnumerator();
while ( ienum.MoveNext() ) {
    Match m = (Match) ienum.Current;
    CaptureCollection cc = m.Captures;
    for ( int k = 0; k < cc.Count; k++ ) {
        Capture c = cc[k];
        Console.WriteLine( c.ToString() );
    }
}
The only output I get is:
<a href=first.htm>
<a href=next.htm>
The output I want is:
first.htm
next.htm
What should I do?
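One possible approach, offered as a sketch rather than a definitive answer: on a Match object, Captures contains only the match itself (the whole tag), while the text matched by the parenthesized part of the pattern is available as group 1 through m.Groups[1].Value. The class name LinkExtractor and the method name ExtractLinks below are hypothetical, introduced only for illustration; the pattern is the one from the question, unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Same pattern as in the question; the parentheses define group 1,
    // which captures the href value without the surrounding tag.
    private const string Pattern =
        @"<A[^>]*?HREF\s*=\s*[""']?([^'"" >]+?)[ '""]?>";

    // Hypothetical helper: returns the href value of every <a> tag
    // that the pattern matches in the given HTML string.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in Regex.Matches(html, Pattern, RegexOptions.IgnoreCase))
        {
            // m.Value would be the whole tag, e.g. "<a href=first.htm>";
            // m.Groups[1].Value is only the captured href part.
            links.Add(m.Groups[1].Value);
        }
        return links;
    }
}
```

Calling LinkExtractor.ExtractLinks(html) with the html string from the question and writing each element with Console.WriteLine should print first.htm and next.htm. Note that a regular expression like this only handles simple markup; for arbitrary real-world pages an HTML parser is usually more robust.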