How do I extract data from HTML code with a C# regex?

Jason_guo 2010-04-25 03:00:03

<a href="http://asdf.com/n/aHVheHk=/front.htm" target="_blank">
huaxy</a>

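As shown above, I want to extract the name huaxy. Please give the complete expression, and show how to loop over a string and extract multiple names. Thanks, everyone.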


4 replies


sohighthesky 2010-04-25

@"(?is)<a(?:(?!href=)[^>])*href=(['""]?)(?<href>[^>'""\s]+)\1[^>]*>(?<html>(?:(?!</a>).)*)</a>"
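
A minimal sketch of how this pattern might be used to loop over several names. It assumes the start of the pattern was cut off in the post and that it is meant to begin with `(?is)`. The class name `AnchorTextDemo`, the method `ExtractNames`, and the second anchor in the sample input are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AnchorTextDemo
{
    // Assumed form of the pattern above, starting with the inline options (?is):
    // i = case-insensitive, s = "." also matches line breaks.
    private static readonly Regex AnchorRegex = new Regex(
        @"(?is)<a(?:(?!href=)[^>])*href=(['""]?)(?<href>[^>'""\s]+)\1[^>]*>(?<html>(?:(?!</a>).)*)</a>");

    // Returns the trimmed inner text of every <a ...>...</a> found in the input.
    public static List<string> ExtractNames(string html)
    {
        var names = new List<string>();
        foreach (Match m in AnchorRegex.Matches(html))
        {
            // Trim() drops the line break that precedes "huaxy" in the question's sample.
            names.Add(m.Groups["html"].Value.Trim());
        }
        return names;
    }

    public static void Main()
    {
        // The first anchor is the sample from the question; the second is hypothetical.
        string html =
            "<a href=\"http://asdf.com/n/aHVheHk=/front.htm\" target=\"_blank\">\nhuaxy</a>" +
            "<a target='_blank' href='http://example.com/other.htm'>other</a>";

        foreach (string name in ExtractNames(html))
        {
            Console.WriteLine(name);
        }
    }
}
```

The link itself is available as `m.Groups["href"].Value` if it is needed as well. The `\1` backreference makes the closing quote match whatever opened the attribute value (double quote, single quote, or nothing).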

Jason_guo 2010-04-25

Parsing "a[\s]+href=(? <Link>[^\s>]+)[^>]*>(? <Text>[^ <]*) </a>" - Unrecognized grouping construct.
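
The error is consistent with the space between `(?` and `<Link>` in the quoted pattern: a .NET named group is written `(?<Link>...)` with nothing between `(?` and `<`. The spaces may simply have crept in when the pattern was pasted into the forum. A small sketch that checks both spellings, where the class name `GroupSyntaxCheck` and the helper `IsValidPattern` are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupSyntaxCheck
{
    // Returns true if the .NET regex parser accepts the pattern.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Regex parse errors are reported as ArgumentException (or a subclass).
            return false;
        }
    }

    public static void Main()
    {
        // Pattern as quoted in the error message, with spaces after "(?".
        string withSpaces = @"a[\s]+href=(? <Link>[^\s>]+)[^>]*>(? <Text>[^ <]*) </a>";
        // Same pattern with the stray spaces removed.
        string withoutSpaces = @"a[\s]+href=(?<Link>[^\s>]+)[^>]*>(?<Text>[^<]*)</a>";

        Console.WriteLine(IsValidPattern(withSpaces));
        Console.WriteLine(IsValidPattern(withoutSpaces));
    }
}
```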

-过客- 2010-04-25

try...

string test = "<a href=\"http://asdf.com/n/aHVheHk=/front.htm\" target=\"_blank\">huaxy</a>";
Regex reg = new Regex(@"(?is)<a\s+href=""(?<url>[^""]*)""[^>]*>(?<text>(?:(?!</?a\b).)*)</a>");
MatchCollection mc = reg.Matches(test);
foreach (Match m in mc)
{
    //richTextBox2.Text += m.Groups["url"].Value + "\n";      // link
    richTextBox2.Text += m.Groups["text"].Value + "\n";     // text
}

wuyq11 2010-04-25

string strPattern = @"a[\s]+href=(?<Link>[^\s>]+)[^>]*>(?<Text>[^<]*)</a>";
MatchCollection Matches = Regex.Matches(str, strPattern, RegexOptions.IgnoreCase | RegexOptions.Compiled);
foreach (Match mc in Matches)
{
    // Link keeps any quotes around the URL, because [^\s>]+ does not exclude them.
    Response.Write(mc.Groups["Link"].Value.Trim());
    Response.Write(mc.Groups["Text"].Value.Trim());
}
Or: Regex.Matches(str, @"(?i)(?<=href=(['""]?))[^""'\s>]+(?=\1[^>]*>)");