Urgent help needed: using regular expressions to extract links and replace content

tangzhong 2011-07-25 04:46:56
The code snippet is as follows:
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/51062169/home"> 明明 </a>
<time datetime="1年前">1年前</time>
<a href="javascript:;" id="2_link_reply_1268192440_1" class="link_reply" onclick="QOM.FP.showCommentBox('1268192440_1',2);return false;">回复</a>
<a href="javascript:;" id="2_link_hide_1268192440_1" class="link_reply none" onclick="QOM.FP.hideCommentBox();return false;">收起回复</a>
</div>
<div class="mudule_comment_detail"> 希望能增加留言和相册备份 </div>

<ol class="sub_comment " id="2_sub_comment_1268192440_1">

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo2.store.qq.com/qzone/51062169/51062169/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/51062169/home"> 明明 </a>
<time datetime="1年前">回复 1年前</time>
</div>
<div class="mudule_comment_detail"> 还有说说等
</div>
</div>
</li>

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo1.store.qq.com/qzone/405797768/405797768/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/405797768/home"> 非狐 </a>
<time datetime="1年前">回复 1年前</time>
</div>
<div class="mudule_comment_detail"> 谢谢你的建议,正在开发中,敬请期待<img src="http://qzs.qq.com/qzone/em/e181.gif"/> </div>
</div>
</li>


<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo2.store.qq.com/qzone/59991021/59991021/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/59991021/home"> ♂Jay_小翟♀ </a>
<time datetime="05月01日 02:24">05月01日 02:24</time>
<a href="javascript:;" id="2_link_reply_1268192440_3" class="link_reply" onclick="QOM.FP.showCommentBox('1268192440_3',2);return false;">回复</a>
<a href="javascript:;" id="2_link_hide_1268192440_3" class="link_reply none" onclick="QOM.FP.hideCommentBox();return false;">收起回复</a>
</div>
<div class="mudule_comment_detail"> 远程服务器返回错误:(404)未找到
。。。。
什么情况? </div>

<ol class="sub_comment " id="2_sub_comment_1268192440_3">

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo1.store.qq.com/qzone/405797768/405797768/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/405797768/home"> 非狐 </a>
<time datetime="06月07日 20:22">回复 06月07日 20:22</time>
</div>
<div class="mudule_comment_detail"> 请下载最新版本软件,并在参数里面设置为完全备份即可~ </div>
</div>
</li>

</ol>

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo2.store.qq.com/qzone/59991021/59991021/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/59991021/home"> ♂Jay_小翟♀ </a>
<time datetime="06月07日 21:11">06月07日 21:11</time>
<a href="javascript:;" id="2_link_reply_1268192440_6" class="link_reply" onclick="QOM.FP.showCommentBox('1268192440_6',2);return false;">回复</a>
<a href="javascript:;" id="2_link_hide_1268192440_6" class="link_reply none" onclick="QOM.FP.hideCommentBox();return false;">收起回复</a>
</div>
<div class="mudule_comment_detail"> 新版本?是这个吗?http://dl.dbank.com/c0ttdot198
不行啊 </div>

<ol class="sub_comment " id="2_sub_comment_1268192440_6">

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo1.store.qq.com/qzone/405797768/405797768/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/405797768/home"> 非狐 </a>
<time datetime="06月08日 10:15">回复 06月08日 10:15</time>
</div>
<div class="mudule_comment_detail"> 请到http://www.sbys.org.cn下载最新版本~ </div>
</div>
</li>

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo2.store.qq.com/qzone/59991021/59991021/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/59991021/home"> ♂Jay_小翟♀ </a>
<time datetime="06月08日 10:34">回复 06月08日 10:34</time>
</div>
<div class="mudule_comment_detail"> 这个域名和主机是哪家的?价格多少呀?用着怎样? </div>
</div>
</li>

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo2.store.qq.com/qzone/59991021/59991021/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/59991021/home"> ♂Jay_小翟♀ </a>
<time datetime="06月08日 16:11">回复 06月08日 16:11</time>
</div>
<div class="mudule_comment_detail"> 我表示 我没找到下载的地方 </div>
</div>
</li>

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo1.store.qq.com/qzone/405797768/405797768/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/405797768/home"> 非狐 </a>
<time datetime="06月09日 09:45">回复 06月09日 09:45</time>
</div>
<div class="mudule_comment_detail"> 直接加我的QQ吧~ </div>
</div>
</li>
</ol>


<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo1.store.qq.com/qzone/94028028/94028028/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/94028028/home"> min'er </a>
<time datetime="07月07日 13:40">07月07日 13:40</time>
<a href="javascript:;" id="2_link_reply_1268192440_10" class="link_reply" onclick="QOM.FP.showCommentBox('1268192440_10',2);return false;">回复</a>
<a href="javascript:;" id="2_link_hide_1268192440_10" class="link_reply none" onclick="QOM.FP.hideCommentBox();return false;">收起回复</a>
</div>
<div class="mudule_comment_detail"> <img src="/qzone/em/e106.gif">我觉得,如果备份出来的日志有时间信息会更好 </div>

<ol class="sub_comment " id="2_sub_comment_1268192440_10">

<li class="mod_interact">
<div class="mod_interact_avatar"><span class="avatar_round"></span><img src="http://qlogo1.store.qq.com/qzone/405797768/405797768/30" alt="pic" /></div>
<div class="mod_interact_main">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/405797768/home"> 非狐 </a>
<time datetime="07月08日 11:15">回复 07月08日 11:15</time>
</div>
<div class="mudule_comment_detail"> 好,马上就更新,谢谢你<img src="http://qzs.qq.com/qzone/em/e181.gif"/> </div>
</div>
</li>
</div>
</div>
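I want to extract the name that follows each hyperlink, for example "非狐", along with the content after it (which is the comment itself).
One more thing: I need to replace something like [img]love[/img] with love.gif, and I don't know how to do that. I'd be grateful for any pointers from the experts. Thanks!!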
8 replies
tangzhong 2011-07-26
I solved the second problem myself. The regular expression is "\[em\](?<img>[a-z0-9]*)\[/em\]"; it turns out a capture group is all it takes. Complete code:

' Requires Imports System.Text.RegularExpressions at the top of the file.
Public Function myReplace(ByVal Str As String) As String
    ' Pattern: capture the emoticon name between [em] and [/em] in the named group "img"
    Dim strPattern As String = "\[em\](?<img>[a-z0-9]*)\[/em\]"
    'Dim strPattern As String = "\[em\][a-z0-9]*\[/em\]"

    ' Regex instances are immutable once constructed
    Dim oRegex As New Regex(strPattern, RegexOptions.Multiline)

    ' Walk every match and swap the whole tag for "<name>.gif"
    For Each oMatch As Match In oRegex.Matches(Str)
        Str = Strings.Replace(Str, oMatch.Value, oMatch.Groups("img").Value & ".gif")
    Next

    Return Str
End Function
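For comparison, the same replacement can be done in a single call by referring to the named group in the replacement string. The sketch below is in Python rather than VB.NET, so the named group is written (?P<img>...) instead of (?<img>...); the function name replace_em_tags is only illustrative and does not come from the thread.

```python
import re

# Same pattern as above, using Python's (?P<name>...) syntax for the named group.
EM_TAG = re.compile(r"\[em\](?P<img>[a-z0-9]*)\[/em\]")


def replace_em_tags(text):
    """Replace every [em]name[/em] tag with name.gif in one pass."""
    # \g<img> inserts whatever the named group captured for each match.
    return EM_TAG.sub(r"\g<img>.gif", text)


print(replace_em_tags("thanks [em]e181[/em] and [em]e106[/em]"))
```

In .NET the equivalent replacement string would be "${img}.gif" passed to Regex.Replace, which avoids the explicit loop over the matches.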
tangzhong 2011-07-26
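“That pattern only matches the literal text mudule_comment_cont.
What content after mudule_comment_cont do you want to match?
Dim matchColltions As MatchCollection = Regex.Matches(contentHtml, "(?is)(?<=mudule_comment_cont).*?(?=<div class=""mod_interact_main"">)")”

I want to start matching right after mudule_comment_cont and extract the name and the comment.
Let me simplify it:
<div main>非狐: 非狐's comment
<div li>This is a reply to 非狐's comment</div>

<div 2>Comment 2
<li 2>This is reply 2</div>
In other words, I want to first match each top-level comment from the outside, and then match the replies that belong to it.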
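One possible way to get "top-level comment, then its replies" is sketched below in Python. It does not try to match the nested <ol> blocks, because the sample's nesting is not cleanly closed. Instead it makes an assumption taken from how the sample looks: a reply is an entry whose <time> text starts with 回复. The names ENTRY, group_comments and SAMPLE are illustrative only.

```python
import re

# One entry = author link, <time> text, and the comment body that follows it.
ENTRY = re.compile(
    r'<div class="mudule_comment_cont">\s*'
    r'<a href="[^"]*">\s*(?P<name>.*?)\s*</a>\s*'
    r'<time[^>]*>(?P<when>.*?)</time>'
    r'.*?<div class="mudule_comment_detail">\s*(?P<text>.*?)\s*</div>',
    re.S,  # let "." cross line breaks, since comments may span several lines
)


def group_comments(html):
    """Return a list of top-level comments, each with a list of replies."""
    threads = []
    for m in ENTRY.finditer(html):
        entry = {"name": m["name"], "text": m["text"]}
        # Assumption: replies show "回复" at the start of their <time> text.
        is_reply = m["when"].strip().startswith("回复")
        if is_reply and threads:
            threads[-1]["replies"].append(entry)
        else:
            threads.append({**entry, "replies": []})
    return threads


SAMPLE = """<div class="mudule_comment_cont">
<a href="http://qz.qq.com/94028028/home"> min'er </a>
<time datetime="07月07日 13:40">07月07日 13:40</time>
<a href="javascript:;" class="link_reply">回复</a>
</div>
<div class="mudule_comment_detail"> 我觉得备份出来的日志有时间信息会更好 </div>
<ol class="sub_comment ">
<li class="mod_interact">
<div class="mudule_comment_cont">
<a href="http://qz.qq.com/405797768/home"> 非狐 </a>
<time datetime="07月08日 11:15">回复 07月08日 11:15</time>
</div>
<div class="mudule_comment_detail"> 好 马上就更新 </div>
</li>
</ol>"""

for thread in group_comments(SAMPLE):
    print(thread["name"], "->", [r["name"] for r in thread["replies"]])
```

If the page markup were well formed, an HTML parser would be a sturdier choice than regular expressions for nested structures like this.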
q107770540 2011-07-26

Private Sub TestReg()
Dim strData As String
Dim reg As Object
Dim matchs As Object, match As Object

' strData is the HTML snippet from the original question. It is not repeated here:
' the copy pasted into this reply had every Chinese character garbled into "?".
' A VB6 string literal cannot span several lines, so in real code either join the
' lines with & vbCrLf & _ or read the HTML from a file.
strData = "<div class=""mod_interact_main""> ... </div>"

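Set reg = CreateObject("VBScript.RegExp")
reg.Global = True
reg.IgnoreCase = True
' Keep MultiLine off so that $ only matches at the very end of the string
reg.MultiLine = False
' VBScript.RegExp supports neither inline options such as (?is) nor lookbehind,
' so [\s\S] is used to cross line breaks and a capture group replaces (?<=...).
reg.Pattern = "mudule_comment_cont([\s\S]*?)(?=<div class=""mod_interact_main"">|$)"
Set matchs = reg.Execute(strData)
For Each match In matchs
'Debug.Print match.Value
' SubMatches(0) is the text captured after mudule_comment_cont
Debug.Print match.SubMatches(0)
Next
End Sub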
q107770540 2011-07-26
[Quote=Quoting tangzhong's reply on floor 2:]

That doesn't seem right. May I ask: will Dim matchColltions As MatchCollection = Regex.Matches(contentHtml, "mudule_comment_cont") return everything that comes after "cont"?
[/Quote]
That pattern only matches the literal text mudule_comment_cont.
What content after mudule_comment_cont do you want to match?
Dim matchColltions As MatchCollection = Regex.Matches(contentHtml, "(?is)(?<=mudule_comment_cont).*?(?=<div class=""mod_interact_main"">)")
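The difference between the two patterns can be checked on a tiny input. The sketch below uses Python's re module, which accepts the same (?is), lookbehind and lookahead syntax for this particular pattern; the html string is a shortened, illustrative stand-in for the real page.

```python
import re

# A shortened stand-in for the page: one comment header followed by the next block.
html = (
    '<div class="mudule_comment_cont">\n'
    '<a href="http://qz.qq.com/405797768/home"> 非狐 </a>\n'
    '</div>\n'
    '<div class="mod_interact_main">'
)

# A bare literal pattern returns only the literal itself, once per occurrence.
literal_only = re.findall(r"mudule_comment_cont", html)

# Lookbehind anchors the match just after the marker; the lazy .*? stops at the
# first following <div class="mod_interact_main">, which the lookahead does not consume.
after_marker = re.findall(
    r'(?is)(?<=mudule_comment_cont).*?(?=<div class="mod_interact_main">)', html
)

print(literal_only)
print(after_marker)
```

The second result still contains raw markup, so a further pattern is needed to pull the name and comment out of each chunk.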
tangzhong 2011-07-26
Asking for help once more. I'd really appreciate a pointer or two from an expert.
tangzhong 2011-07-26
Still nobody has answered, so I'll just close the thread.
tangzhong 2011-07-25
That doesn't seem right. May I ask: will Dim matchColltions As MatchCollection = Regex.Matches(contentHtml, "mudule_comment_cont") return everything that comes after "cont"?
鸭梨山大帝 2011-07-25
Extract the name: (?<=<a href="http://qz\.qq\.com/\d+/home">).*?(?=</a>)
Extract the comment: (?s)(?<=<div class="mudule_comment_detail">).*?(?=</div>)
I couldn't find anything like [img]love[/img] in the sample.
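A quick way to try these two patterns is sketched below in Python. Python's re module only allows fixed-width lookbehind, and the name pattern contains \d+, so this sketch swaps the lookarounds for capture groups; that is a substitution made for the example, not what the reply above proposes. The sample string is a trimmed excerpt of the question's HTML.

```python
import re

# Capture groups stand in for the lookbehind/lookahead pair.
NAME = re.compile(r'<a href="http://qz\.qq\.com/\d+/home">\s*(.*?)\s*</a>')
# re.S lets "." cross line breaks, because some comments span several lines.
COMMENT = re.compile(r'<div class="mudule_comment_detail">\s*(.*?)\s*</div>', re.S)

sample = """<a href="http://qz.qq.com/59991021/home"> ♂Jay_小翟♀ </a>
<div class="mudule_comment_detail"> 远程服务器返回错误:(404)未找到
。。。。
什么情况 </div>
<a href="http://qz.qq.com/405797768/home"> 非狐 </a>
<div class="mudule_comment_detail"> 请下载最新版本软件~ </div>"""

names = NAME.findall(sample)
comments = COMMENT.findall(sample)

# Names and comments appear in the same order, so they can be paired up.
for name, comment in zip(names, comments):
    print(name, "|", comment.replace("\n", " "))
```

Pairing by position only works while every name is followed by exactly one comment block, which holds for the sample but is an assumption about the real page.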
