A big regex matching problem

whlib 2011-06-08 11:32:31
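Following up on my earlier small regex question:

I need to match the contents of every [...] in a long string and replace them with an empty string, but only the [...] groups that lie inside a substring that starts with C1 and ends with RP.

Substrings that start with C1 and end with RP occur multiple times.

For example: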

C1 [Pillai, Vijayamohanan K.] CECRI Chennai Ctr, Taramanii 600113, India.[Kannan, Ramaiyan; Kagalwala, Husain N.; Chaudhari, Harshal D.; Kharul, Ulhas K.; Kurungot, Sreekumar] Natl Chem Lab, Div Phys Chem, Pune 411008, Maharashtra, India. RP 863 Program [2009AA0-SZ423]; National Science Foundation of China PBI membrane (530 mW cm(-2)).C1 [Suryani; Chang, Chia-Ming; Liu, Ying-Ling] Chung Yuan Christian Univ, Dept Chem Engn, Tao Yuan 32023, Taiwan.[Suryani; Chang, Chia-Ming; Liu, Ying-Ling] Chung Yuan Christian Univ, R&D Ctr Membrane Technol, Tao Yuan 32023, Taiwan.[Lee, Young Moo] Hanyang Univ, WCU Dept Energy Engn, Coll Engn, Seoul 133791, South Korea. RP

should become:
C1 CECRI Chennai Ctr, Taramanii 600113, India. Natl Chem Lab, Div Phys Chem, Pune 411008, Maharashtra, India. RP 863 Program [2009AA0-SZ423]; National Science Foundation of China PBI membrane (530 mW cm(-2)).C1 Chung Yuan Christian Univ, Dept Chem Engn, Tao Yuan 32023, Taiwan. Chung Yuan Christian Univ, R&D Ctr Membrane Technol, Tao Yuan 32023, Taiwan. Hanyang Univ, WCU Dept Energy Engn, Coll Engn, Seoul 133791, South Korea. RP

This is my first attempt. It does not work well: between C1 and RP it only matches the contents of one [...].


<%
Dim txtStr,oStr
oStr = "C1 [Pillai, Vijayamohanan K.] CECRI Chennai Ctr, Taramanii 600113, India.[Kannan, Ramaiyan; Kagalwala, Husain N.; Chaudhari, Harshal D.; Kharul, Ulhas K.; Kurungot, Sreekumar] Natl Chem Lab, Div Phys Chem, Pune 411008, Maharashtra, India. RP"

txtStr = filterStr(oStr)
Response.Write "transform str:<br />"& oStr & "<br />"
Response.Write "transformed str:<br /><font color=red>"& txtStr &"</font>"
Response.End

Function filterStr(txt)
Set re = New regExp
re.pattern = "(^C1\s)(?:\[.*?\])+(.*?)(\sRP$)"

re.global = True
re.IgnoreCase = True
re.MultiLine = True
filterStr = re.Replace(txt,"$1$2$3")
End Function
%>



Any advice would be appreciated.
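For reference, the symptom can be reproduced outside ASP. The following is a minimal sketch in Python rather than VBScript; the shortened sample string and the name `first_attempt` are made up for illustration, and `\1\2\3` is Python's spelling of `$1$2$3`. The likely cause is that `(?:\[.*?\])+` can only repeat over bracket groups that directly follow one another, so a second group that comes after address text ends up inside `(.*?)` and is written back unchanged.

```python
import re

# Same pattern as in the ASP code above; re.M and re.I mirror MultiLine and IgnoreCase.
PATTERN = re.compile(r"(^C1\s)(?:\[.*?\])+(.*?)(\sRP$)", re.M | re.I)


def first_attempt(text):
    """Apply the original replacement: keep groups 1, 2 and 3, drop everything else."""
    return PATTERN.sub(r"\1\2\3", text)


if __name__ == "__main__":
    sample = "C1 [A] addr1.[B] addr2. RP"  # hypothetical shortened input
    print(first_attempt(sample))
```

On this sample only `[A]` is removed; `[B]` survives because it is captured as part of group 2.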
whlib 2011-06-15
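hookee's version is very close to what I ended up writing myself by splitting the work into two matching passes, although it could still be optimized a little.
aspwebchh's version also works, but its result differs somewhat from the complete result I need.

Thanks, everyone. Closing the thread.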
vvjblimxz 2011-06-09
Just passing by to learn.
whlib 2011-06-09
Thanks, everyone. I'll study these carefully and then close the thread.
挨踢直男 2011-06-08
\[.[^\]]+\](?=(.(?!C1))+RP)

Sorry, the regex in my previous reply probably has a problem. See whether this one works.
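A single-pattern variant of this lookahead idea can be tried in Python as well. The sketch below is a rewrite under stated assumptions, not the regex posted above: it uses `[^\]]*` so that one-character groups such as `[E]` are also matched, it looks for `C1` and `RP` together with their adjacent whitespace, and the name `strip_with_lookahead` is hypothetical. A bracket group is removed only when an RP can be reached from it without passing a new C1.

```python
import re

# Remove a [...] group only if " RP" follows before the next "C1 " starts.
# (?:(?!C1\s).)*? walks forward one character at a time and refuses to step onto "C1 ".
LOOKAHEAD = re.compile(r"\[[^\]]*\](?=(?:(?!C1\s).)*?\sRP)", re.S)


def strip_with_lookahead(text):
    """Delete bracket groups that are followed by RP with no C1 in between."""
    return LOOKAHEAD.sub("", text)


if __name__ == "__main__":
    # Hypothetical shortened input with two C1 ... RP segments.
    sample = ("C1 [A, B] Univ X, City.[C; D] Lab Y, Town. RP Grant "
              "[2009AA0-SZ423]; note.C1 [E] Inst Z. RP")
    print(strip_with_lookahead(sample))
```

The lookahead only checks what follows a bracket, not whether a C1 precedes it, so a group that sits between two RP markers with no C1 in between is removed too. That may be one reason a single pattern does not give exactly the result asked for in the question.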
挨踢直男 2011-06-08
Dim txtStr,oStr
oStr = "C1 [Pillai, Vijayamohanan K.] CECRI Chennai Ctr, Taramanii 600113, India.[Kannan, Ramaiyan; Kagalwala, Husain N.; Chaudhari, Harshal D.; Kharul, Ulhas K.; Kurungot, Sreekumar] Natl Chem Lab, Div Phys Chem, Pune 411008, Maharashtra, India. RP 863 Program [2009AA0-SZ423]; National Science Foundation of China PBI membrane (530 mW cm(-2)).C1 [Suryani; Chang, Chia-Ming; Liu, Ying-Ling] Chung Yuan Christian Univ, Dept Chem Engn, Tao Yuan 32023, Taiwan.[Suryani; Chang, Chia-Ming; Liu, Ying-Ling] Chung Yuan Christian Univ, R&D Ctr Membrane Technol, Tao Yuan 32023, Taiwan.[Lee, Young Moo] Hanyang Univ, WCU Dept Energy Engn, Coll Engn, Seoul 133791, South Korea. RP"

txtStr = filterStr(oStr)
Response.Write "transform str:<br />"& oStr & "<br />"
Response.Write "transformed str:<br /><font color=red>"& txtStr &"</font>"
Response.End

Function filterStr(txt)
Set re = New regExp
re.pattern = "\[.[^\]]+\](?![^\[\]]+C1)"

re.global = True
re.IgnoreCase = True
re.MultiLine = True
filterStr = re.Replace(txt,"")
End Function

See whether this meets your requirements.
hookee 2011-06-08

<%
Dim txtStr,oStr
oStr = "C1 [Pillai, Vijayamohanan K.] CECRI Chennai Ctr, Taramanii 600113, India.[Kannan, Ramaiyan; Kagalwala, Husain N.; Chaudhari, Harshal D.; Kharul, Ulhas K.; Kurungot, Sreekumar] Natl Chem Lab, Div Phys Chem, Pune 411008, Maharashtra, India. RP 863 Program [2009AA0-SZ423]; National Science Foundation of China PBI membrane (530 mW cm(-2)).C1 [Suryani; Chang, Chia-Ming; Liu, Ying-Ling] Chung Yuan Christian Univ, Dept Chem Engn, Tao Yuan 32023, Taiwan.[Suryani; Chang, Chia-Ming; Liu, Ying-Ling] Chung Yuan Christian Univ, R&D Ctr Membrane Technol, Tao Yuan 32023, Taiwan.[Lee, Young Moo] Hanyang Univ, WCU Dept Energy Engn, Coll Engn, Seoul 133791, South Korea. RP"

txtStr = filterStr(oStr)
Response.Write "transform str:<br />"& oStr & "<br />"
Response.Write "transformed str:<br /><font color=red>"& txtStr &"</font>"
Response.End

Function filterStr(ByVal txt)
    Dim re, re1, col, m
    ' Pass 1: find every segment that starts with "C1 " and ends with " RP".
    Set re = New RegExp
    re.Pattern = "C1\s([\s\S]+?)\sRP"
    re.Global = True
    re.IgnoreCase = False
    ' Pass 2: strip every [...] group; built once and reused for each segment.
    Set re1 = New RegExp
    re1.Pattern = "\[[^\]]+\]"
    re1.Global = True
    re1.IgnoreCase = False
    Set col = re.Execute(txt)
    For Each m In col
        ' Swap the original segment for its bracket-free version.
        txt = Replace(txt, m.Value, re1.Replace(m.Value, ""))
    Next
    Set re1 = Nothing
    Set re = Nothing
    filterStr = txt
End Function
%>
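The two-pass idea above (find each C1 ... RP segment first, then strip the brackets only inside it) can also be sketched in Python. This is an illustration under assumptions, not a port of the ASP code: the function name `strip_brackets_in_segments` and the short sample are hypothetical, and the bracket pattern additionally swallows whitespace in front of a bracket so that the spacing matches the expected output in the question.

```python
import re

# Pass 1: a segment runs from "C1" plus whitespace to the nearest whitespace plus "RP".
SEGMENT = re.compile(r"C1\s.+?\sRP", re.S)
# Pass 2: a bracket group, together with any whitespace directly in front of it.
BRACKET = re.compile(r"\s*\[[^\]]*\]")


def strip_brackets_in_segments(text):
    """Remove [...] groups, but only inside C1 ... RP segments."""
    # The callback receives each segment and returns it with its brackets removed.
    return SEGMENT.sub(lambda m: BRACKET.sub("", m.group(0)), text)


if __name__ == "__main__":
    # Hypothetical shortened input; the bracket after the first RP must survive.
    sample = ("C1 [A, B] Univ X, City.[C; D] Lab Y, Town. RP Grant "
              "[2009AA0-SZ423]; note.C1 [E] Inst Z. RP")
    print(strip_brackets_in_segments(sample))
```

The replacement callback does the search and the rewrite in a single pass over the input, and text outside the segments, such as `[2009AA0-SZ423]`, is never handed to the bracket pattern.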
whlib 2011-06-08
[Quote=Quoting fengyun817's reply on floor 2:]
I'm not familiar with ASP, but give this a try:

VBScript code


Function filterStr(txt)
Set re = New regExp
re.pattern = "(C1\s+)(([^\[]*?)\[.*?\]([^\]]*?))+(.*?\s+RP)"

re.global = True
re.IgnoreCase = Tru……
[/Quote]


There is still a problem: it only matched the string after the second [] of the second C1.

The result is:
C1 Chung Yuan Christian Univ, R&D Ctr Membrane Technol, Tao Yuan 32023, Taiwan. Hanyang Univ, WCU Dept Energy Engn, Coll Engn, Seoul 133791, South Korea. RP
fengyun817 2011-06-08
I'm not familiar with ASP, but give this a try:


Function filterStr(txt)
Set re = New regExp
re.pattern = "(C1\s+)(([^\[]*?)\[.*?\]([^\]]*?))+(.*?\s+RP)"

re.global = True
re.IgnoreCase = True
re.MultiLine = True
filterStr = re.Replace(txt,"$1$3$4$5")
End Function
whlib 2011-06-08
Bump.
