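# -*- coding: utf-8 -*-
import re
# Python 2: decode the UTF-8 byte string into a unicode string
cn = lambda x: x.decode("u8")
# The greedy ".*" runs from the first "(" to the last ")"
a = re.findall(r"\(.*\)", cn("5不去(jmj),了(j大家m你)不"))
for i in a:
    print i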
Output: (jmj),了(j大家m你)
I know this is caused by the regex matching greedily. How should I change the pattern to avoid that
and get the following output instead:
(jmj)
(j大家m你)
How to fix the greedy matching problem in Python regular expressions
*?, +?, ??
The "*", "+", and "?" qualifiers are all greedy; they match as much text as possible. Sometimes this behaviour isn't desired; if the RE <.*> is matched against '<H1>title</H1>', it will match the entire string, and not just '<H1>'. Adding "?" after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using .*? in the previous expression will match only '<H1>'.
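Applied to the string from the question, the non-greedy form might look like the sketch below. It assumes Python 3, where string literals are already Unicode, so the cn helper is not needed. The variable names are illustrative only, and the negated character class is an alternative that is not part of the quoted documentation.

```python
import re

text = "5不去(jmj),了(j大家m你)不"

# Greedy: ".*" extends to the last ")" in the string, giving one long match.
greedy = re.findall(r"\(.*\)", text)

# Non-greedy: ".*?" stops at the first ")" that lets the pattern succeed.
lazy = re.findall(r"\(.*?\)", text)

# Alternative: forbid parentheses inside the group instead of using "?".
negated = re.findall(r"\([^()]*\)", text)

for match in lazy:
    print(match)
```

Both the non-greedy pattern and the negated character class return (jmj) and (j大家m你) as separate matches for this input. They can differ on other inputs, for example when parentheses are nested.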