How to extract the link from the following code with a regular expression

oZhangZhiCheng 2018-02-06 04:52:37

<a data-ng-bind-html="doc.headline" data-ng-href="https://www.washingtonpost.com/business/on-small-business/hondajet-sees-china-southeast-asia-demand-spurred-by-efficiency/2018/02/06/1a70e2a4-0b03-11e8-998c-96deb18cca19_story.html" target="_self" data-ng-mousedown="vm.sendData(this)" class="ng-binding" href="https://www.washingtonpost.com/business/on-small-business/hondajet-sees-china-southeast-asia-demand-spurred-by-efficiency/2018/02/06/1a70e2a4-0b03-11e8-998c-96deb18cca19_story.html">HondaJet Sees <strong>China</strong>, Southeast Asia Demand Spurred by Efficiency</a>

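I want to get the hyperlink that follows href. How should the regular expression be written? I have tried many times: the pattern works on the regex101 site, but when I run it in PyCharm it returns no results.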
oyljerry 2018-02-06
Just use lxml to read the value of that attribute directly from the node.
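The parse-then-read-the-attribute idea can be sketched as follows. This is only an illustration: it uses the standard library's html.parser so that it runs without installing anything, whereas with lxml the equivalent would be roughly lxml.html.fromstring(html).xpath('//a/@href'). The class name HrefCollector and the shortened example.com markup are assumptions made for the sketch, not part of the original post.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Shortened stand-in for the markup in the question (example.com is a placeholder).
html = (
    '<a data-ng-href="https://example.com/ng" target="_self" '
    'class="ng-binding" href="https://example.com/story.html">'
    'HondaJet Sees <strong>China</strong></a>'
)

collector = HrefCollector()
collector.feed(html)
collector.close()
print(collector.hrefs)
```

Because the parser compares whole attribute names, data-ng-href is never confused with href, which is the main advantage over a hand-written pattern.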
陈年椰子 2018-02-06
# coding=UTF-8
import re
html = '''
<a data-ng-bind-html="doc.headline" data-ng-href="https://www.washingtonpost.com/business/on-small-business/hondajet-sees-china-southeast-asia-demand-spurred-by-efficiency/2018/02/06/1a70e2a4-0b03-11e8-998c-96deb18cca19_story.html" target="_self" data-ng-mousedown="vm.sendData(this)" class="ng-binding" href="https://www.washingtonpost.com/business/on-small-business/hondajet-sees-china-southeast-asia-demand-spurred-by-efficiency/2018/02/06/1a70e2a4-0b03-11e8-998c-96deb18cca19_story.html">HondaJet Sees <strong>China</strong>, Southeast Asia Demand Spurred by Efficiency</a>
'''

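# The leading space keeps data-ng-href from matching; only the plain href attribute is captured.
pattern = re.compile(r' href="(.*?)"', re.S)

# findall returns the contents of the capture group for every match.
items = pattern.findall(html)
for item in items:
    print(item)  # the print() form runs on both Python 2 and Python 3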
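If the markup might put a newline or tab before href, or use single quotes, a slightly more tolerant variant of the same pattern could look like the sketch below. The names HREF_RE and extract_hrefs and the example.com sample are hypothetical, introduced only for illustration. This is still a regex heuristic, not a full HTML parser.

```python
import re

# \s requires whitespace right before "href", so data-ng-href does not match.
# (["']) captures the opening quote and \1 requires the same closing quote.
HREF_RE = re.compile(r'''\shref\s*=\s*(["'])(.*?)\1''', re.S | re.I)


def extract_hrefs(html):
    """Return the value of every plain href attribute found in html."""
    return [m.group(2) for m in HREF_RE.finditer(html)]


# Placeholder markup: href sits on its own line and uses single quotes.
sample = (
    '<a data-ng-href="https://example.com/ng"\n'
    "   href='https://example.com/story.html'>text</a>"
)
print(extract_hrefs(sample))
```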
