Yet another regular expression question
The HTML content is as follows:
<div id="gml"><div><img src="http://down.sjxyx.com/images/sjxyx7/1/831417152423.gif" width="100" height="75" alt="蜘蛛侠:剧毒之城(Spider-Man Toxic City)高清版" /></div> <a href="http://down.sjxyx.com/sjxyx7/1/832349562122.cab" ......
I use this expression: regex = "<div id=\"gml\"><div><img src=\"(.*?)\" width=\"\\d+\" height=\"\\d+\" alt=\"[\u4e00-\u9fa5\ufe30-\uffa00-9\\-]+\"? /></div>";
Capturing group(1) gives the following: http://down.sjxyx.com/images/sjxyx7/1/831417152423.gif" width="100" height="75" alt="蜘蛛侠:剧毒之城(Spider-Man Toxic City)高清版" /></div> <a href="http://down.sjxyx.com/sjxyx7/1/832349562122.cab.....
However, when I use regex = "<div id=\"gml\"><div><img src=\"(.*?)\".*?alt=\"[\u4e00-\u9fa5\ufe30-\uffa00-9\\-]+\"? /></div>";
group(1) gives the result I expect: http://down.sjxyx.com/images/sjxyx7/1/831417152423.gif
Why does this happen?
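A minimal sketch of what seems to be going on, under some assumptions. `.*?` is reluctant, but `.` still matches `"`, so if the rest of the pattern cannot match at the first img tag (its alt contains ASCII letters, which the character class does not list), the engine keeps extending group(1) until a later tag fits. The input below is hypothetical: two entries with made-up file names a.gif and b.gif stand in for the truncated page, and the CJK text is written as Unicode escapes. The names LazyCaptureDemo, firstMatch and the third pattern BOUNDED are additions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyCaptureDemo {
    // Character class copied from the question: CJK, U+FE30..U+FFA0, digits, hyphen.
    static final String CLS = "[\u4e00-\u9fa5\ufe30-\uffa00-9\\-]";

    // Hypothetical input: the first alt contains ASCII letters, the second is CJK only.
    static final String HTML =
        "<div id=\"gml\"><div><img src=\"a.gif\" width=\"100\" height=\"75\" "
        + "alt=\"\u8718\u86db\u4fa0(Spider-Man)\" /></div>"
        + "<div><img src=\"b.gif\" width=\"100\" height=\"75\" "
        + "alt=\"\u9ad8\u6e05\u7248\" /></div>";

    static final String PREFIX = "<div id=\"gml\"><div><img src=\"";

    // First pattern from the question.
    static final String STRICT = PREFIX
        + "(.*?)\" width=\"\\d+\" height=\"\\d+\" alt=\"" + CLS + "+\"? /></div>";

    // Second pattern from the question.
    static final String LOOSE = PREFIX + "(.*?)\".*?alt=\"" + CLS + "+\"? /></div>";

    // One possible alternative (an assumption, not from the question):
    // [^"]* cannot run past the closing quote of an attribute.
    static final String BOUNDED = PREFIX
        + "([^\"]*)\" width=\"\\d+\" height=\"\\d+\" alt=\"[^\"]*\" /></div>";

    // Returns {group(1), group()} for the first match, or null if nothing matches.
    static String[] firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? new String[] { m.group(1), m.group() } : null;
    }

    public static void main(String[] args) {
        for (String regex : new String[] { STRICT, LOOSE, BOUNDED }) {
            String[] r = firstMatch(regex, HTML);
            System.out.println("group(1) = " + r[0]);
            System.out.println("match length = " + r[1].length() + " of " + HTML.length());
        }
    }
}
```

With this input, the first pattern's group(1) runs from a.gif all the way to b.gif. The second pattern gives a.gif for group(1), but only because the extra `.*?` absorbs the overshoot: the whole match still spans both entries. The `[^"]*` variant stays inside the first tag.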
——————————————————————————————————————————————————————————————
One more thing:
Only when I write it like this do I get a result: String regex = "alt=\"[^\"|[\u4e00-\u9fa5\ufe30-\uffa00-9\\-]]+\"";
These two are wrong:
String regex = "alt=[\u4e00-\u9fa5\ufe30-\uffa00-9\\-]+";
String regex = "alt=\"[\u4e00-\u9fa5\ufe30-\uffa00-9\\-]+\"";
Why is that?
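import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Test code; regex is one of the three patterns above.
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher("alt=\"蜘蛛侠:剧毒之城(Spider-Man Toxic City)\"");
while (matcher.find()) {
    System.out.println(matcher.group());
}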
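One way to narrow this down is to check which characters of the alt value the character class actually accepts. The sketch below assumes the colon and parentheses in the test string are the ASCII ones, as they appear above, and writes the CJK text as Unicode escapes. AltClassDemo, rejectedBy and finds are hypothetical names used only for this illustration. The class lists CJK characters, U+FE30 to U+FFA0, digits and the hyphen, so ASCII letters, spaces, `:`, `(`, `)` and `"` all fall outside it.

```java
import java.util.regex.Pattern;

public class AltClassDemo {
    // Character class copied from the question.
    static final String CLS = "[\u4e00-\u9fa5\ufe30-\uffa00-9\\-]";

    // Test input from the question, CJK text written as Unicode escapes.
    static final String INPUT =
        "alt=\"\u8718\u86db\u4fa0:\u5267\u6bd2\u4e4b\u57ce(Spider-Man Toxic City)\"";

    // Collects every character of the input that the class does NOT accept.
    static String rejectedBy(String charClass, String input) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder rejected = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            String ch = String.valueOf(input.charAt(i));
            if (!p.matcher(ch).matches()) {
                rejected.append(ch);
            }
        }
        return rejected.toString();
    }

    // True if the regex matches anywhere in the input.
    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Strip the leading alt=" and the trailing quote to get the attribute value.
        String value = INPUT.substring(5, INPUT.length() - 1);
        System.out.println(rejectedBy(CLS, value));               // :(SpiderMan Toxic City)
        System.out.println(finds("alt=" + CLS + "+", INPUT));     // false: '"' follows alt=
        System.out.println(finds("alt=\"" + CLS + "+\"", INPUT)); // false: ':' is not in the class
        System.out.println(finds("alt=\"[^\"]+\"", INPUT));       // true
    }
}
```

This would explain why the two simpler patterns find nothing: the first hits `"` right after `alt=`, and the second stops at the first character outside the class before reaching the closing quote. As for the pattern that does work, note that `|` inside `[...]` is a literal character rather than alternation, and as far as I know the way a negated class combines with a nested `[...]` has differed between Java versions, so a plain `[^\"]+` is probably the less surprising way to say "everything up to the closing quote".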