<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<title></title>
</head>
<body>
<form id="form1" runat="server">
<div>
<section class="main-sort" id="order">
<div class="type-sort fn-left">
<span>排序:</span><a href="javascript:;" data-seed="sort-tj" class="sort-tj cur"><span class="sort-tjIcon">推荐</span></a>
<a href="javascript:;" data-seed="sort-price" class="sort-price" data-price="0"><span class="sort-priceIcon">价格</span></a>
<input type="checkbox" class="J_sort J_sortCheck" data-seed="sort-yh" data-item="sale" data-val=""><label>优惠促销</label>
<input type="checkbox" class="J_sort J_sortCheck" data-seed="sort-bzz" data-item="halftrip" data-val=""><label>半自助</label>
</div>
<div class="common-page fn-right">
<span class="j-pageCurrent">1</span>/<span class="j-pageAll">1</span>
<a href="javascript:;" class="page-prev no-page">上一页</a>
<a href="javascript:;" class="page-next no-page">下一页</a>
</div>
</section>
<section class="listbox">
<div class="ListLoding" style="display: none;">数据正在加载。。。</div>
<ul class="listul J_pagelist">
<li data-info="[12]" data-days="6天" data-to="尊爵贵族线" data-halftrip="0" data-tag="尊爵贵族线" data-price="6466" data-chunk="line-SHS67909" class="lineitem cfix" style="display: list-item;">
<div class="img fn-left">
<a title="【纯净游】轻松台湾纯玩西线6日游" target="_blank" href="http://sh.uzai.com/tour-67909.html"><img width="125px" height="67px" alt="" data-img="http://r.uzaicdn.com/pic/15434/m/w160/h120/t1" src="http://r.uzaicdn.com/pic/15434/m/w160/h120/t1"></a>
<div class="prd-num">产品编号:SHS67909</div>
</div>
<dl class="info fn-left">
<dt class="t">
<a title="【纯净游】轻松台湾纯玩西线6日游" target="_blank" href="http://sh.uzai.com/tour-67909.html">【纯净游】轻松台湾纯玩西线6日游</a>
</dt>
<dd class="desc">台北连住二晚市区5花酒店、尽享台北市区半天的自由活动,随心所欲安排您的行程</dd>
<dd class="moredesc">
<span>满意度:<span class="n">96%</span></span>
<span class="pin"><span class="n">6</span>人点评</span>
<span>最近出发班期:<span class="n">12/22</span></span>
<a class="date" onclick="mychecks(67909,this,'http://sh.uzai.com/tour-67909.html')" href="javascript:;">全部班期</a>
</dd>
</dl>
<div class="detail fn-right">
<p class="price"><span class="u">¥</span><span class="n">6466</span>起</p>
<span data-di="50" rel="J_popDisong" class="d J_powerFloat"><em class="dsnum">50</em></span>
<span data-song="200" rel="J_popDisong" class="s m-5 J_powerFloat"><em class="dsnum">200</em></span>
</div>
</li>
<li data-info="[12][1]" data-days="8天" data-to="尊爵贵族线" data-halftrip="0" data-tag="尊爵贵族线" data-price="7366" data-chunk="line-SHS67808" class="lineitem cfix" style="display: list-item;">
<div class="img fn-left">
<a title="【发现美味】台湾纯玩环岛8日(国航往返)" target="_blank" href="http://sh.uzai.com/tour-67808.html"><img width="125px" height="67px" alt="" data-img="http://r.uzaicdn.com/pic/15451/m/w160/h120/t1" src="http://r.uzaicdn.com/pic/15451/m/w160/h120/t1"></a>
<div class="prd-num">产品编号:SHS67808</div>
</div>
<dl class="info fn-left">
<dt class="t">
<a title="【发现美味】台湾纯玩环岛8日(国航往返)" target="_blank" href="http://sh.uzai.com/tour-67808.html">【发现美味】台湾纯玩环岛8日(国航往返)</a>
</dt>
<dd class="desc">尝特色美食、台北连住二晚市区酒店、台南一晚升级五星香格里拉大饭店</dd>
<dd class="moredesc">
<span>满意度:<span class="n">96%</span></span>
<span class="pin"><span class="n">6</span>人点评</span>
<span>最近出发班期:<span class="n">12/29、1/5</span></span>
<a class="date" onclick="mychecks(67808,this,'http://sh.uzai.com/tour-67808.html')" href="javascript:;">全部班期</a>
</dd>
</dl>
<div class="detail fn-right">
<p class="price"><span class="u">¥</span><span class="n">7366</span>起</p>
<span data-di="50" rel="J_popDisong" class="d J_powerFloat"><em class="dsnum">50</em></span>
<span data-song="200" rel="J_popDisong" class="s m-5 J_powerFloat"><em class="dsnum">200</em></span>
</div>
</li>
</ul>
<div class="noshuju" style="display: none;">
<span>对不起,没有找到符合条件的产品!</span><a href="javascript:;">重新筛选</a>
</div>
</section>
</div>
</form>
</body>
</html>
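// Requires: using System; using System.IO; using System.Text; using System.Text.RegularExpressions;
void Test3()
{
    // Read the saved page; the file is GB2312-encoded.
    string s = File.ReadAllText(@"E:\test\网页抓取测试\3.txt", Encoding.GetEncoding("gb2312"));
    // Count the product rows: each one starts with <li data-info="[
    int result = Regex.Matches(s, "<li data-info=\"\\[").Count;
    Console.WriteLine(result);
}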
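There is always some part that differs; just match on that part to tell the items apart.
And what if, as you say, the page contained several <ul class="listul J_pagelist"> elements? Then you would not know which ul's li items to take.
The <ul class="listul J_pagelist"> class appears exactly once in the whole page, so it is a unique identifier. How would the regex be written if it included that?
If you really must include the ul, you have to do it in two steps: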
string html = Regex.Match(s, "(?is)<ul class=\"listul J_pagelist\">.*?</ul>").Value;
Console.WriteLine(Regex.Matches(html, "<li data-info=\"\\[").Count);
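The two-step idea above can be wrapped into a small helper. This is only a sketch: the class name `ListItemCounter` and the method `CountItems` are made up for illustration, and it assumes the list contains no nested <ul>, because the lazy `.*?` stops at the first </ul> it finds.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class ListItemCounter
{
    // Step 1: isolate the first <ul class="listul J_pagelist"> ... </ul> block.
    // (?is) = ignore case + let '.' match newlines.
    // Assumption: no nested <ul> inside the list, since .*? ends at the first </ul>.
    private static readonly Regex ListBlock =
        new Regex("(?is)<ul class=\"listul J_pagelist\">.*?</ul>");

    // Step 2: every product row starts with <li data-info="[
    private static readonly Regex ItemStart = new Regex("<li data-info=\"\\[");

    // Returns the number of product rows inside the list, or 0 if the list is missing.
    public static int CountItems(string html)
    {
        Match list = ListBlock.Match(html);
        if (!list.Success)
        {
            return 0;
        }
        return ItemStart.Matches(list.Value).Count;
    }
}
```

Calling `ListItemCounter.CountItems(s)` with the page text read in `Test3` counts only the rows inside that one list, so a stray `<li data-info="[` elsewhere on the page is not counted.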