Looking for a regular expression to match table data in HTML

NetAnt007 2011-10-07 11:46:50
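I want to extract each currency's name and its quote data, without the table header. Only the currency names and the data are needed.
The HTML is as follows: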
<table cellspacing="0" cellpadding="0" width="100%">
<tr>
<td align="center">
<table cellspacing="0" cellpadding="0" border="0" style="width: 100%; border-collapse: collapse; word-break: break-all; word-wrap: break-word">
<tr>
<td valign="bottom" colspan="100">
<table cellspacing="0" cellpadding="0" border="0" style="width: 100%; border-collapse: collapse;word-break: break-all; word-wrap: break-word">
<tr style="height: 30px;">
<td class="tdCommonTableHeader" align="left" valign="bottom" style="width: 50%;">日期:2011年09月30日 星期五</td>
<td class="tdCommonTableHeader" align="right" valign="bottom" colspan="100" style="width: 50%;">单位:人民币/100外币
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td valign="top" colspan="100">
<table class="tableDataTable" cellspacing="0" cellpadding="0" border="0" style="border-style: Solid;width: 100%; border-collapse: collapse; word-break: break-all; word-wrap: break-word">
<tr style="background-color: #E8E8E8; font-weight: bold; height: 30px;">
<td class="tdCommonTableHead" align="center" valign="middle" style="width: 28%;">币种</td>
<td class="tdCommonTableHead" align="center" valign="middle" style="width: 16%;">汇买、汇卖<br>中间价</td>
<td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">现汇买入价</td>
<td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">现钞买入价</td>
<td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">卖出价</td>
<td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">发布时间</td>
</tr>
<tr style="height: 20px;">
<td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">美元(USD)</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">638.23</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">636.83</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">631.72</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">639.38</td>
<td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">17:01:50</td>
</tr>
<tr style="height: 20px;">
<td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">港币(HKD)</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">82.01</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">81.83</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">81.17</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">82.16</td>
<td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
</tr>
<tr style="height: 20px;">
<td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">日元(JPY)</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">8.3114</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">8.2765</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">8.0105</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">8.3430</td>
<td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
</tr>
<tr style="height: 20px;">
<td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">欧元(EUR)</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">860.59</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">856.98 </td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">829.44</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">863.86</td>
<td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
</tr>
<tr style="height: 20px;">
<td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">英镑(GBP)</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">996.47</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">992.28</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">960.40</td>
<td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">1000.26</td>
<td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
</tr>
</table>
</td>
</tr>
<tr style="height: 30px;">
<td class="tdCommonTableFooter" align="right" valign="middle" colspan="100">备注:此汇率为我行初始报价,成交价以各地分行实际交易汇率为准</td>
</tr>
<tr style="height: 30px;"></tr>
</table>
</td>
</tr>
<tr>
<td align="center">
<input type="submit" name="refurbish" value=" 刷新 " id="refurbish" />
</td>
</tr>
</table>
5 replies
xingxingxiaofei 2011-11-20
dsfafadsfasdfadsfadsfasd
cjh200102 2011-10-08
Nice answer from the previous poster.
huangwenquan123 2011-10-08

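using System; using System.IO; using System.Text; using System.Text.RegularExpressions;
string str = File.ReadAllText(@"E:\1.txt", Encoding.GetEncoding("gb2312")); // page saved as GB2312
// The lookbehind pins the match inside the tableDataTable table. The first <tr> (header row) is
// consumed without capturing; group 1 then captures every <td> of the data rows that follow.
Regex reg = new Regex(@"(?is)(?<=<table[^>]*?class=""tableDataTable""[^>]*?>(?:(?!</?table).)*)<tr[^>]*?>.*?</tr>(?:\s*<tr[^>]*?>(?:\s*<td[^>]*?>(.*?)</td>)*\s*</tr>)*");
foreach (Match m in reg.Matches(str))
    foreach (Capture c in m.Groups[1].Captures)
        Console.WriteLine(c.Value.Trim()); // Trim drops stray spaces such as in "856.98 "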
/*
美元(USD)
638.23
636.83
631.72
639.38
17:01:50
港币(HKD)
82.01
81.83
81.17
82.16
23:15:26
日元(JPY)
8.3114
8.2765
8.0105
8.3430
23:15:26
欧元(EUR)
860.59
856.98
829.44
863.86
23:15:26
英镑(GBP)
996.47
992.28
960.40
1000.26
23:15:26
*/
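
The reply above prints every cell as one flat list. If one record per currency is wanted, the captures can be cut into fixed-size rows. This is only a sketch under the assumption that every data row has the same number of cells (six in the page from the question); `RowGroupingSketch` and `GroupRows` are hypothetical names, and the sample in `Main` is a cut-down two-column table, not the real page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RowGroupingSketch
{
    // Same pattern as in the reply above: the header row is consumed without
    // capturing, and group 1 captures each <td> of the following data rows.
    private static readonly Regex Rows = new Regex(
        @"(?is)(?<=<table[^>]*?class=""tableDataTable""[^>]*?>(?:(?!</?table).)*)" +
        @"<tr[^>]*?>.*?</tr>(?:\s*<tr[^>]*?>(?:\s*<td[^>]*?>(.*?)</td>)*\s*</tr>)*");

    // Hypothetical helper: flattens all captured cells, then cuts them into
    // fixed-size rows. ASSUMPTION: every data row has exactly cellsPerRow cells.
    public static List<string[]> GroupRows(string html, int cellsPerRow)
    {
        if (cellsPerRow <= 0)
            throw new ArgumentOutOfRangeException(nameof(cellsPerRow));

        List<string> cells = Rows.Matches(html)
            .Cast<Match>()
            .SelectMany(m => m.Groups[1].Captures.Cast<Capture>())
            .Select(c => c.Value.Trim())
            .ToList();

        var rows = new List<string[]>();
        for (int i = 0; i + cellsPerRow <= cells.Count; i += cellsPerRow)
            rows.Add(cells.GetRange(i, cellsPerRow).ToArray());
        return rows;
    }

    public static void Main()
    {
        // Cut-down sample with two columns instead of six.
        string html =
            "<table class=\"tableDataTable\" border=\"0\">\n" +
            "<tr><td>币种</td><td>卖出价</td></tr>\n" +
            "<tr><td>美元(USD)</td><td>639.38</td></tr>\n" +
            "<tr><td>港币(HKD)</td><td>82.16 </td></tr>\n" +
            "</table>";

        foreach (string[] row in GroupRows(html, 2))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

For the full page from the question, the call would be `GroupRows(str, 6)`. A row with a missing cell would shift every later row, so checking the cell count per row is worth adding before relying on this.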
jshi123 2011-10-08

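using System.Linq; using System.Text.RegularExpressions;

string s = "......"; // html string
// Group 1 repeats once per <tr>, so its Captures collection holds every row of the table.
string pattern = @"(?is)<table[^>]+?class=""tableDataTable""[^>]*>\s*(<tr.*?>.+?</tr>\s*)+</table>";
var list = Regex.Match(s, pattern).Groups[1].Captures.Cast<Capture>()
    .Skip(1) // skip the header row
    .Select(c =>
    {
        // Pull the inner text of each <td> in this row.
        var td = Regex.Matches(c.Value, "<td.*?>(.*?)</td>")
            .Cast<Match>().Select(m => m.Groups[1].Value).ToArray();
        return new
        {
            币种 = td[0],       // currency
            中间价 = td[1],     // middle rate
            现汇买入价 = td[2], // spot exchange buying rate
            现钞买入价 = td[3], // cash buying rate
            卖出价 = td[4],     // selling rate
            发布时间 = td[5]    // publish time
        };
    }).ToList();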
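
The anonymous objects above keep every field as a string. If the rates are needed as numbers, the same two patterns can feed a small typed record. The following is a sketch, not part of the original reply: the `Quote` class, its property names, and `ParseQuotes` are hypothetical, and it assumes the values use a dot as the decimal separator and `HH:mm:ss` times, as in the sample page.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical record type; the property names are English stand-ins for the
// column headers in the question's table.
public sealed class Quote
{
    public string Currency { get; set; } = "";
    public decimal MiddleRate { get; set; }
    public decimal SpotBuyingRate { get; set; }
    public decimal CashBuyingRate { get; set; }
    public decimal SellingRate { get; set; }
    public TimeSpan PublishedAt { get; set; }
}

public static class TypedRatesSketch
{
    // Table pattern taken from the reply above; group 1 captures one <tr> per repetition.
    private const string TablePattern =
        @"(?is)<table[^>]+?class=""tableDataTable""[^>]*>\s*(<tr.*?>.+?</tr>\s*)+</table>";
    private const string CellPattern = @"(?is)<td.*?>(.*?)</td>";

    public static List<Quote> ParseQuotes(string html)
    {
        Match table = Regex.Match(html, TablePattern);
        if (!table.Success)
            return new List<Quote>();

        CultureInfo inv = CultureInfo.InvariantCulture; // ASSUMPTION: "638.23"-style numbers
        return table.Groups[1].Captures.Cast<Capture>()
            .Skip(1) // skip the header row
            .Select(row => Regex.Matches(row.Value, CellPattern)
                .Cast<Match>()
                .Select(m => m.Groups[1].Value.Trim())
                .ToArray())
            .Where(td => td.Length >= 6) // ignore rows that do not have all six cells
            .Select(td => new Quote
            {
                Currency = td[0],
                MiddleRate = decimal.Parse(td[1], inv),
                SpotBuyingRate = decimal.Parse(td[2], inv),
                CashBuyingRate = decimal.Parse(td[3], inv),
                SellingRate = decimal.Parse(td[4], inv),
                PublishedAt = TimeSpan.Parse(td[5], inv)
            })
            .ToList();
    }

    public static void Main()
    {
        // One header row and one data row copied from the question's table, attributes trimmed.
        string html =
            "<table border=\"0\" class=\"tableDataTable\">\n" +
            "<tr><td>币种</td><td>中间价</td><td>现汇买入价</td><td>现钞买入价</td><td>卖出价</td><td>发布时间</td></tr>\n" +
            "<tr><td>美元(USD)</td><td>638.23</td><td>636.83</td><td>631.72</td><td>639.38</td><td>17:01:50</td></tr>\n" +
            "</table>";

        foreach (Quote q in ParseQuotes(html))
            Console.WriteLine(q.Currency + " sells at " + q.SellingRate.ToString(CultureInfo.InvariantCulture));
    }
}
```

`decimal.Parse` and `TimeSpan.Parse` throw on malformed cells; `TryParse` would be the gentler choice if the page layout is not fully trusted.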
NetAnt007 2011-10-07
The HTML above was a bit messy, so here is a tidied-up version that is easier to read:
<table cellspacing="0" cellpadding="0" width="100%">
  <tr>
    <td align="center">
      <table cellspacing="0" cellpadding="0" border="0" style="width: 100%; border-collapse: collapse; word-break: break-all; word-wrap: break-word">
        <tr>
          <td valign="bottom" colspan="100">
            <table cellspacing="0" cellpadding="0" border="0" style="width: 100%; border-collapse: collapse;word-break: break-all; word-wrap: break-word">
              <tr style="height: 30px;">
                <td class="tdCommonTableHeader" align="left" valign="bottom" style="width: 50%;">日期:2011年09月30日 星期五</td>
                <td class="tdCommonTableHeader" align="right" valign="bottom" colspan="100" style="width: 50%;">单位:人民币/100外币
                </td>
              </tr>
            </table>
          </td>
        </tr>
        <tr>
          <td valign="top" colspan="100">
            <table class="tableDataTable" cellspacing="0" cellpadding="0" border="0" style="border-style: Solid;width: 100%; border-collapse: collapse; word-break: break-all; word-wrap: break-word">
              <tr style="background-color: #E8E8E8; font-weight: bold; height: 30px;">
                <td class="tdCommonTableHead" align="center" valign="middle" style="width: 28%;">币种</td>
                <td class="tdCommonTableHead" align="center" valign="middle" style="width: 16%;">汇买、汇卖<br>中间价</td>
                <td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">现汇买入价</td>
                <td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">现钞买入价</td>
                <td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">卖出价</td>
                <td class="tdCommonTableHead" align="center" valign="middle" style="width: 14%;">发布时间</td>
              </tr>
              <tr style="height: 20px;">
                <td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">美元(USD)</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">638.23</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">636.83</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">631.72</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">639.38</td>
                <td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">17:01:50</td>
              </tr>
              <tr style="height: 20px;">
                <td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">港币(HKD)</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">82.01</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">81.83</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">81.17</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">82.16</td>
                <td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
              </tr>
              <tr style="height: 20px;">
                <td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">日元(JPY)</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">8.3114</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">8.2765</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">8.0105</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">8.3430</td>
                <td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
              </tr>
              <tr style="height: 20px;">
                <td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">欧元(EUR)</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">860.59</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">856.98 </td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">829.44</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">863.86</td>
                <td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
              </tr>
              <tr style="height: 20px;">
                <td class="tdCommonTableData" align="left" valign="middle" style="width: 28%;">英镑(GBP)</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 16%;">996.47</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">992.28</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">960.40</td>
                <td class="tdCommonTableData" align="right" valign="middle" style="width: 14%;">1000.26</td>
                <td class="tdCommonTableData" align="center" valign="middle" style="width: 14%;">23:15:26</td>
              </tr>
            </table>
          </td>
        </tr>
        <tr style="height: 30px;">
          <td class="tdCommonTableFooter" align="right" valign="middle" colspan="100">备注:此汇率为我行初始报价,成交价以各地分行实际交易汇率为准</td>
        </tr>
        <tr style="height: 30px;"></tr>
      </table>
    </td>
  </tr>
  <tr>
    <td align="center">
      <input type="submit" name="refurbish" value=" 刷新 " id="refurbish" />
    </td>
  </tr>
</table>
