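Scraping information from a web page, with character encoding conversion
Here is the situation: I am scraping stock trading data from someone else's website, and the page source looks like this: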
<html>
<body>
<table id=historical_price class=gf-table>
<tr class=bb>
<th class="bb lm">日期
<th class="rgt bb">开盘价 <th class="rgt bb">最高价 <th class="rgt bb">最低价
<th class="rgt bb">收盘价
<th class="rgt bb rm">成交量
</tr>
<tr>
<td class="lm">2010-04-23
<td class="rgt">18.18
<td class="rgt">18.50
<td class="rgt">17.99
<td class="rgt">18.18
<td class="rgt rm">4,317,567
</tr>
<tr>
<td class="lm">2010-04-22<td>
<td class="rgt">18.11
<td class="rgt">18.11
<td class="rgt">18.11
<td class="rgt">18.11
<td class="rgt rm">0
</tr>
<tr>
<td class="lm">2010-04-21
<td class="rgt">17.60
<td class="rgt">18.25
<td class="rgt">17.60
<td class="rgt">18.11
<td class="rgt rm">4,993,918
</table>
</body></html>
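I want to convert encoded text such as the 日期 header (which appears in encoded form in the page source) into the actual Chinese characters, and extract the data from the table. Could anyone help? The main difficulty is that the markup is not well-formed XML either.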
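One possible approach, sketched with Python's standard library only: `html.parser.HTMLParser` tolerates markup that is not well-formed XML (unquoted attributes, missing `</td>` and `</tr>`), and with `convert_charrefs=True` it decodes character references in the text it hands back. This is a rough sketch under assumptions, not a definitive solution. It assumes the headers are encoded as numeric character references (for example `&#26085;&#26399;` for 日期), that the page has already been decoded to a `str`, and that empty cells come from stray tags and can be dropped. The names `PriceTableParser` and `parse_prices` are made up for this illustration.

```python
from html.parser import HTMLParser


class PriceTableParser(HTMLParser):
    """Collects the rows of the table with id="historical_price" from tag soup."""

    def __init__(self):
        # convert_charrefs=True decodes references such as &#26085; in text data.
        super().__init__(convert_charrefs=True)
        self.rows = []
        self._in_table = False
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None:
            text = "".join(self._cell).strip()
            if text:  # assumption: an empty cell is a stray <td>, so skip it
                self._row.append(text)
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == "historical_price":
            self._in_table = True
        elif not self._in_table:
            return
        elif tag == "tr":
            # The closing </tr> may be missing, so a new <tr> ends the old row.
            self._close_row()
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            # </td> and </th> are omitted, so a new cell ends the previous one.
            self._close_cell()
            self._cell = []

    def handle_endtag(self, tag):
        if not self._in_table:
            return
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()
        elif tag == "table":
            self._close_row()
            self._in_table = False

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def parse_prices(html_text):
    """Return (header, records) as lists of strings; both empty if no table found."""
    parser = PriceTableParser()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    if not parser.rows:
        return [], []
    header, *records = parser.rows
    return header, records
```

Calling `parse_prices(page_text)` on the page source should give the header row and one list per trading day. Values stay as strings; a volume such as `4,317,567` could then be converted with `int(value.replace(",", ""))`. For a standalone string, `html.unescape` performs the same character reference decoding.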