Regex.Replace: how do I write a regex that keeps only the Chinese text and digits in an HTML file?

程序可以让尸体动起来 2006-07-17 11:15:31

I used the code from 孟子:
content = Regex.Replace(content,"<[^>]*>", "");
// collapse runs of whitespace into a single space
content = Regex.Replace(content,"\\s+", " ");
Why doesn't it work??

-------------------------------------------------------------------------
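So I wrote a clumsier version myself:

content =content.Replace ("宋体","");
string temp =Regex.Replace(content,@"[^\x00-\xff]","").ToString();  // strip the double-byte characters, leaving only the single-byte ones

char[] strarr = temp.ToCharArray();

for (int i = 0 ; i < strarr.Length ;i++)
{
    content = content.Replace(strarr[i].ToString(),"");  // remove every one of those single-byte characters
}

Written this way, the original single-byte characters, digits included, get removed as well.

Can anyone suggest a way to keep only the text and digits in an HTML file? I'm waiting online and will close the thread as soon as it's solved. Thanks.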

程序可以让尸体动起来 2006-07-17

Bumping this. Are there any other approaches?

winner2050 2006-07-17

Do you want to strip the HTML tags, or literally just keep "the Chinese text and digits in the HTML file"?

If it is the Chinese text and digits in the HTML file, the most basic techniques will do.
content=content.ToLower();
content =content.Replace ("a","");
content =content.Replace ("b","");
...
content =content.Replace ("z","");
content =content.Replace (".","");
content =content.Replace ("?","");
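
The character-by-character removal above can also be expressed as a single whitelist regex that deletes everything except the wanted characters. This is only a minimal sketch, assuming "Chinese" means the CJK Unified Ideographs block (U+4E00 to U+9FFF) and "digits" means ASCII 0-9; the class and method names are made up for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class ChineseDigitFilter
{
    // Deletes every character that is NOT a CJK Unified Ideograph
    // (U+4E00..U+9FFF, an assumed definition of "Chinese") or an ASCII digit.
    public static string KeepChineseAndDigits(string content)
    {
        return Regex.Replace(content, @"[^\u4e00-\u9fff0-9]", "");
    }
}
```

For example, ChineseDigitFilter.KeepChineseAndDigits("abc中文123,def。") returns "中文123". Chinese punctuation such as "。" lies outside the assumed range, so it is dropped too; widen the character class if it should be kept.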

Otherwise:
/// <summary>
/// Strips HTML markup.
/// </summary>
/// <param name="Htmlstring">Source text that contains HTML</param>
/// <returns>The text with the markup removed</returns>
public static string NoHTML(string Htmlstring)
{
    // Remove scripts (Singleline lets '.' match across line breaks)
    Htmlstring = Regex.Replace(Htmlstring,@"<script[^>]*?>.*?</script>","",RegexOptions.IgnoreCase | RegexOptions.Singleline);
    // Remove HTML tags, leftover comment markers, and common entities
    Htmlstring = Regex.Replace(Htmlstring,@"<(.[^>]*)>","",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"([\r\n])[\s]+","",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"-->","",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"<!--.*","",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(quot|#34);","\"",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(amp|#38);","&",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(lt|#60);","<",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(gt|#62);",">",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(nbsp|#160);"," ",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(iexcl|#161);","\xa1",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(cent|#162);","\xa2",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(pound|#163);","\xa3",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&(copy|#169);","\xa9",RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring,@"&#(\d+);","",RegexOptions.IgnoreCase);

    // string.Replace returns a new string, so the result must be assigned back
    Htmlstring = Htmlstring.Replace("<","");
    Htmlstring = Htmlstring.Replace(">","");
    Htmlstring = Htmlstring.Replace("\r\n","");
    Htmlstring=HttpContext.Current.Server.HtmlEncode(Htmlstring).Trim();

    return Htmlstring;
}
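
To get from tag stripping to what the question actually asks for, the two steps can be chained: remove the markup first, then apply a whitelist. The following is a rough sketch rather than a definitive implementation. It assumes regex-based stripping is good enough for the input (it is not a real HTML parser), that "Chinese" means U+4E00 to U+9FFF, and that entities should simply be discarded; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class HtmlTextFilter
{
    public static string KeepChineseAndDigitsFromHtml(string html)
    {
        // Drop <script> and <style> blocks together with their bodies;
        // Singleline lets '.' match across line breaks.
        string text = Regex.Replace(html, @"<(script|style)\b[^>]*>.*?</\1\s*>", "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        // Drop HTML comments.
        text = Regex.Replace(text, @"<!--.*?-->", "", RegexOptions.Singleline);
        // Drop the remaining tags.
        text = Regex.Replace(text, @"<[^>]*>", "");
        // Drop entities such as &nbsp; or &#160; so their digits do not leak into the result.
        text = Regex.Replace(text, @"&#?[a-zA-Z0-9]+;", "");
        // Keep only CJK Unified Ideographs (assumed range) and ASCII digits.
        return Regex.Replace(text, @"[^\u4e00-\u9fff0-9]", "");
    }
}
```

Stripping the markup before filtering matters: otherwise digits inside attributes or CSS (for example "12px") and Chinese font names such as "宋体" would survive the whitelist.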

Also, stop relying on 孟子E章; many of the articles there don't work, or they leave out key details.

程序可以让尸体动起来 2006-07-17