JS regular expression fails to remove duplicate words. What is the cause?

wangzhichao666 2012-03-07 08:30:02
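Below is a demo I wrote based on the code in the book "JavaScript: The Good Parts" (《javascript语言精粹》). What I want is to go through the text and print every word that occurs in it exactly once, but it does not seem to do what I need. Any help is appreciated.
Note: the result I want is: <div id="test2">activity Sizzle It! is the expert in producing reels that capture your message and captivate audience — all with creativity style.</div>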

<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<title></title>

<style type="text/css">
#test1,#test2{width:500px;height:200px;border:1px solid #00f;margin-bottom:20px;}
</style>
<script type="text/javascript">
window.onload = function () {
var test1 = document.getElementById('test1'),
test2 = document.getElementById('test2'),
textSource = test1.innerHTML,
textEscape;
var textRegExp = /([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+)\s+\1/g; // matches a repeated word
textEscape = textSource.replace(textRegExp,"$1");
test2.innerHTML = textEscape;

}
</script>
</head>
<body>
<div id="test1">activity Sizzle It! is is the expert in producing sizzle reels that capture your message and captivate your audience — all with creativity and style. expert sizzle reels that capture</div>
<div id="test2"></div>
</body>
</html>
7 replies
峭沙 2012-03-07
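JS does not support lookbehind, so to delete the later duplicates you have to reverse the string first, process it, and then reverse it back. That is the only approach I can think of for now; I don't know whether there is a better solution.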
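Lookbehind was not available in JavaScript when this was written; it was added to the language in ES2018. On an engine that supports it, something along the following lines may avoid the double reversal. This is only a sketch: the function name `removeLaterDuplicates` and the whitespace cleanup at the end are assumptions of mine, not part of the thread, and the character class is copied from the question.

```javascript
// Hypothetical helper (not from the thread): drop every word that already
// appeared earlier in the text, keeping the first occurrence.
// Requires lookbehind support (ES2018 and later).
function removeLaterDuplicates(text) {
  // Group 1 matches a whole word; the lookbehind then checks that the same
  // word (case-insensitively, because of the "i" flag) also occurs earlier.
  var laterDuplicate =
    /\b([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+)\b(?<=\b\1\b.*\b\1\b)/gi;
  return text
    .replace(laterDuplicate, "")
    .replace(/\s{2,}/g, " ") // assumption: collapse the gaps left behind
    .trim();
}

console.log(removeLaterDuplicates("is is the expert Is the"));
```

Like the reversal approach, this rescans the preceding text for every word, so it is only reasonable for short inputs.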
峭沙 2012-03-07
<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<title></title>

<style type="text/css">
#test1,#test2{width:500px;height:200px;border:1px solid #00f;margin-bottom:20px;}
</style>
<script type="text/javascript">
window.onload = function () {
var test1 = document.getElementById('test1'),
test2 = document.getElementById('test2'),
textSource = test1.innerHTML,
textEscape;
textSource = textSource.split('').reverse().join('');
var textRegExp = /(\b[A-Za-z\u00C0-\u1FFF\u2800-\uFFFD\'\-]+)(?=\b.*\1)/ig;
textEscape = textSource.replace(textRegExp,"").split('').reverse().join('');
test2.innerHTML = textEscape;
}
</script>
</head>
<body>
<div id="test1">activity Sizzle It! is is the expert in producing sizzle reels that capture your message and captivate your audience — all with creativity and style. expert sizzle reels that capture</div>
<div id="test2"></div>
</body>
</html>
zhangqinhappy 2012-03-07
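I could tell your regular expression was wrong, but I don't know how to fix it.
I ran it, and the code in the first reply does not seem to be right either.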
q107770540 2012-03-07
var textRegExp = /\b([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+)\b\s+\1/gi;  // matches a repeated word
textEscape = textSource.replace(textRegExp, "$1");
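To see what the added `\b` changes, the snippet below runs the question's pattern and the pattern above on two short inputs. The variable names are illustrative only, and `bothBoundaries` (with an extra `\b` after `\1`) is my own assumption, not something posted in the thread. Note that both patterns still only handle a word that is immediately repeated.

```javascript
// Character class copied from the thread; the variable names are illustrative.
var noBoundary = /([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+)\s+\1/g;
var leadingBoundary = /\b([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+)\b\s+\1/gi;
// Assumption: an extra \b after \1, which is not in the posted pattern.
var bothBoundaries = /\b([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+)\b\s+\1\b/gi;

// Without \b, the tail of "this" pairs up with the following "is".
console.log("this is fine".replace(noBoundary, "$1"));      // "this fine"
console.log("this is fine".replace(leadingBoundary, "$1")); // "this is fine"

// Without a boundary after \1, "the" also matches the start of "theory".
console.log("the theory".replace(leadingBoundary, "$1"));   // "theory"
console.log("the theory".replace(bothBoundaries, "$1"));    // "the theory"

// A genuine adjacent repeat is still collapsed.
console.log("is is the".replace(bothBoundaries, "$1"));     // "is the"
```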
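Handling this kind of problem with a regular expression is tricky, because the hardest part is deciding where one word ends and the next begins.
The code below makes some improvements but is still not perfect; some punctuation marks are still left behind.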
<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<title></title>

<style type="text/css">
#test1,#test2{width:500px;height:200px;border:1px solid #00f;margin-bottom:20px;}
</style>
<script type="text/javascript">
window.onload = function () {
var test1 = document.getElementById('test1'),
test2 = document.getElementById('test2'),
textSource = test1.innerHTML,
textEscape;

var textRegExp = new RegExp((
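"(\\w+(?:\\W+\\w+)*)" // a word, or a phrase of words separated by spaces or punctuation
+ "((?:\\W+\\w+)*?)" // any number of words between the repeats, separated by spaces or punctuation
+ "(\\W+)" // the separator just before the repeated content, e.g. a space or punctuation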
+ "\\1" //重复内容
), "g");
textEscape = textSource.replace(textRegExp,"$1$2$3");
test2.innerHTML = textEscape;
}
</script>
</head>
<body>
<div id="test1">activity,activity, Sizzle-Sizzle It It! hello world! hello world! activity Sizzle It, activity Sizzle It,</div>
<div id="test2"></div>
</body>
</html>
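Since word splitting is the awkward part for a backreference-based pattern, a different technique is to use the regex only to find words and to track the ones already seen in a `Set`. The sketch below does that; it is not the approach used in the thread. The function name `dedupeWords` and the whitespace cleanup are my own assumptions, the character class is copied from the question, and comparison is case-insensitive because the desired output in the question treats "Sizzle" and "sizzle" as the same word.

```javascript
// Hypothetical helper (not from the thread): keep only the first occurrence
// of each word, comparing case-insensitively.
function dedupeWords(text) {
  var seen = new Set();
  // The regex only tokenizes; duplicates are detected with the Set.
  var wordPattern = /[A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+/g;
  var withoutRepeats = text.replace(wordPattern, function (word) {
    var key = word.toLowerCase();
    if (seen.has(key)) {
      return ""; // already seen: drop this occurrence
    }
    seen.add(key);
    return word; // first occurrence: keep it as written
  });
  // Assumption: collapse the runs of whitespace left by removed words.
  return withoutRepeats.replace(/\s{2,}/g, " ").trim();
}

console.log(dedupeWords("is is the expert and the Expert"));
// "is the expert and"
```

Punctuation is left where it was, so a removed word can still leave stray punctuation behind, for example a comma that followed it.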
峭沙 2012-03-07
I improved the regex a bit; the earlier one had a small problem.
<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<title></title>

<style type="text/css">
#test1,#test2{width:500px;height:200px;border:1px solid #00f;margin-bottom:20px;}
</style>
<script type="text/javascript">
window.onload = function () {
var test1 = document.getElementById('test1'),
test2 = document.getElementById('test2'),
textRegExp = /\b([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD\'\-]+)\b(?=.*\b\1\b)/ig,
textSource = test1.innerHTML.split('').reverse().join(''),
textEscape = textSource.replace(textRegExp,"").split('').reverse().join('');
test2.innerHTML = textEscape;
}
</script>
</head>
<body>
<div id="test1">activity Sizzle It! is is the expert in producing sizzle reels that capture your message and captivate your audience — all with creativity and style. expert sizzle reels that capture</div>
<div id="test2"></div>
</body>
</html>
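That expression means a word is replaced only when it appears and is then immediately followed by the same word again.
Try it with the following words: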
activity activity Sizzle Sizzle It It!
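A quick check of that behaviour, using the pattern copied from the question (the variable name `adjacentRepeat` and the second input are illustrative assumptions):

```javascript
// Pattern copied from the question.
var adjacentRepeat = /([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD'\-]+)\s+\1/g;

// Back-to-back repeats are collapsed...
console.log("activity activity Sizzle Sizzle It It!".replace(adjacentRepeat, "$1"));
// "activity Sizzle It!"

// ...but a repeat separated by other words is left alone.
console.log("expert reels expert".replace(adjacentRepeat, "$1"));
// "expert reels expert"
```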
