也是关于删除重复数据

hsboy 2001-04-21 12:21:00
sql server 7.0
我有一个表用来记录留言,结构如下:
boardid:留言板id
userid:留言者id
msg:留言内容
[date]:留言日期

现在的问题是有一些用户疯狂灌水。我想把重复的数据(就是在同一个留言板(boardid)中的同一个用户发表的重复内容)删除,只保留一条。
请问该如何写sql语句?
如果您的方法是使用游标或者临时表或者视图,那么不必回复此问题。如果您只在oracle下试过并且是利用oracle一些特有的功能来实现的,也不必回复。
...全文
175 20 打赏 收藏 转发到动态 举报
写回复
用AI写文章
20 条回复
切换为时间正序
请发表友善的回复…
发表回复
hughie 2001-04-28
  • 打赏
  • 举报
回复
wwq_jeff(飞雪) 谢谢,不看不知道!
wwq_jeff 2001-04-24
  • 打赏
  • 举报
回复
嘿嘿!高手如云。不过hughie(雨神) 的SQL比我的SQL速度要慢很多,将近是我的1。5倍。毕竟用in 是很耗时间的。不信大家可以试试。
[hughie(雨神)]
Delete from table_name
where boardid not in
(select max(boardid) as boardid from table_name group by userid,msg)

[wwq_jeff(飞雪)]
delete t1 from table1 t1,(select v1.*,(select top 1 u1.fdate from table1 u1 where u1.boardid = v1.boardid and
u1.userid = v1.userid and u1.msg = v1.msg) fdate
from (select distinct boardid,userid,msg from table1) v1) t2
where t1.boardid = t2.boardid and t1.userid = t2.userid
and t1.msg = t2.msg and t1.fdate <> t2.fdate
hughie 2001-04-24
  • 打赏
  • 举报
回复
不客气,大家一起学习。
hsboy 2001-04-24
  • 打赏
  • 举报
回复
hughie:专家哦。晚些给你分。
wwq_jeff 2001-04-24
  • 打赏
  • 举报
回复
我用querry anaylzer测过的,大约是1.4几的倍数。我的SQL稍长了些,还可以在简化的。另外加top 速度不会慢的,只会快。

select * from t_item ---use 86.71%

select top 1 * from t_Item --use 13.29%

另:测试结果
速度最快的还是根据hughie(雨神) 改编的SQL :
delete t1 from #tt t1,(select v1.boardid,v1.userid,v1.msg,max(v1.fdate) fdate
from #tt v1 group by v1.boardid,v1.userid,v1.msg) t2
where t1.boardid = t2.boardid and t1.userid = t2.userid
and t1.msg = t2.msg and t1.fdate <> t2.fdate

测试数据:
--cost 20.24% <sql from hughie(雨神) >
Delete from #tt
where boardid not in
(select max(boardid) as boardid from #tt group by userid,msg)

delete from #tt
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.17")
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.47")
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.117")
insert into #tt values(1,2,"aa","2001-04-23 17:22:57.17")
insert into #tt values(1,2,"aa","2001-04-23 17:22:57.117")
insert into #tt values(2,1,"aa","2001-04-23 17:22:57.17")
insert into #tt values(2,1,"aa","2001-04-23 17:22:57.117")

---cost 22.07% <sql changed from wwq_jeff>
delete t1 from #tt t1,(select distinct v1.boardid,v1.userid,v1.msg,(select top 1 u1.fdate from #tt u1 where u1.boardid = v1.boardid and
u1.userid = v1.userid and u1.msg = v1.msg) fdate
from #tt v1) t2
where t1.boardid = t2.boardid and t1.userid = t2.userid
and t1.msg = t2.msg and t1.fdate <> t2.fdate

delete from #tt
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.17")
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.47")
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.117")
insert into #tt values(1,2,"aa","2001-04-23 17:22:57.17")
insert into #tt values(1,2,"aa","2001-04-23 17:22:57.117")
insert into #tt values(2,1,"aa","2001-04-23 17:22:57.17")
insert into #tt values(2,1,"aa","2001-04-23 17:22:57.117")

--cost 14.31% <sql from wwq_jeff>
delete t1 from #tt t1,(select v1.*,(select top 1 u1.fdate from #tt u1 where u1.boardid = v1.boardid and
u1.userid = v1.userid and u1.msg = v1.msg) fdate
from (select distinct boardid,userid,msg from #tt) v1) t2
where t1.boardid = t2.boardid and t1.userid = t2.userid
and t1.msg = t2.msg and t1.fdate <> t2.fdate

delete from #tt
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.17")
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.47")
insert into #tt values(1,1,"aa","2001-04-23 17:22:57.117")
insert into #tt values(1,2,"aa","2001-04-23 17:22:57.17")
insert into #tt values(1,2,"aa","2001-04-23 17:22:57.117")
insert into #tt values(2,1,"aa","2001-04-23 17:22:57.17")
insert into #tt values(2,1,"aa","2001-04-23 17:22:57.117")

--cost 10.53 <sql changed from hughie(雨神)>
delete t1 from #tt t1,(select v1.boardid,v1.userid,v1.msg,max(v1.fdate) fdate
from #tt v1 group by v1.boardid,v1.userid,v1.msg) t2
where t1.boardid = t2.boardid and t1.userid = t2.userid
and t1.msg = t2.msg and t1.fdate <> t2.fdate

hughie 2001-04-24
  • 打赏
  • 举报
回复
to [wwq_jeff(飞雪)]不错,高手。
用IN语句的确会慢一些,但不会象你说的慢那么多,几乎是察觉不到的,我的语句是因为在子查询里用了Group by 再加上 in也许会慢一些,但我想hsboy不是要把它嵌到程序里去执行,不会在乎值微乎其微的差距的。
你在语句里加了Top而且还有一堆的嵌套查询,速度也不会快到哪去吧,可惜我没有环境测试,
也没有精力建环境,不然我倒是要测一测。
hsboy 2001-04-24
  • 打赏
  • 举报
回复
wwq_jeff:
谢谢你的热心。我会给你分的。
gxdq 2001-04-23
  • 打赏
  • 举报
回复

用临时表制作。











wwq_jeff 2001-04-23
  • 打赏
  • 举报
回复
假设表名为table1

delete t1 from table1 t1,(select v1.*,(select top 1 u1.fdate from table1 u1 where u1.boardid = v1.boardid and
u1.userid = v1.userid and u1.msg = v1.msg) fdate
from (select distinct boardid,userid,msg from table1) v1) t2
where t1.boardid = t2.boardid and t1.userid = t2.userid
and t1.msg = t2.msg and t1.fdate <> t2.fdate

hughie 2001-04-23
  • 打赏
  • 举报
回复
你可以试一试
Delete from table_name
where boardid not in
(select max(boardid) as boardid from table_name group by userid,msg)
DeD 2001-04-23
  • 打赏
  • 举报
回复
create table #tmpData(boardid int,userid int,msg varchar(8000),[date] datetime)

--假设你的表名为test

insert into #tmpdata(boardid,userid,msg,[date])
select boardid,userid,msg,min([date])
from test
group by boardid,userid,msg

delete test from test ,#tmpData
where test.boardid=#tmpData.boardid
and test.msg=#tmpData.msg
and test.userid=#tmpData.userid
and test.[date]>#tmpData.[date]
drop table #tmpData
juneball 2001-04-22
  • 打赏
  • 举报
回复
关注,关注,再关注
hsboy 2001-04-22
  • 打赏
  • 举报
回复
我不understand,用你的方法,重复的数据将会被全部删除,但我想还要留下一条。:)
ccat 2001-04-22
  • 打赏
  • 举报
回复
写一个select distinct ... from ...把数据倒到一个临时表,再delete原来的表,把Distinct后的数据再insert回来,understand?
Springer 2001-04-21
  • 打赏
  • 举报
回复
考虑boardid, userid, msg 相同的记录只保留date最小的,利用子查询。
Fxx 2001-04-21
  • 打赏
  • 举报
回复
Delete table_name where rowid<>(
select max(rowid) from table_name t where
table_name.boardid=t.boardid);
假设boardid为主键,如果主键不止一个,
在table_name.boardid=t.boardid
后继续and即可
luhongjun 2001-04-21
  • 打赏
  • 举报
回复
关注
jjdelphi 2001-04-21
  • 打赏
  • 举报
回复
关注
WHQ 2001-04-21
  • 打赏
  • 举报
回复
做一触发器不上他加相同内容的数据
CREATE TRIGGER tr_insert
INSERT ON Board
FOR EACH ROW
BEGIN
IF EXIST SELECT * FROM Board WHERE UserID=inserted.UserID AND
TRIM(msg) = TRIM(inserted.msg) THEN
raise_error; -- 具体怎么写查手册
END IF;
END;
yangzi 2001-04-21
  • 打赏
  • 举报
回复
困难

34,593

社区成员

发帖
与我相关
我的任务
社区描述
MS-SQL Server相关内容讨论专区
社区管理员
  • 基础类社区
  • 二月十六
  • 卖水果的net
加入社区
  • 近7日
  • 近30日
  • 至今
社区公告
暂无公告

试试用AI创作助手写篇文章吧