再次请教一个统计查询

lsgt 2009-01-14 03:36:31

数据表如下:(实际记录很多,上千万条)
drop table if exists tmpt;
Create temporary table TestBase (intime datetime, id int,val1 int,val2 int);
Insert into tmpt values('2000-1-1' , 1,100,110);
Insert into tmpt values('2000-1-1' , 1,500,110);
Insert into tmpt values('2003-1-1' , 1,120,130);
Insert into tmpt values('2005-1-1' , 1,140,150);
Insert into tmpt values('2000-1-1' , 2,160,170);
Insert into tmpt values('2001-1-1' , 2,180,190);[code=SQL]

Insert into tmpt values('2004-1-1' , 2,200,210);
即:
intime ID VAL1 VAL2
2000-1-1 1 100 110
2000-1-1 1 500 110
2003-1-1 1 120 130
2005-1-1 1 140 150
2000-1-1 2 150 170
2001-1-1 2 180 190
2004-1-1 2 200 220

要求是给定统计起始日期(例如2000-1-1),结束日期(2007-12-31),要求按年统计ID为1的val1的平均值,ID为2的val2的和,且如果没有记录的年份,统计值填为0。即为以下结果:
年份 ID AVG(VAL1) ID SUM(VAL2)
2000 1 300 2 170
2001 1 0 2 190
2002 1 0 2 0
2003 1 120 2 0
2004 1 0 2 210
2005 1 140 2 0
2006 1 0 2 0

另外还需考虑除了按年统计之外,还可按月、日统计,其他要求相同。

整了两天了,大家帮帮忙。谢谢。
[/code]
...全文
82 18 打赏 收藏 转发到动态 举报
写回复
用AI写文章
18 条回复
切换为时间正序
请发表友善的回复…
发表回复
unixlinuxsys 2009-01-16
  • 打赏
  • 举报
回复
[Quote=引用 4 楼 WWWWA 的回复:]
呵呵,要按年、ID分组
select date_format(intime,'%Y-%m-%d'),id,
COALESCE(
sum(if(val1>0,val1,0))/sum(if(val1>0,1,0)),0) as new
from (
select * from tmpt
union
select '2000-01-01',1,0,0
union
select '2001-01-01',1,0,0
union
select '2002-01-01',1,0,0
union
select '2003-01-01',1,0,0
union
select '2004-01-01',1,0,0
union
select '2005-01-01',1,0,0
union
select '2006-01-01',1,0,0) a

[/Quote]


支持这个!
lsgt 2009-01-15
  • 打赏
  • 举报
回复
最终的测试结果如下:


//创建测试表
drop table if exists tmpt;
Create temporary table tmpt (intime datetime, id int,val1 int,val2 int);
Insert into tmpt values('2000-1-1' , 1,100,110);
Insert into tmpt values('2000-1-1' , 1,500,110);
Insert into tmpt values('2003-1-1' , 1,120,130);
Insert into tmpt values('2005-1-1' , 1,140,150);
Insert into tmpt values('2000-1-1' , 2,160,170);
Insert into tmpt values('2001-1-1' , 2,180,190);
Insert into tmpt values('2004-1-1' , 2,200,210);

//创建日历表,ACMAIN_CHM 的建议非常好,实际中会建立一个静态的日历表,50年按天也才1W6+记录。

drop table if exists calendar;
create table calendar(cdate datetime , constraint pk_calendar primary key (cdate));
insert into calendar value ('2000-01-01');
insert into calendar value ('2000-01-02');
insert into calendar value ('2000-01-03');
..
insert into calendar value ('2007-12-31');

//用ACMAIN_CHM的方法

select year(calendar.cdate),
1 as id1, ifnull(avg(IF(id=1,VAL1,null)),0) as avg1,
2 as id2, ifnull(sum(IF(id=2,VAL1,null)),0) as sum1,
1 as id3, ifnull(max(IF(id=1,VAL1,null)),0) as max1
from calendar left join tmpt on date(calendar.cdate)=date(tmpt.intime)
where cdate between '2000-1-1 0:0:0' and '2008-1-1 23:59:59'
group by year(calendar.cdate)

这个方法很慢,我用在我的实际的测试表里(36W+记录),等了十几分钟没出来结果,所以综合了WWWWA 的方法,写成这样:


select btime,1,sum(AVG1),2,sum(SUM1),1,sum(MAX1) from
(
select year(cdate) as btime,0 as id1,0 as AVG1,0 as id2,0 as SUM1,0 as id3,0 as MAX1 from calendar where cdate between '2000-1-1 0:0:0' and '2008-1-1 23:59:59' group by btime
union
(select year(intime) AS btime,
1 AS id1,avg(IF(id=1,VAL1,null)) as AVG1,
2 AS id2,sum(IF(id=2,VAL1,null)) as SUM1,
1 AS id3,max(IF(id=1,VAL1,null)) as MAX1
from tmpt where intime between '2000-1-1 0:0:0' and '2008-1-1 23:59:59' group by btime)
) T
group by T.btime


这句16ms就出来了正确结果。这个句子能很方便的动态构造,完美解决了我的问题。

谢谢两位热心的帮助。
wwwwb 2009-01-15
  • 打赏
  • 举报
回复
用1 as id1,2 as id2,
有点问题,如有上N个ID的话,要写多个,一般用UNION。
lsgt 2009-01-14
  • 打赏
  • 举报
回复
居然要发帖一天后才可以加分,明天来揭帖
lsgt 2009-01-14
  • 打赏
  • 举报
回复
再次谢谢
ACMAIN_CHM 2009-01-14
  • 打赏
  • 举报
回复
[Quote=引用 11 楼 lsgt 的回复:]
麻烦帮我看一下这样可以不:
select YEAR(intime),
1 as id1,ifnull(avg(if(id=1,VAL1,null)),0) as avg1,
2 as id2,ifnull(sum(if(id=2,VAL2,null)),0) as sum2
from tmpt
group by year(intime);

这样出来的结果就是两个统计的量都没有的年份,最后统计结果也没有。如果要解决这个问题,就只能做一个日历表了。不过先就这样吧,这样非常简洁,而且很适合扩充:实际程序里面统计起止时间是可任意选定的,要统计的…
[/Quote]

因为你还要按月,按日,所以建议你话一个日历表,建好表,利用EXCEL下拉公式产生所有的日历的INSERT语句,然后直接贴到mysql命令中,几分钟的操作。
ACMAIN_CHM 2009-01-14
  • 打赏
  • 举报
回复
[Quote=引用 10 楼 lsgt 的回复:]
顺便问一下ACMAIN_CHM,你那个表格怎么弄进来的,很漂亮啊
[/Quote]

直接在mysql中运行的结果啊。
lsgt 2009-01-14
  • 打赏
  • 举报
回复
麻烦帮我看一下这样可以不:
select YEAR(intime),
1 as id1,ifnull(avg(if(id=1,VAL1,null)),0) as avg1,
2 as id2,ifnull(sum(if(id=2,VAL2,null)),0) as sum2
from tmpt
group by year(intime);

这样出来的结果就是两个统计的量都没有的年份,最后统计结果也没有。如果要解决这个问题,就只能做一个日历表了。不过先就这样吧,这样非常简洁,而且很适合扩充:实际程序里面统计起止时间是可任意选定的,要统计的量也是可以任意添加,而表中不同的ID号最多会有200多。

再次谢谢两位。不过怎么加多点分呢?
lsgt 2009-01-14
  • 打赏
  • 举报
回复
顺便问一下ACMAIN_CHM,你那个表格怎么弄进来的,很漂亮啊
lsgt 2009-01-14
  • 打赏
  • 举报
回复
谢谢两位,我再领会一下。

如果解决了再给两位加分。
WWWWA 2009-01-14
  • 打赏
  • 举报
回复
如用临时表话,ID取工作表的ID即可
WWWWA 2009-01-14
  • 打赏
  • 举报
回复
注意:上述代码假设VAL1、VAL2均不为0,如实际记录中有0,则将UNION中的0改为
其它数,比如-99
WWWWA 2009-01-14
  • 打赏
  • 举报
回复
如年比较多,生成临时表,与工作表连接,代码如上述,
但如果你要每一个ID的话,则每年加一个ID,如果有N个ID的话,
数据量比较在,比如2000-2006,如要2个ID,则要加入14条记录

完整代码:
select date_format(intime,'%Y-%m-%d'),
COALESCE(
sum(if(id=1 and val1>0,val1,0))/sum(if(id=1 and val1>0,1,0)),0) as new1,
COALESCE(
sum(if(id=2 and val2>0,val2,0))/sum(if(id=2 and val2>0,1,0)),0) as new2

from (
select * from tmpt
union
select '2000-01-01',1,0,0
union
select '2001-01-01',1,0,0
union
select '2002-01-01',1,0,0
union
select '2003-01-01',1,0,0
union
select '2004-01-01',1,0,0
union
select '2005-01-01',1,0,0
union
select '2006-01-01',1,0,0
union
select '2000-01-01',2,0,0
union
select '2001-01-01',2,0,0
union
select '2002-01-01',2,0,0
union
select '2003-01-01',2,0,0
union
select '2004-01-01',2,0,0
union
select '2005-01-01',2,0,0
union
select '2006-01-01',2,0,0
) a
group by date_format(intime,'%Y-%m-%d')
ACMAIN_CHM 2009-01-14
  • 打赏
  • 举报
回复
按月如下,按周,按日,你可自行调整一下 DATE_FORMAT(intime,'%Y-%m'),详见MySQL参考手册。

select DATE_FORMAT(intime,'%Y-%m'),
1 as id1,ifnull(sum(IF(id=1,VAL1,0))/sum(IF(id=1,1,0)),0) as avg1,
2 as id2,ifnull(sum(IF(id=2,VAL2,0))/sum(IF(id=2,1,0)),0) as avg2
from calendar left join tmpt on calendar.cdate=tmpt.intime
group by DATE_FORMAT(intime,'%Y-%m')





DATE_FORMAT(date,format)

Formats the date value according to the format string.

The following specifiers may be used in the format string. The ‘%’ character is required before format specifier characters.

Specifier Description
%a Abbreviated weekday name (Sun..Sat)
%b Abbreviated month name (Jan..Dec)
%c Month, numeric (0..12)
%D Day of the month with English suffix (0th, 1st, 2nd, 3rd, …)
%d Day of the month, numeric (00..31)
%e Day of the month, numeric (0..31)
%f Microseconds (000000..999999)
%H Hour (00..23)
%h Hour (01..12)
%I Hour (01..12)
%i Minutes, numeric (00..59)
%j Day of year (001..366)
%k Hour (0..23)
%l Hour (1..12)
%M Month name (January..December)
%m Month, numeric (00..12)
%p AM or PM
%r Time, 12-hour (hh:mm:ss followed by AM or PM)
%S Seconds (00..59)
%s Seconds (00..59)
%T Time, 24-hour (hh:mm:ss)
%U Week (00..53), where Sunday is the first day of the week
%u Week (00..53), where Monday is the first day of the week
%V Week (01..53), where Sunday is the first day of the week; used with %X
%v Week (01..53), where Monday is the first day of the week; used with %x
%W Weekday name (Sunday..Saturday)
%w Day of the week (0=Sunday..6=Saturday)
%X Year for the week where Sunday is the first day of the week, numeric, four digits; used with %V
%x Year for the week, where Monday is the first day of the week, numeric, four digits; used with %v
%Y Year, numeric, four digits
%y Year, numeric (two digits)
%% A literal ‘%’ character
%x x, for any ‘x’ not listed above
WWWWA 2009-01-14
  • 打赏
  • 举报
回复
呵呵,要按年、ID分组
select date_format(intime,'%Y-%m-%d'),id,
COALESCE(
sum(if(val1>0,val1,0))/sum(if(val1>0,1,0)),0) as new
from (
select * from tmpt
union
select '2000-01-01',1,0,0
union
select '2001-01-01',1,0,0
union
select '2002-01-01',1,0,0
union
select '2003-01-01',1,0,0
union
select '2004-01-01',1,0,0
union
select '2005-01-01',1,0,0
union
select '2006-01-01',1,0,0) a
where id=1
group by date_format(intime,'%Y-%m-%d'),id
ACMAIN_CHM 2009-01-14
  • 打赏
  • 举报
回复
如果需要显示 2006
则你需要另有一张日历表

create table calendar(
cdate datetime ,
constraint pk_calendar primary key (cdate)
)

insert into calendar value ('2000-01-01');
insert into calendar value ('2000-01-02');
insert into calendar value ('2000-01-03');
..
insert into calendar value ('2007-12-31');



然后用left join 查询就行了。

mysql> select YEAR(intime),
-> 1 as id1,ifnull(sum(IF(id=1,VAL1,0))/sum(IF(id=1,1,0)),0) as avg1,
-> 2 as id2,ifnull(sum(IF(id=2,VAL2,0))/sum(IF(id=2,1,0)),0) as avg2
-> from calendar left join tmpt on calendar.cdate=tmpt.intime
-> group by YEAR(intime);
+--------------+-----+--------+-----+--------+
| YEAR(intime) | id1 | avg1 | id2 | avg2 |
+--------------+-----+--------+-----+--------+
| 2000 | 1 | 300.00 | 2 | 170.00 |
| 2000 | 1 | 300.00 | 2 | 170.00 |
| 2001 | 1 | 0.00 | 2 | 190.00 |
| 2003 | 1 | 120.00 | 2 | 0.00 |
| 2004 | 1 | 0.00 | 2 | 210.00 |
| 2005 | 1 | 140.00 | 2 | 0.00 |
| 2006 | 1 | 0.00 | 2 | 0.00 |
| 2007 | 1 | 0.00 | 2 | 0.00 |
+--------------+-----+--------+-----+--------+

1 row in set (0.00 sec)

mysql>
ACMAIN_CHM 2009-01-14
  • 打赏
  • 举报
回复
select YEAR(intime),
1 as id1,ifnull(sum(IF(id=1,VAL1,0))/sum(IF(id=1,1,0)),0) as avg1,
2 as id2,ifnull(sum(IF(id=2,VAL2,0))/sum(IF(id=2,1,0)),0) as avg2
from tmpt
group by YEAR(intime)
ACMAIN_CHM 2009-01-14
  • 打赏
  • 举报
回复
mysql> select * from tmpt;
+---------------------+------+------+------+
| intime | id | val1 | val2 |
+---------------------+------+------+------+
| 2000-01-01 00:00:00 | 1 | 100 | 110 |
| 2000-01-01 00:00:00 | 1 | 500 | 110 |
| 2003-01-01 00:00:00 | 1 | 120 | 130 |
| 2005-01-01 00:00:00 | 1 | 140 | 150 |
| 2000-01-01 00:00:00 | 2 | 160 | 170 |
| 2001-01-01 00:00:00 | 2 | 180 | 190 |
| 2004-01-01 00:00:00 | 2 | 200 | 210 |
+---------------------+------+------+------+
7 rows in set (0.00 sec)


mysql> select YEAR(intime),1 as id1,ifnull(sum(IF(id=1,VAL1,0))/sum(IF(id=1,1,0)),0) as avg1,
-> 2 as id2,ifnull(sum(IF(id=2,VAL2,0))/sum(IF(id=2,1,0)),0) as avg2
-> from tmpt
-> group by YEAR(intime);
+--------------+-----+--------+-----+--------+
| YEAR(intime) | id1 | avg1 | id2 | avg2 |
+--------------+-----+--------+-----+--------+
| 2000 | 1 | 300.00 | 2 | 170.00 |
| 2001 | 1 | 0.00 | 2 | 190.00 |
| 2003 | 1 | 120.00 | 2 | 0.00 |
| 2004 | 1 | 0.00 | 2 | 210.00 |
| 2005 | 1 | 140.00 | 2 | 0.00 |
+--------------+-----+--------+-----+--------+
5 rows in set (0.00 sec)

56,679

社区成员

发帖
与我相关
我的任务
社区描述
MySQL相关内容讨论专区
社区管理员
  • MySQL
加入社区
  • 近7日
  • 近30日
  • 至今
社区公告
暂无公告

试试用AI创作助手写篇文章吧