一个SQL的优化，谢谢！

manchun 2009-08-04 03:42:26

select t.wp_status, t.is_send, count(1) "assignCount"

     from (select t1.wp_status, t1.is_send

             from t_crm_wp t1

            where EXISTS (SELECT area_id

                     FROM t_area a

                    WHERE area_id = t1.area_id

                    START WITH a.area_id = 'GZ0000000000'

                   CONNECT BY PRIOR a.area_id = a.parent_area_id)

              AND NOT EXISTS

            (select 1

                     from t_crm_wp t

                    where t.wp_id = t1.wp_id

                      and (EXISTS

                           (select 1

                              from bst_sys_sysuser a

                             where (a.userid = t.user_id or

                                   a.userid = t.check_id)

                               and EXISTS

                             (select 1

                                      from t_area c

                                     where a.area_id = c.area_id

                                       and c.area_id <> 'GZ0000000000'

                                     start with c.area_id = 'GZ0000000000'

                                    connect by prior c.parent_area_id = c.area_id)))

                      and t.is_send = '1')) t

    group by rollup(t.wp_status, t.is_send)

这个sql在本地执行需要1.6秒，在生产环境上是6分钟，怎么优化啊，我用的是ibatis，用##带变量，所以算是绑定变量方式传入的。

本地数据10万左右吧，生产库上的有几百万吧！

...全文

264 16 打赏收藏转发到动态举报

写回复

用AI写文章

16 条回复

切换为时间正序

请发表友善的回复…

发表回复

非萨范德萨发 2009-08-12

打赏
举报

[Quote=引用 7 楼 oraclelogan 的回复:]
随便写了下，由于没有环境无法测试，楼主自己测试下吧!

SQL codeselect t.wp_status, t.is_send,count(1) "assignCount"from (select t1.wp_status, t1.is_sendfrom t_crm_wp t1whereEXISTS (SELECT area_idFROM t_area aWHERE area_id= t1.area_id
STARTWITH a.area_id='GZ0000000000'
CONNECTBY PRIOR a.area_id= a.parent_area_id)AND
(NOTEXISTS
(select1from bst_sys_sysuser awhere (a.userid= t1.user_idor
a.userid= t1.check_id)andEXISTS
(select1from t_area cwhere a.area_id= c.area_idand c.area_id<>'GZ0000000000'
startwith c.area_id='GZ0000000000'
connectby prior c.parent_area_id= c.area_id))or t1.is_send='1'
)/*(select 1
from t_crm_wp t
where t.wp_id = t1.wp_id
and (EXISTS
(select 1
from bst_sys_sysuser a
where (a.userid = t.user_id or
a.userid = t.check_id)
and EXISTS
(select 1
from t_area c
where a.area_id = c.area_id
and c.area_id <> 'GZ0000000000'
start with c.area_id = 'GZ0000000000'
connect by prior c.parent_area_id = c.area_id)))
and t.is_send = '1')*/
) tgroupby rollup(t.wp_status, t.is_send)
[/Quote]

下面这段条件写错了吧：

AND 

              (NOT EXISTS

                   (select 1

                              from bst_sys_sysuser a

                             where (a.userid = t1.user_id or

                                   a.userid = t1.check_id)

                               and EXISTS

                             (select 1

                                      from t_area c

                                     where a.area_id = c.area_id

                                       and c.area_id <> 'GZ0000000000'

                                     start with c.area_id = 'GZ0000000000'

                                    connect by prior c.parent_area_id = c.area_id))

                 or t1.is_send = '1'

               )

楼主原语句的本意是：（统计）查出t_crm_wp中area_id='GZ0000000000'及其下属组织的数据，并排除在t_crm_wp中is_send='1'的check_id或user_id对应bst_sys_sysuser中User_id的组织为area_id='GZ0000000000'上属组织的数据。
楼主测试下，你原来的SQL结果和，oraclelogan 的优化SQL,以及下面的SQL结果是否一致：）

SELECT t1.wp_status, t1.is_send

         FROM   t_crm_wp t1

         WHERE  EXISTS (SELECT area_id

                                     FROM   t_area a

                                     WHERE  area_id = t1.area_id

                                     START  WITH a.area_id = 'GZ0000000000'

                                     CONNECT BY PRIOR a.area_id = a.parent_area_id)

             AND    NOT EXISTS (SELECT 1

         									FROM   bst_sys_sysuser a

                                            WHERE  (a.userid = t.user_id OR a.userid = t.check_id)

                                            AND    EXISTS  (SELECT 1

                                                                      FROM   t_area c

                                                                      WHERE  a.area_id = c.area_id

                                                                      AND    c.area_id <> 'GZ0000000000'

                                                                      START  WITH c.area_id = 'GZ0000000000'

                                                                      CONNECT BY PRIOR c.parent_area_id = c.area_id)

                                            AND    t.is_send = '1')

vber1010 2009-08-12

打赏
举报

oracletoad 2009-08-12

打赏
举报

关注，学习，谢谢！

zzwei321 2009-08-07

打赏
举报

select t.wp_status, t.is_send, count(1) "assignCount"
from (select t1.wp_status, t1.is_send
from t_crm_wp t1
where EXISTS (SELECT area_id
FROM t_area a
WHERE area_id = t1.area_id
START WITH a.area_id = 'GZ0000000000'
CONNECT BY PRIOR a.area_id = a.parent_area_id)
AND
(NOT EXISTS
(select 1
from bst_sys_sysuser a
where (a.userid = t1.user_id or
a.userid = t1.check_id)
and EXISTS
(select 1
from t_area c
where a.area_id = c.area_id
and c.area_id <> 'GZ0000000000'
start with c.area_id = 'GZ0000000000'
connect by prior c.parent_area_id = c.area_id))
or t1.is_send = '1'
)
/*(select 1
from t_crm_wp t
where t.wp_id = t1.wp_id
and (EXISTS
(select 1
from bst_sys_sysuser a
where (a.userid = t.user_id or
a.userid = t.check_id)
and EXISTS
(select 1
from t_area c
where a.area_id = c.area_id
and c.area_id <> 'GZ0000000000'
start with c.area_id = 'GZ0000000000'
connect by prior c.parent_area_id = c.area_id)))
and t.is_send = '1')*/
) t
group by rollup(t.wp_status, t.is_send)

7樓的或者樓主能解釋下：爲什麽刪掉紅色的這句呢？

manchun 2009-08-06

打赏
举报

谢谢大家，我已经改定了，谢谢！

kiuyongfa218 2009-08-06

打赏
举报

SQL语句的优化，确实是一个问题，特别是当数据量非常大的时候，要注意表之间的关联关系，减少对表的扫描次数，减少表之间的相互关系。

manchun 2009-08-05

打赏
举报

manchun 2009-08-05

打赏
举报

[Quote=引用 4 楼 majy 的回复:]
你不把执行计划贴出来，没法搞啊，都不知道你的系统是怎么执行的，能调出啥东东来？？？

试试把这个语句改一下：
select t1.wp_status, t1.is_send
from t_crm_wp t1
where EXISTS (SELECT area_id
FROM t_area a
WHERE area_id = t1.area_id
START WITH a.area_id = 'GZ0000000000'
CONNECT BY PRIOR a.area_id = a.parent_area_id

－－－－－－
select t1.wp_status, t1.is_send
from t_crm_wp t1,
(SELECT area_id
FROM t_area a
START WITH a.area_id = 'GZ0000000000'
CONNECT BY PRIOR a.area_id = a.parent_area_id) t2
where t1.area_id = t2.area_id

下面的connect by语句也这么改

然后，在t_area表上加上字段 area_id的索引，我看，你用bitmap索引会非常好，你的area_id的distinct值会比较小，我想

做完索引，运行
dbms_stats.gather_table_stat(user,'T_AREA', method_opt=>'for all indexed columns', cascade=>true);
[/Quote]

我的area_id索引原来已经存在了！

我执行你的sql语句报如下错误啊：

SQL>  dbms_stats.gather_table_stat(user,'T_AREA', method_opt=>'for all indexed columns', cascade=>true);



dbms_stats.gather_table_stat(user,'T_AREA', method_opt=>'for all indexed columns', cascade=>true)



ORA-00900: invalid SQL statement

怎么搞？

oraclelogan 2009-08-05

打赏
举报

随便写了下，由于没有环境无法测试，楼主自己测试下吧!

select t.wp_status, t.is_send, count(1) "assignCount"

     from (select t1.wp_status, t1.is_send

             from t_crm_wp t1

            where EXISTS (SELECT area_id

                     FROM t_area a

                    WHERE area_id = t1.area_id

                    START WITH a.area_id = 'GZ0000000000'

                   CONNECT BY PRIOR a.area_id = a.parent_area_id)

              AND 

              (NOT EXISTS

                   (select 1

                              from bst_sys_sysuser a

                             where (a.userid = t1.user_id or

                                   a.userid = t1.check_id)

                               and EXISTS

                             (select 1

                                      from t_area c

                                     where a.area_id = c.area_id

                                       and c.area_id <> 'GZ0000000000'

                                     start with c.area_id = 'GZ0000000000'

                                    connect by prior c.parent_area_id = c.area_id))

                 or t1.is_send = '1'

               )

            /*(select 1

                     from t_crm_wp t

                    where t.wp_id = t1.wp_id

                      and (EXISTS

                           (select 1

                              from bst_sys_sysuser a

                             where (a.userid = t.user_id or

                                   a.userid = t.check_id)

                               and EXISTS

                             (select 1

                                      from t_area c

                                     where a.area_id = c.area_id

                                       and c.area_id <> 'GZ0000000000'

                                     start with c.area_id = 'GZ0000000000'

                                    connect by prior c.parent_area_id = c.area_id)))

                      and t.is_send = '1')*/

         ) t

    group by rollup(t.wp_status, t.is_send)

manchun 2009-08-05

打赏
举报

[Quote=引用 5 楼 oraclelogan 的回复:]
引用楼主 manchun 的回复:
SQL codeselect t.wp_status, t.is_send,count(1) "assignCount"from (select t1.wp_status, t1.is_sendfrom t_crm_wp t1whereEXISTS (SELECT area_idFROM t_area aWHERE area_id= t1.area_id
STARTWITH a.area_id='GZ0000000000'
CONNECTBY PRIOR a.area_id= a.parent_area_id)ANDNOTEXISTS
(select1from t_crm_wp twhere t.wp_id= t1.wp_idand (EXISTS
(select1from bst_sys_sysuser awhere (a.userid= t.user_idor
a.userid= t.check_id)andEXISTS
(select1from t_area cwhere a.area_id= c.area_idand c.area_id <>'GZ0000000000'
startwith c.area_id='GZ0000000000'
connectby prior c.parent_area_id= c.area_id)))and t.is_send='1')) tgroupby rollup(t.wp_status, t.is_send)

这个sql在本地执行需要1.6秒，在生产环境上是6分钟，怎么优化啊，我用的是ibatis，用##带变量，所以算是绑定变量方式传入的。

本地数据10万左右吧，生产库上的有几百万吧！

你的主表是t_crm_wp，你说在本地数据有10多万，生产环境有几百万吧！

但是你的no exists嵌套查询：
not exists(select 1
from t_crm_wp t
where t.wp_id = t1.wp_id

这样一搞，不就是几百万*几百万的开销吗？这样很消耗资源的，建议把no exists里面对t_crm_wp的条件语句移到外面来，就是不要t_crm_wp*t_crm_wp 的遍历，否则就麻烦了！

当然以上几位说的connect by 用着where之后的条件判断也是比较消耗资源的啊！
[/Quote]

哦，那我应该怎么改呢，这个sql不是我写的啊！写的那个人已经离职了！

oraclelogan 2009-08-05

打赏
举报

[Quote=引用楼主 manchun 的回复:]
SQL codeselect t.wp_status, t.is_send,count(1) "assignCount"from (select t1.wp_status, t1.is_sendfrom t_crm_wp t1whereEXISTS (SELECT area_idFROM t_area aWHERE area_id= t1.area_id
STARTWITH a.area_id='GZ0000000000'
CONNECTBY PRIOR a.area_id= a.parent_area_id)ANDNOTEXISTS
(select1from t_crm_wp twhere t.wp_id= t1.wp_idand (EXISTS
(select1from bst_sys_sysuser awhere (a.userid= t.user_idor
a.userid= t.check_id)andEXISTS
(select1from t_area cwhere a.area_id= c.area_idand c.area_id<>'GZ0000000000'
startwith c.area_id='GZ0000000000'
connectby prior c.parent_area_id= c.area_id)))and t.is_send='1')) tgroupby rollup(t.wp_status, t.is_send)

这个sql在本地执行需要1.6秒，在生产环境上是6分钟，怎么优化啊，我用的是ibatis，用##带变量，所以算是绑定变量方式传入的。

本地数据10万左右吧，生产库上的有几百万吧！
[/Quote]

你的主表是t_crm_wp，你说在本地数据有10多万，生产环境有几百万吧！

但是你的no exists嵌套查询：
not exists(select 1
from t_crm_wp t
where t.wp_id = t1.wp_id

这样一搞，不就是几百万*几百万的开销吗？这样很消耗资源的，建议把no exists里面对t_crm_wp的条件语句移到外面来，就是不要t_crm_wp*t_crm_wp 的遍历，否则就麻烦了！

当然以上几位说的connect by 用着where之后的条件判断也是比较消耗资源的啊！

sweetBug 2009-08-05

打赏
举报

学习 mark!

majy 2009-08-04

打赏
举报

你不把执行计划贴出来，没法搞啊，都不知道你的系统是怎么执行的，能调出啥东东来？？？

试试把这个语句改一下：
select t1.wp_status, t1.is_send
from t_crm_wp t1
where EXISTS (SELECT area_id
FROM t_area a
WHERE area_id = t1.area_id
START WITH a.area_id = 'GZ0000000000'
CONNECT BY PRIOR a.area_id = a.parent_area_id

－－－－－－
select t1.wp_status, t1.is_send
from t_crm_wp t1,
(SELECT area_id
FROM t_area a
START WITH a.area_id = 'GZ0000000000'
CONNECT BY PRIOR a.area_id = a.parent_area_id) t2
where t1.area_id = t2.area_id

下面的connect by语句也这么改

然后，在t_area表上加上字段 area_id的索引，我看，你用bitmap索引会非常好，你的area_id的distinct值会比较小，我想

做完索引，运行
dbms_stats.gather_table_stat(user,'T_AREA', method_opt=>'for all indexed columns', cascade=>true);

manchun 2009-08-04

打赏
举报

这个start with ... connect by ...不好拉出来，因为这是条件啊，条件就是需要满足在GZ000000节点下面的子节点就行，这样的话如果不用connect by遍历的话，如何搞定呢？

还有没有别的方法来优化做啊！

majy 2009-08-04

打赏
举报

1. 大数据量查询不要用绑定变量
2. 把你的生产环境的执行计划贴出来，这样是看不出什么来的
3. 不要管别他说什么 not exists效率低之类的说法 :-)

当然，你的语句里多少可以看出一点问题来：
你用了 start with...connect by，这个语句用在单表会运行的非常好，但是，你用在两表关联中就会有问题，所以，第一，你要做的就是，把start with...connect by 从内层关联中拉出来：

strart with ... connect by ...
(内层关联)

这样，你的问题基本上就解决了