如何不用hash比较实现数据仓库拉链算法

etsilence 2010-06-08 10:25:37
数据仓库中有种拉链算法来实现历史数据的存储,一般是通过hash比较来实现的,现在实际问题如下,请教各位,如果不用传统的拉链算法,还有什么别的办法实现这个功能。
member表,字段 member_id,member_status,start_date,end_date
A,enabled,2010-03-04,2010-03-05
A,diabled,2010-03-06,2010-03-08
A,tbd, 2010-03-09,2010-04-15
A,delete, 2010-04-16,2010-06-07
A,enabled,2010-06-08,3000-12-31
B,diabled,2010-03-02,2010-03-08
B,enabled,2010-03-09,2010-04-01
B,unknown,2010-06-08,2010-04-10
B,enabled,2010-04-11,2010-04-30
B,tbd, 2010-05-01,3000-12-31
product表,字段 product_id,member_id,product_status,start_date,end_date
001,A,online, 2010-03-04,2010-03-21
001,A,wait, 2010-03-22,2010-03-26
001,A,enabled,2010-03-26,2010-04-11
001,A,online, 2010-04-12,2010-06-05
001,A,online, 2010-06-06,3000-12-31
002,B,delete, 2010-03-02,2010-03-19
002,B,online, 2010-03-20,2010-04-12
002,B,delete, 2010-04-13,2010-06-03
002,B,enabled,2010-06-04,3000-12-31

意思就是每个会员每段时间的状态不一样,产品每段时间状态也不一样,现在通过member_id关联得到产品所属会员的状态信息,同样以时间段来区分product_status和member_status,就像两条时间轴交叉。结果要求如下:
product_id,product_status,member_id,member_status,start_date,end_date
001,online,A,enabled,2010-03-04,2010-03-05
001,online,A,diabled,2010-03-06,2010-03-08
001,online,A,tbd, 2010-03-09,2010-03-21
001,wait, A,tbd, 2010-03-22,2010-03-26
……

请问这个该如何实现?
建表脚本如下:
create table tab_member(member_id varchar2(20),member_status varchar2(20),start_date date,end_date date);
create table tab_product(product_id varchar2(20),member_id varchar2(20),product_status varchar2(20),start_date date,end_date date);

insert into tab_member values('A','enabled',to_date('2010-03-04','yyyy-mm-dd'),to_date('2010-03-05','yyyy-mm-dd'));
insert into tab_member values('A','diabled',to_date('2010-03-06','yyyy-mm-dd'),to_date('2010-03-08','yyyy-mm-dd'));
insert into tab_member values('A','tbd' ,to_date('2010-03-09','yyyy-mm-dd'),to_date('2010-04-15','yyyy-mm-dd'));
insert into tab_member values('A','delete' ,to_date('2010-04-16','yyyy-mm-dd'),to_date('2010-06-07','yyyy-mm-dd'));
insert into tab_member values('A','enabled',to_date('2010-06-08','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_member values('B','diabled',to_date('2010-03-02','yyyy-mm-dd'),to_date('2010-03-08','yyyy-mm-dd'));
insert into tab_member values('B','enabled',to_date('2010-03-09','yyyy-mm-dd'),to_date('2010-04-01','yyyy-mm-dd'));
insert into tab_member values('B','unknown',to_date('2010-06-08','yyyy-mm-dd'),to_date('2010-04-10','yyyy-mm-dd'));
insert into tab_member values('B','enabled',to_date('2010-04-11','yyyy-mm-dd'),to_date('2010-04-30','yyyy-mm-dd'));
insert into tab_member values('B','tbd' ,to_date('2010-05-01','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_product values('001','A','online' , to_date('2010-03-04','yyyy-mm-dd'),to_date('2010-03-21','yyyy-mm-dd'));
insert into tab_product values('001','A','wait' , to_date('2010-03-22','yyyy-mm-dd'),to_date('2010-03-26','yyyy-mm-dd'));
insert into tab_product values('001','A','enabled', to_date('2010-03-26','yyyy-mm-dd'),to_date('2010-04-11','yyyy-mm-dd'));
insert into tab_product values('001','A','online' , to_date('2010-04-12','yyyy-mm-dd'),to_date('2010-06-05','yyyy-mm-dd'));
insert into tab_product values('001','A','online' , to_date('2010-06-06','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_product values('002','B','delete' , to_date('2010-03-02','yyyy-mm-dd'),to_date('2010-03-19','yyyy-mm-dd'));
insert into tab_product values('002','B','online' , to_date('2010-03-20','yyyy-mm-dd'),to_date('2010-04-12','yyyy-mm-dd'));
insert into tab_product values('002','B','delete' , to_date('2010-04-13','yyyy-mm-dd'),to_date('2010-06-03','yyyy-mm-dd'));
insert into tab_product values('002','B','enabled', to_date('2010-06-04','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
commit;

...全文
231 3 打赏 收藏 转发到动态 举报
写回复
用AI写文章
3 条回复
切换为时间正序
请发表友善的回复…
发表回复
muyu7 2011-12-21
  • 打赏
  • 举报
回复
想看看答案
etsilence 2010-06-09
  • 打赏
  • 举报
回复
牛,LS的思想很简洁,结果应该也没问题,等我验证一下就OK。
mantisXF 2010-06-09
  • 打赏
  • 举报
回复
-- try it ..
SQL> select /*+ ordered use_hash(a b) */
2 b.product_id,
3 b.product_status,
4 b.member_id,
5 a.member_status,
6 greatest(a.start_date,b.start_date) as start_date,
7 least(a.end_date,b.end_date) as end_date
8 from tab_member a,
9 tab_product b
10 where a.member_id = b.member_id
11 and (a.start_date between b.start_date and b.end_date
12 or b.start_date between a.start_date and a.end_date
13 or a.end_date between b.start_date and b.end_date
14 or b.end_date between a.start_date and a.end_date)
15 order by 1,3,5,6
16 ;

PRODUCT_ID PRODUCT_STATUS MEMBER_ID MEMBER_STATUS START_DATE END_DATE
-------------------- -------------------- -------------------- -------------------- ----------- -----------
001 online A enabled 2010-3-4 2010-3-5
001 online A diabled 2010-3-6 2010-3-8
001 online A tbd 2010-3-9 2010-3-21
001 wait A tbd 2010-3-22 2010-3-26
001 enabled A tbd 2010-3-26 2010-4-11
001 online A tbd 2010-4-12 2010-4-15
001 online A delete 2010-4-16 2010-6-5
001 online A delete 2010-6-6 2010-6-7
001 online A enabled 2010-6-8 3000-12-31
002 delete B diabled 2010-3-2 2010-3-8
002 delete B enabled 2010-3-9 2010-3-19
002 online B enabled 2010-3-20 2010-4-1
002 online B enabled 2010-4-11 2010-4-12
002 delete B enabled 2010-4-13 2010-4-30
002 delete B tbd 2010-5-1 2010-6-3
002 enabled B tbd 2010-6-4 3000-12-31
002 enabled B unknown 2010-6-8 2010-4-10
002 online B unknown 2010-6-8 2010-4-10

18 rows selected

3,491

社区成员

发帖
与我相关
我的任务
社区描述
Oracle 高级技术相关讨论专区
社区管理员
  • 高级技术社区
加入社区
  • 近7日
  • 近30日
  • 至今
社区公告
暂无公告

试试用AI创作助手写篇文章吧