In past I posted some SQL Puzzle problems and their solutions. I got some suggestion that with solutions I should also describe the approach. So with this post I am trying to describe how I solved a problem.
Let's look at the problem first.
Imagine we have following table for tracking issues.
Data for our "issue_tracker" table is as following.
STATUS column may have 3 values: S (Start), U (Updates) and E (End). One thing to notice here is that issue 1002 was closed and was reopened again later.
So now here is the problem statement. You need to Find the START_DATE and END_DATE of each ISSUE_ID with number of updates happened. If an issue is reopened, it should have multiple records accordingly. Following is the sample output desired as per above data.
SOLUTION: To resolve this problem, we need to somehow divide rows in multiple groups so that each row of same issue starting from status 'S' to 'E' falls in same group. So I valued STATUS='S' as 1 (one) and used running total logic for this. Let's check first how I grouped each row for each issue.
Looks good!!! Now as you can see that we can easily divide above to 5 groups (using ISSUE_ID with RUNNING_TOTAL). Now we just need to find START_DATE, END_DATE and COUNT of UPDATE on each group. Complete solution can be done by following query.
Quite simple. I hope you have enjoyed this approach on this problem which might help you in similar problems. If you have any other approach or any comments please do post in comment area.
More SQL Puzzles:
- SQL Puzzle - Transpose Rows and Shift Values among columns
- Graph Shortest Path Solution by SQL
- Diamond Shaped Star Pattern by SQL
- Sorting Versions stored in Varchar2 Column
- SQL Puzzle - Grouping Deals
- SQL Puzzle - Consecutive Wins
- SQL Interview Question Answers
Let's look at the problem first.
Imagine we have following table for tracking issues.
create table issue_tracker
(
id number(10) primary key,
issue_id number(10),
assigned_to number(10),
status char(1),
issue_date date
);
Data for our "issue_tracker" table is as following.
insert into issue_tracker values(1, 1001, 1234, 'S', to_date('01-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(3, 1001, 1234, 'U', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(4, 1001, 1234, 'U', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(8, 1001, 1234, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(12, 1001, 1234, 'E', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(2, 1002, 2345, 'S', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(5, 1002, 2345, 'U', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(7, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(13, 1002, 2345, 'E', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(6, 1003, 1234, 'S', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(14, 1003, 1234, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(15, 1003, 1234, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(20, 1003, 1234, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(23, 1003, 1234, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(27, 1003, 1234, 'E', to_date('05-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(16, 1002, 2345, 'S', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(17, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(18, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(19, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(24, 1002, 2345, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(25, 1002, 2345, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(26, 1002, 2345, 'E', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(9, 1004, 1544, 'S', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(10, 1004, 1544, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(11, 1004, 1544, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(21, 1004, 1544, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(22, 1004, 1544, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
STATUS column may have 3 values: S (Start), U (Updates) and E (End). One thing to notice here is that issue 1002 was closed and was reopened again later.
So now here is the problem statement. You need to Find the START_DATE and END_DATE of each ISSUE_ID with number of updates happened. If an issue is reopened, it should have multiple records accordingly. Following is the sample output desired as per above data.
ISSUE_ID ASSIGNED_TO START_DATE UPDATES END_DATE
---------- ----------- ---------- ---------- ---------
1001 1234 01-JAN-15 3 03-JAN-15
1002 2345 02-JAN-15 2 03-JAN-15
1003 1234 02-JAN-15 4 05-JAN-15
1002 2345 03-JAN-15 5 04-JAN-15
1004 1544 03-JAN-15 4
SOLUTION: To resolve this problem, we need to somehow divide rows in multiple groups so that each row of same issue starting from status 'S' to 'E' falls in same group. So I valued STATUS='S' as 1 (one) and used running total logic for this. Let's check first how I grouped each row for each issue.
SELECT
issue_id,
assigned_to,
status,
issue_date,
case when status = 'S' then 1 else 0 end start_issue,
sum(case when status = 'S' then 1 else 0 end) over (partition by issue_id order by id) running_total
FROM issue_tracker
order by issue_id, id
ISSUE_ID ASSIGNED_TO STATUS ISSUE_DATE START_ISSUE RUNNING_TOTAL
---------- ----------- ------ ---------- ----------- -------------
1001 1234 S 01-JAN-15 1 1
1001 1234 U 02-JAN-15 0 1
1001 1234 U 02-JAN-15 0 1
1001 1234 U 03-JAN-15 0 1
1001 1234 E 03-JAN-15 0 1
1002 2345 S 02-JAN-15 1 1
1002 2345 U 02-JAN-15 0 1
1002 2345 U 03-JAN-15 0 1
1002 2345 E 03-JAN-15 0 1
1002 2345 S 03-JAN-15 1 2
1002 2345 U 03-JAN-15 0 2
1002 2345 U 03-JAN-15 0 2
1002 2345 U 03-JAN-15 0 2
1002 2345 U 04-JAN-15 0 2
1002 2345 U 04-JAN-15 0 2
1002 2345 E 04-JAN-15 0 2
1003 1234 S 02-JAN-15 1 1
1003 1234 U 03-JAN-15 0 1
1003 1234 U 03-JAN-15 0 1
1003 1234 U 04-JAN-15 0 1
1003 1234 U 04-JAN-15 0 1
1003 1234 E 05-JAN-15 0 1
1004 1544 S 03-JAN-15 1 1
1004 1544 U 03-JAN-15 0 1
1004 1544 U 03-JAN-15 0 1
1004 1544 U 04-JAN-15 0 1
1004 1544 U 04-JAN-15 0 1
Looks good!!! Now as you can see that we can easily divide above to 5 groups (using ISSUE_ID with RUNNING_TOTAL). Now we just need to find START_DATE, END_DATE and COUNT of UPDATE on each group. Complete solution can be done by following query.
SELECT issue_id,
assigned_to,
MIN (DECODE (status, 'S', issue_date)) start_date,
COUNT (DECODE (status, 'U', issue_date)) UPDATES,
MIN (DECODE (status, 'E', issue_date)) end_date
FROM (SELECT issue_id,
assigned_to,
status,
issue_date,
sum(case when status = 'S' then 1 else 0 end) over (partition by issue_id order by id) grp
FROM issue_tracker)
GROUP BY issue_id, assigned_to, grp
ORDER BY start_date, issue_id;
ISSUE_ID ASSIGNED_TO START_DATE UPDATES END_DATE
---------- ----------- ---------- ---------- ---------
1001 1234 01-JAN-15 3 03-JAN-15
1002 2345 02-JAN-15 2 03-JAN-15
1003 1234 02-JAN-15 4 05-JAN-15
1002 2345 03-JAN-15 5 04-JAN-15
1004 1544 03-JAN-15 4
5 rows selected.
Quite simple. I hope you have enjoyed this approach on this problem which might help you in similar problems. If you have any other approach or any comments please do post in comment area.
More SQL Puzzles:
- SQL Puzzle - Transpose Rows and Shift Values among columns
- Graph Shortest Path Solution by SQL
- Diamond Shaped Star Pattern by SQL
- Sorting Versions stored in Varchar2 Column
- SQL Puzzle - Grouping Deals
- SQL Puzzle - Consecutive Wins
- SQL Interview Question Answers
Nice solution, thanks for sharing
ReplyDeleteUsed properly, analytics functions rocks !
ReplyDeleteThe interesting case than we don't have this status mark (S/U/E). Will you interested in solving this case?
ReplyDeleteTabibitosan method could be useful in your case
Deletehttps://community.oracle.com/thread/1007478?tstart=0
Good job!
ReplyDeleteA note:
Probably I would always used a CASE statement instead a DECODE function: ;-)
SELECT issue_id
, assigned_to
, MIN(CASE WHEN status = 'S' THEN issue_date END) start_date
, COUNT(CASE WHEN status = 'U' THEN issue_date END) updates
, MIN(CASE WHEN status = 'E' THEN issue_date END) end_date
FROM (SELECT issue_id
, assigned_to
, status
, issue_date
, CASE WHEN status = 'S' THEN 1 ELSE 0 END start_issue
, SUM(CASE WHEN status = 'S' THEN 1 ELSE 0 END) OVER (PARTITION BY issue_id ORDER BY id) grp
FROM issue_tracker
)
GROUP BY issue_id, assigned_to, grp
ORDER BY start_date, issue_id;
Great things ....
ReplyDeleteSuper Sir, Thanks for your Post.. Great Work.
ReplyDeleteHi Nimish,
ReplyDeletethanks for the solution..
Below solution is quite long one -
select
issue_id
,assigned_to
,start_date
,updates
,end_date
from (
select
z.*,
min(issue_date) over (partition by frther_grp) START_DATE,
max(case when status != 'E' then null else issue_date end ) over (partition by frther_grp) END_DATE,
count(case when status = 'U' then 1 end) over
(partition by frther_grp ) UPDATES,
row_number() over (partition by frther_grp order by issue_id, id, issue_date, status) rwnm
from
(SELECT y.* ,
max (grp) over (order by issue_id, id, issue_date, status) frther_grp
FROM (SELECT x.*
,CASE
WHEN status = 'S' THEN
row_number() over (ORDER BY 2, 1, 5, 4)
END grp
FROM (SELECT id
,issue_id
,assigned_to
,status
,issue_date
FROM issue_tracker
-- WHERE issue_id in (1004, 1001)
ORDER BY 2
,1
,5
,4) x) y)z
)
where rwnm =1
order by start_date
I think there is one another method by which it can be solved i.e. using the match_pattern introduced in 12c
Regards,
Amarjot
Id ,Max(rank() over (partition by I'd order by start_date )) ,min(start_date) group by I'd .can we do like that ?
ReplyDeleteIt is almost the same as the model answer
ReplyDeleteI will contribute as a result of thinking hard.
I did not imitate it.
m(_ _)m
SELECT ISSUE_ID,ASSIGNED_TO
,MIN(CASE WHEN STATUS = 'S' THEN ISSUE_DATE END) START_DATE
,SUM(CASE WHEN STATUS = 'U' THEN 1 ELSE 0 END) UPDATES
,MAX(CASE WHEN STATUS = 'E' THEN ISSUE_DATE END) END_DATE
FROM (
SELECT ID,ISSUE_ID,ASSIGNED_TO,STATUS,ISSUE_DATE
,SUM(CASE WHEN STATUS = 'S' THEN 1 ELSE 0 END)OVER(ORDER BY ISSUE_ID,ASSIGNED_TO,ID) WILLGRP
FROM ISSUE_TRACKER)
GROUP BY ISSUE_ID,ASSIGNED_TO,WILLGRP
ORDER BY MIN(ISSUE_DATE),ISSUE_ID