First of all, I want to thanks for the awesome response on "SQL Puzzle - Jobs taking more than 10 individual minutes in Parallel Jobs Environment" puzzle. What amused me more, was many people not only want to know the solution, but they wanted the step by step detail on how I solved it. With this blog, I am trying to explain the solution in a detailed way -
But first let us revisit out problem statement -
Find out all the jobs which took more than 10 non-parallel minutes individually to execute.
Little Explanation:
Ideally every job should run in parallel i.e. at any point of time multiple jobs should be running. We need to find out the job(s) which took more than 10 minutes individually. In other-words, during the job execution there were 10 minutes, in which only one job was executing, and there was no other running job.
Following is the table structure for the data
For complete Problem statement click here.
Step by step Solution Walk-through:
When the above problem statement came to me and I saw the data, I knew it would be very difficult to find the solution with current data-set. So I decided to transpose the data and generate the dynamic rows for each second each job was taking. For example if JOB A took 2 minutes, I would generate 120 rows for this.
So first I generated the rows for every second for longest running job.
Now I did a CROSS JOIN between Dynamically Generated Rows and JOBS data set, and did some arithmetic for get time execution time for each second in very row as RUN_TIME. Like if a job started at 12:00:01 and ended at 12:02:00, rows would be generated for each second for that job (12:00:01, 12:00:02, 12:00:03 ... 12:02:00).
Using the above SQL, if JOB A started at 12:00:01 and ended at 12:00:10 and JOB B started at 12:00:05 and ended at 12:00:12 following would be the output.
I hope by now, my next step should be a easy guess for you. Now we just need to first find the UNIQUE Time (no overlapping/parallel) and then COUNT the rows JOB Wise to get for how many seconds each JOB took individually where no other job was running. From above example it is 4 seconds for JOB A
So here is the final SQL -
As the problem statement was in minutes, I divided COUNT(*) by 60 to convert seconds in minutes and then compared it with 10 Minutes.
I hope you have liked my approach for approaching the solution. There might be better ways to solve this, If you have any alternate approach please do share in the comment box. Have fun with SQL, Enjoy :)
More SQL Puzzles:
- Transpose Rows and Shift Values among columns
- Graph Shortest Path Solution by SQL
- Diamond Shaped Star Pattern by SQL
- Sorting Versions stored in Varchar2 Column
- SQL Puzzle - Grouping Deals
- SQL Puzzle - Consecutive Wins
- SQL Puzzle - Issue Tracker
- SQL Interview Question Answers
But first let us revisit out problem statement -
Find out all the jobs which took more than 10 non-parallel minutes individually to execute.
Little Explanation:
Ideally every job should run in parallel i.e. at any point of time multiple jobs should be running. We need to find out the job(s) which took more than 10 minutes individually. In other-words, during the job execution there were 10 minutes, in which only one job was executing, and there was no other running job.
Following is the table structure for the data
create table nimish_jobs
(
job_name varchar2(30),
start_time date,
end_time date
);
For complete Problem statement click here.
Step by step Solution Walk-through:
When the above problem statement came to me and I saw the data, I knew it would be very difficult to find the solution with current data-set. So I decided to transpose the data and generate the dynamic rows for each second each job was taking. For example if JOB A took 2 minutes, I would generate 120 rows for this.
So first I generated the rows for every second for longest running job.
SELECT LEVEL N
FROM DUAL
CONNECT BY LEVEL <=
(SELECT MAX ((END_TIME - START_TIME) * 24*60*60) N FROM NIMISH_JOBS)
Now I did a CROSS JOIN between Dynamically Generated Rows and JOBS data set, and did some arithmetic for get time execution time for each second in very row as RUN_TIME. Like if a job started at 12:00:01 and ended at 12:02:00, rows would be generated for each second for that job (12:00:01, 12:00:02, 12:00:03 ... 12:02:00).
WITH
T AS
(
SELECT LEVEL N
FROM DUAL
CONNECT BY LEVEL <=
(SELECT MAX ((END_TIME - START_TIME) * 24*60*60) N FROM NIMISH_JOBS)
)
SELECT *
FROM
(
SELECT JOB_NAME,
N,
(END_TIME - START_TIME) * 24*60*60 RUN_SECONDS,
START_TIME + (N/(24*60*60)) - (1/(24*60*60)) RUN_TIME
FROM NIMISH_JOBS, T
)
WHERE N <= RUN_SECONDS + 1
Using the above SQL, if JOB A started at 12:00:01 and ended at 12:00:10 and JOB B started at 12:00:05 and ended at 12:00:12 following would be the output.
JOB A - 12:00:01
JOB A - 12:00:02
JOB A - 12:00:03
JOB A - 12:00:04
JOB A - 12:00:05
JOB B - 12:00:05
JOB A - 12:00:06
JOB B - 12:00:06
JOB A - 12:00:07
JOB B - 12:00:07
JOB A - 12:00:08
JOB B - 12:00:08
JOB A - 12:00:09
JOB B - 12:00:09
JOB A - 12:00:10
JOB B - 12:00:10
JOB B - 12:00:11
JOB B - 12:00:12
I hope by now, my next step should be a easy guess for you. Now we just need to first find the UNIQUE Time (no overlapping/parallel) and then COUNT the rows JOB Wise to get for how many seconds each JOB took individually where no other job was running. From above example it is 4 seconds for JOB A
JOB A - 12:00:01
JOB A - 12:00:02
JOB A - 12:00:03
JOB A - 12:00:04
So here is the final SQL -
WITH
T AS
(
SELECT LEVEL N
FROM DUAL
CONNECT BY LEVEL <=
(SELECT MAX ((END_TIME - START_TIME) * 24*60*60) N FROM NIMISH_JOBS)
),
JOB_RUN AS
(
SELECT *
FROM
(
SELECT JOB_NAME,
N,
(END_TIME - START_TIME) * 24*60*60 RUN_SECONDS,
START_TIME + (N/(24*60*60)) - (1/(24*60*60)) RUN_TIME
FROM NIMISH_JOBS, T
)
WHERE N <= RUN_SECONDS + 1
)
SELECT JOB_NAME, COUNT (*) SECONDS_ALONE
FROM
(
SELECT JOB_NAME,
RUN_TIME,
COUNT (*) OVER (PARTITION BY RUN_TIME) RT
FROM JOB_RUN
)
WHERE RT = 1
GROUP BY JOB_NAME
HAVING COUNT(*)/60 >= 10;
As the problem statement was in minutes, I divided COUNT(*) by 60 to convert seconds in minutes and then compared it with 10 Minutes.
I hope you have liked my approach for approaching the solution. There might be better ways to solve this, If you have any alternate approach please do share in the comment box. Have fun with SQL, Enjoy :)
More SQL Puzzles:
- Transpose Rows and Shift Values among columns
- Graph Shortest Path Solution by SQL
- Diamond Shaped Star Pattern by SQL
- Sorting Versions stored in Varchar2 Column
- SQL Puzzle - Grouping Deals
- SQL Puzzle - Consecutive Wins
- SQL Puzzle - Issue Tracker
- SQL Interview Question Answers
/*
ReplyDeleteIs [the total amount of Job K] 601 seconds?
*/
with
rec(JOB_NAME,now_TIME,END_TIME)as(
select JOB_NAME,START_TIME, END_TIME from NIMISH_JOBS union all
select JOB_NAME,now_TIME + (1/24/60/60), END_TIME from rec where now_TIME + (1/24/60/60) <= END_TIME),
seq(JOB_NAME,now_TIME)as(
select max(JOB_NAME),NOW_TIME from rec group by NOW_TIME having count(*)=1
)
select JOB_NAME,min(now_TIME) START_TIME, max(now_TIME) END_TIME,count(*) -1 seconds
from(
select JOB_NAME,now_TIME,now_TIME-row_number()over(order by NOW_TIME)/24/60/60 willgrp from seq
)
group by JOB_NAME,willgrp
order by 1,2,3,4
Hello Sir,
ReplyDeleteI am getting an error while trying to execute the below select statement
SELECT MDSYS.SDO_GEOM.SDO_BUFFER(SHAPE,850,0.01),shape from GISROUTESLINE WHERE ROUTEID=1234;
“SQL Error: ORA-29532: Java call terminated by uncaught Java exception: java.lang.NullPointerException
ORA-06512: at "MDSYS.SDO_GEOM", line 1326
ORA-06512: at "MDSYS.SDO_GEOM", line 1380
29532. 00000 - "Java call terminated by uncaught Java exception: %s"
*Cause: A Java exception or error was signaled and could not be
resolved by the Java code.
*Action: Modify Java code, if this behavior is not intended.”
I have done some R&D and found that this error only occurs in Oracle 11.2.0.4, not any other patch of 11g like (11.2.0.1, 11.2.0.2, 11.2.0.3) and not even in 12.1.0.1. Please the share your view in this select statement.
Thanks
This comment has been removed by the author.
ReplyDelete