SQL Puzzle - Grouping Deals

Thanks for the fair amount of response on SQL Puzzle - Consecutive Wins, and many of my friends asked me to continue the series of Oracle SQL Puzzle. This problem statement I am using here is taken from the OTN. So Lets first create the table and data for our problem:

Create Table Statement:
create table deal_details
(
  deal_id               number,
  contract_month        char(7),
  price                 number,
  quantity              number,
  total_quantity        number 
);

Insert Statements:
insert into deal_details values(2,'Jan2013',4,10000,310000);
insert into deal_details values(2,'Feb2013',4,10000,280000);
insert into deal_details values(2,'Mar2013',4.5,10000,310000);
insert into deal_details values(2,'Apr2013',4,10000,310000);
insert into deal_details values(2,'May2013',4,15000,280000);
insert into deal_details values(2,'Jun2013',4,15000,310000);

Our Current Data:
   DEAL_ID CONTRAC      PRICE   QUANTITY TOTAL_QUANTITY
---------- ------- ---------- ---------- --------------
         2 Jan2013          4      10000         310000
         2 Feb2013          4      10000         280000
         2 Mar2013        4.5      10000         310000
         2 Apr2013          4      10000         310000
         2 May2013          4      15000         280000
         2 Jun2013          4      15000         310000


Problem Statement:
If the deal key, price and quantity combination is same for successive months then club those rows into single row and show start_date as minimum contract_month, End_Date as maximum contract_month of that group, and total quantity as the sum of total_quantity column as below:

Expected Output:
   DEAL_ID START_D END_DAT      PRICE   QUANTITY TOTAL_QUANTITY
---------- ------- ------- ---------- ---------- --------------
         2 Jan2013 Feb2013          4      10000         590000
         2 Mar2013 Mar2013        4.5      10000         310000
         2 Apr2013 Apr2013          4      10000         310000
         2 May2013 Jun2013          4      15000         590000


You can put your answers on the comment box. Enjoy solving it :)

To view the solution of this puzzleClick Here.

More SQL Puzzles:
- SQL Puzzle - Transpose Rows and Shift Values among columns
- Graph Shortest Path Solution by SQL
- Diamond Shaped Star Pattern by SQL
- Sorting Versions stored in Varchar2 Column
- SQL Puzzle - Consecutive Wins
- SQL Puzzle - Issue Tracker
- SQL Interview Question Answers

14 comments:

  1. select deal_id, price, quantity, sum(total_quantity) as total_quantity, min(contract_month) as start_date, max(contract_month) as end_date
    from deal_details
    group by deal_id, price, quantity;

    ReplyDelete
    Replies
    1. Please check the output of your query. if it was so simple I would not have given it as puzzle :)

      Delete
  2. Hello kindly have a look. I also posted before but it doesn't appear might be due to approval. I tried this in PostgreSql.

    select deal_id, max(contract_month) max, min(contract_month) min , price, quantity, sum(total_quantity) as sum_total_quantity
    from (
    select deal_id, contract_month, to_char( to_date(contract_month, 'MonYYYY'), 'Q') as quarter, price, quantity, total_quantity
    from deal_details
    ) as tbl
    group by deal_id, price, quantity, quarter
    order by quarter, max

    Thanks for the exercise.

    ReplyDelete
    Replies
    1. With this data Output is correct, but our logic is not dependent on Quarter but it is based on "If the deal key, price and quantity combination is same for successive months". Try changing 4.5 of Mar2013 to 4. Output should be of 2 records.
      1) Jan2013 to Apr2013
      2) May2013 to Jun2013

      Delete
  3. And yes comments goes for moderation, usually I approve at every night :)

    ReplyDelete
  4. Thanks for give this sql detail. This is wonderful data for for interview purpose and That should be added in the list of sql interview question.

    ReplyDelete
  5. Hi, Fun!
    =============================================================================
    SELECT deal_id
    ,TO_CHAR(MIN(con_mth),'monYYYY') start_d
    ,TO_CHAR(MAX(con_mth),'monYYYY') end_dat
    ,price
    ,quantity
    ,SUM(total_quantity) total_quantity
    FROM
    (SELECT
    CASE
    WHEN lag(TO_CHAR(deal_id,'999999')||TO_CHAR(price,'99999.9')||TO_CHAR(quantity,'999999'))
    OVER (ORDER BY con_mth) = TO_CHAR(deal_id,'999999')||TO_CHAR(price,'99999.9')||TO_CHAR(quantity,'999999')
    THEN lag(rn) OVER (ORDER BY con_mth)
    ELSE rn
    END club
    ,con_mth
    ,deal_id
    ,price
    ,quantity
    ,total_quantity
    FROM
    (SELECT ROWNUM rn
    ,to_date(contract_month,'monYYYY') con_mth
    ,deal_id
    ,price
    ,quantity
    ,total_quantity
    FROM deal_details
    )
    )
    GROUP BY club
    ,deal_id
    ,price
    ,quantity
    =============================================================================

    ReplyDelete
  6. hmm. just realized my solution will fail for more than 2 rows in a row. Scratch that, thanks.

    fb

    ReplyDelete
  7. I am sure there are better ways, but this works..
    WITH row_comp AS
    (SELECT deal_id,
    to_date(contract_month,'MonYYYY') AS contract_month ,
    to_date(lead(contract_month) over
    (partition BY deal_id order by to_date(contract_month,'MonYYYY')),'MonYYYY')AS end_month,
    price,
    quantity,
    ABS(quantity - lead(quantity) over
    (partition BY deal_id order by to_date(contract_month,'MonYYYY'))) AS qty_diff,
    ABS(price - lead(price) over
    (partition BY deal_id order by to_date(contract_month,'MonYYYY'))) AS price_diff,
    lead(total_quantity) over
    (partition BY deal_id order by to_date(contract_month, 'MonYYYY')) total_quantity
    FROM deal_details a
    UNION-----get the first record in with the first_month duplicated always zeros for the difference calcs.
    SELECT deal_id,
    to_date(contract_month,'MonYYYY'),
    to_date(contract_month,'MonYYYY'),
    price,
    quantity,
    0,0,
    total_quantity
    FROM deal_details b
    WHERE to_date(contract_month,'MonYYYY') =
    (SELECT MIN(to_date(contract_month,'MonYYYY'))
    FROM deal_details c
    WHERE b.deal_id = c.deal_id
    )

    ) ,
    row_work AS --this creates the groups
    (SELECT A.* ,
    SUM(
    CASE
    WHEN qty_diff = 0
    AND price_diff = 0
    THEN 0
    ELSE 1
    END) over (partition BY deal_id order by contract_month) rnk
    FROM row_comp a
    WHERE end_month IS NOT NULL --get rid of the last row with a null in lead(end date)
    )
    SELECT deal_id,
    MIN(
    CASE
    WHEN qty_diff = 0
    AND price_diff = 0
    THEN contract_month
    ELSE end_month
    END) AS start_dt,
    MAX(end_month)AS end_dt,
    SUM(total_quantity) --, rnk --we don't have to show this
    FROM row_work
    GROUP BY deal_id,
    rnk --group by the ranking we created in the row_work factored sub-query
    ORDER BY 2

    ReplyDelete
  8. I tried this and it worked

    Basically the logic is constructed as follows like this

    Construct a grouping as follows

    1. Define a Group with a row_number if and only if there is a difference in the price or quantity from the previous value for the same deal_id

    OR

    2. If the row is the first row in the table(I mean to say there is no row previous to the current row) then include it as a new Group

    After that the column max_grp would contain values such there would be a value for the first contract_month in the group and the rest would of the entries in the group would contain nulls for the column max_grp

    The next block y is used to fill up these nulls with the previous not null value in the same group using the
    max(max_grp) over(order by dt_frmt) logic

    After this the block z is constructed by extracting the max and min dates for each grp_col.

    The outermost block is to remove the duplicates from the repeating elements of the same group.

    select * from(
    select deal_id
    ,to_char(min(dt_frmt) over(partition by grp_col),'MonYYYY') as start_dt
    ,to_char(max(dt_frmt) over(partition by grp_col),'MonYYYY')as end_dt
    ,price
    ,quantity
    ,sum(total_quantity) over(partition by grp_col) as tot_qty
    ,dt_frmt
    ,row_number() over(partition by grp_col order by 1) rnk1
    from(
    select max(max_grp) over(order by dt_frmt) grp_col
    ,y.*
    from(
    select case when (price<>prev_price
    or quantity<>prev_qty or rnk=1)
    then row_number() over(order by to_date('01'||contract_month,'DDMonYYYY'))
    end as max_grp
    ,x.*
    from(
    select lag(contract_month) over(partition by deal_id order by to_date('01'||contract_month,'DDMonYYYY') asc) as prev_dt
    ,lag(price) over(partition by deal_id order by to_date('01'||contract_month,'DDMonYYYY') asc) as prev_price
    ,lag(quantity) over(partition by deal_id order by to_date('01'||contract_month,'DDMonYYYY') asc) as prev_qty
    ,to_date('01'||contract_month,'DDMonYYYY') dt_frmt
    ,rank() over(partition by deal_id order by to_date('01'||contract_month,'DDMonYYYY') asc) rnk
    ,dt.*
    from deal_details dt
    )x
    )y
    )z
    order by dt_frmt asc


    Thanks for the SQL puzzle. It was tough one for me. If you find any holes in logic let me know

    ReplyDelete


  9. select deal_id, start_month, end_month, price, quantity,
    total_quantity
    from (select ranks, deal_id,
    min(contract_month) as end_month,
    max(contract_month) as start_month,
    max(price) price, max(quantity) quantity,
    sum(total_quantity) total_quantity, diff
    from (select ranks, deal_id, contract_month, price,
    quantity, total_quantity, naya,
    to_date(prev),
    nvl(months_between(naya, prev), 1) as diff
    from (select ranks, deal_id, contract_month,
    price, quantity, total_quantity,
    to_date(contract_month, 'monyyyy') naya,
    lag(dates) over(partition by deal_id order by ranks) prev
    from (select deal_id, contract_month,
    price, quantity,
    total_quantity,
    to_date(contract_month, 'monyyyy') dates,
    dense_rank() over(partition by deal_id order by price, quantity) ranks
    from deal_details
    order by ranks,
    to_date(contract_month, 'monyyyy')))
    order by ranks)
    group by ranks, deal_id, diff);

    ReplyDelete
  10. WITH
    DEAL_DETAILS(DEAL_ID, CONTRACT_MONTH, PRICE, QUANTITY, TOTAL_QUANTITY)AS(
    SELECT 2,'Jan2013',4,10000,310000 FROM DUAL UNION ALL
    SELECT 2,'Feb2013',4,10000,280000 FROM DUAL UNION ALL
    SELECT 2,'Mar2013',4.5,10000,310000 FROM DUAL UNION ALL
    SELECT 2,'Apr2013',4,10000,310000 FROM DUAL UNION ALL
    SELECT 2,'May2013',4,15000,280000 FROM DUAL UNION ALL
    SELECT 2,'Jun2013',4,15000,310000 FROM DUAL),
    WORK_VIEW1(DEAL_ID, PRICE, QUANTITY, TOTAL_QUANTITY, CONTRACT_MONTH)AS(
    SELECT DEAL_ID, PRICE, QUANTITY, TOTAL_QUANTITY
    ,TO_DATE(CONTRACT_MONTH, 'MONYYYY', 'NLS_DATE_LANGUAGE=AMERICAN')
    FROM DEAL_DETAILS),
    WORK_VIEW2(DEAL_ID, CONTRACT_MONTH, PRICE, QUANTITY, TOTAL_QUANTITY, RN1, RN2)AS(
    SELECT DEAL_ID, CONTRACT_MONTH, PRICE, QUANTITY, TOTAL_QUANTITY
    ,ROW_NUMBER()OVER(PARTITION BY DEAL_ID ORDER BY CONTRACT_MONTH)
    ,ROW_NUMBER()OVER(PARTITION BY PRICE,QUANTITY ORDER BY CONTRACT_MONTH)
    FROM WORK_VIEW1)
    SELECT
    DEAL_ID
    ,INITCAP(TO_CHAR(MIN(CONTRACT_MONTH), 'MONYYYY', 'NLS_DATE_LANGUAGE=AMERICAN')) START_D
    ,INITCAP(TO_CHAR(MAX(CONTRACT_MONTH), 'MONYYYY', 'NLS_DATE_LANGUAGE=AMERICAN')) END_DAT
    ,MAX(PRICE) PRICE
    ,MAX(QUANTITY) QUANTITY
    ,SUM(TOTAL_QUANTITY) TOTAL_QUANTITY
    FROM WORK_VIEW2
    GROUP BY DEAL_ID, RN1 - RN2
    ORDER BY DEAL_ID, MIN(CONTRACT_MONTH)

    ReplyDelete
  11. Oracle 12c and upwards supports MATCH_RECOGNIZE a truly magical extension to sql for complex analytical calculations.


    Part 1) Solution to problem statement

    with t as
    (
    select 2 deal_id, 'Jan2013' contract_month, 4 price, 10000 quantity, 310000 total_quantity from dual
    union all
    select 2 deal_id, 'Feb2013' contract_month, 4 price, 10000 quantity, 280000 total_quantity from dual
    union all
    select 2 deal_id, 'Mar2013' contract_month, 4.5 price, 10000 quantity, 310000 total_quantity from dual
    union all
    select 2 deal_id, 'Apr2013' contract_month, 4 price, 10000 quantity, 310000 total_quantity from dual
    union all
    select 2 deal_id, 'May2013' contract_month, 4 price, 15000 quantity, 280000 total_quantity from dual
    union all
    select 2 deal_id, 'Jun2013' contract_month, 4 price, 15000 quantity, 310000 total_quantity from dual
    ),
    Q1 as (
    select
    deal_id, contract_month,
    to_number(to_char(to_date(contract_month, 'MonYYYY'), 'YYYY')) as YearId, to_number(to_char(to_date(contract_month, 'MonYYYY'), 'MM')) as MonthId,
    price, quantity, total_quantity
    from t
    )
    select
    MR1.deal_id,
    MR1.contract_month_strt as start_date,
    nvl(MR1.contract_month_end_mm, MR1.contract_month_strt) as end_date,
    MR1.price, MR1.quantity,
    MR1.total_quantity_strt + nvl(MR1.total_quantity_mm,0) as total_quantity
    from Q1
    MATCH_RECOGNIZE (
    partition by deal_id, price, quantity
    order by yearid, monthid
    measures
    strt.contract_month as contract_month_strt,
    strt.total_quantity as total_quantity_strt,
    last(mergemonth.contract_month) as contract_month_end_mm,
    sum(mergemonth.total_quantity) as total_quantity_mm
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (strt mergemonth*)
    DEFINE
    mergemonth as (yearid*12) + monthid = (prev(yearid)*12) + prev(monthid) + 1
    ) MR1
    order by to_date(MR1.contract_month_strt, 'MonYYYY')
    ;

    gives

    DEAL_ID START_D END_DAT PRICE QUANTITY TOTAL_QUANTITY
    ---------- ------- ------- ---------- ---------- --------------
    2 Jan2013 Feb2013 4 10000 590000
    2 Mar2013 Mar2013 4.5 10000 310000
    2 Apr2013 Apr2013 4 10000 310000
    2 May2013 Jun2013 4 15000 590000


    Part 2) Modified requirements ... try changing 4.5 of Mar2013 to 4.

    Same SQL, output becomes:

    DEAL_ID START_D END_DAT PRICE QUANTITY TOTAL_QUANTITY
    ---------- ------- ------- ---------- ---------- --------------
    2 Jan2013 Apr2013 4 10000 1210000
    2 May2013 Jun2013 4 15000 590000


    cheers
    Shankar

    ReplyDelete
    Replies
    1. Note: I broke out contractMonth into Year and Month components to cover for use cases where the contractMonth group of rows to be merged spans across years.. i.e from Oct2013 to May2014 (say).

      Delete