There are times when we need to densify the sparse data i.e. fill the gaps in a table. For example we have a SALES table which has all the product and date wise records of sales, but it don't contain the the records that had no sales on a given date. Now the requirement is to have a date wise list of all products with total count of products and amount, if product was not sold on a date, it should come but total count & total amount of the product for that date should be zero.
Reference: Oracle Documentation
Lets first create data to have better understanding
As you can see in above output, we have date wise data for all the products, but some of the product and sale date combinations are missing as there was no records for them in sales tables. To overcome the problem of sparsity, you can use a partition outer join to fill the gaps for this dimension.
This looks like a cross join query, but it is much more efficient here because the SALES table is selected only once. The SDATE column is selected even when there is no matching row in SALES.
ANSI syntax for SQL in ORACLE is recommended by Oracle. It supports various new features like outer joining with more than one table and as we see Partitioned Outer Joins, but ANSI syntax also comes with several limitations, major one is that it cannot be used in fast refreshable materialized views.
Related Posts:
- FIRST_VALUE and LAST_VALUE with Windowing Clause
- Analytic Functions: Rank, Dense_Rank, Row_Number
- LAG and LEAD - Analytic Function
- Getting Cumulative Sum (Running Total) Using Analytical Functions
- New Features for Developers in Oracle 11g
- Oracle Database 12c New Features for Developers
Reference: Oracle Documentation
Lets first create data to have better understanding
create table product
(
id number(10),
name varchar2(100)
);
create table sales
(
pid number(5),
sdate date,
samount number(10,2)
);
insert into product values (1, 'Mouse');
insert into product values (2, 'Keyboard');
insert into product values (3, 'Monitor');
insert into product values (4, 'Speaker');
insert into product values (5, 'CPU');
insert into sales values(1, to_date('01-Jan-2014','dd-mon-yyyy'), 100);
insert into sales values(2, to_date('01-Jan-2014','dd-mon-yyyy'), 500);
insert into sales values(4, to_date('01-Jan-2014','dd-mon-yyyy'), 1200);
insert into sales values(5, to_date('01-Jan-2014','dd-mon-yyyy'), 10000);
insert into sales values(4, to_date('01-Jan-2014','dd-mon-yyyy'), 1200);
insert into sales values(1, to_date('02-Jan-2014','dd-mon-yyyy'), 100);
insert into sales values(3, to_date('02-Jan-2014','dd-mon-yyyy'), 8000);
insert into sales values(4, to_date('02-Jan-2014','dd-mon-yyyy'), 1200);
insert into sales values(5, to_date('02-Jan-2014','dd-mon-yyyy'), 10000);
insert into sales values(1, to_date('02-Jan-2014','dd-mon-yyyy'), 100);
insert into sales values(1, to_date('03-Jan-2014','dd-mon-yyyy'), 100);
insert into sales values(2, to_date('03-Jan-2014','dd-mon-yyyy'), 500);
insert into sales values(3, to_date('03-Jan-2014','dd-mon-yyyy'), 8000);
insert into sales values(3, to_date('03-Jan-2014','dd-mon-yyyy'), 8000);
insert into sales values(4, to_date('03-Jan-2014','dd-mon-yyyy'), 1200);
insert into sales values(3, to_date('04-Jan-2014','dd-mon-yyyy'), 8000);
insert into sales values(4, to_date('04-Jan-2014','dd-mon-yyyy'), 1200);
insert into sales values(1, to_date('04-Jan-2014','dd-mon-yyyy'), 100);
insert into sales values(2, to_date('04-Jan-2014','dd-mon-yyyy'), 500);
insert into sales values(3, to_date('04-Jan-2014','dd-mon-yyyy'), 8000);
insert into sales values(4, to_date('04-Jan-2014','dd-mon-yyyy'), 1200);
insert into sales values(2, to_date('05-Jan-2014','dd-mon-yyyy'), 500);
insert into sales values(3, to_date('05-Jan-2014','dd-mon-yyyy'), 8000);
insert into sales values(2, to_date('05-Jan-2014','dd-mon-yyyy'), 500);
insert into sales values(3, to_date('05-Jan-2014','dd-mon-yyyy'), 8000);
insert into sales values(4, to_date('05-Jan-2014','dd-mon-yyyy'), 1200);
select
name,
sdate,
sum(samount)
from
product
left outer join
sales
on (sales.pid = product.id)
group by
name,
sdate
order by sdate, name
NAME SDATE SUM(SAMOUNT)
------------------------------ --------- ------------
CPU 01-JAN-14 10000
Keyboard 01-JAN-14 500
Mouse 01-JAN-14 100
Speaker 01-JAN-14 2400
CPU 02-JAN-14 10000
Monitor 02-JAN-14 8000
Mouse 02-JAN-14 200
Speaker 02-JAN-14 1200
Keyboard 03-JAN-14 500
Monitor 03-JAN-14 16000
Mouse 03-JAN-14 100
Speaker 03-JAN-14 1200
Keyboard 04-JAN-14 500
Monitor 04-JAN-14 16000
Mouse 04-JAN-14 100
Speaker 04-JAN-14 2400
Keyboard 05-JAN-14 1000
Monitor 05-JAN-14 16000
Speaker 05-JAN-14 1200
As you can see in above output, we have date wise data for all the products, but some of the product and sale date combinations are missing as there was no records for them in sales tables. To overcome the problem of sparsity, you can use a partition outer join to fill the gaps for this dimension.
select
name,
sdate,
sum(nvl(samount,0))
from
product
left outer join
sales
PARTITION BY (SDATE)
on (sales.pid = product.id)
group by
name,
sdate
order by sdate, name
NAME SDATE SUM(NVL(SAMOUNT,0))
---------------------------- --------- -------------------
CPU 01-JAN-14 10000
Keyboard 01-JAN-14 500
Monitor 01-JAN-14 0
Mouse 01-JAN-14 100
Speaker 01-JAN-14 2400
CPU 02-JAN-14 10000
Keyboard 02-JAN-14 0
Monitor 02-JAN-14 8000
Mouse 02-JAN-14 200
Speaker 02-JAN-14 1200
CPU 03-JAN-14 0
Keyboard 03-JAN-14 500
Monitor 03-JAN-14 16000
Mouse 03-JAN-14 100
Speaker 03-JAN-14 1200
CPU 04-JAN-14 0
Keyboard 04-JAN-14 500
Monitor 04-JAN-14 16000
Mouse 04-JAN-14 100
Speaker 04-JAN-14 2400
CPU 05-JAN-14 0
Keyboard 05-JAN-14 1000
Monitor 05-JAN-14 16000
Mouse 05-JAN-14 0
Speaker 05-JAN-14 1200
This looks like a cross join query, but it is much more efficient here because the SALES table is selected only once. The SDATE column is selected even when there is no matching row in SALES.
ANSI syntax for SQL in ORACLE is recommended by Oracle. It supports various new features like outer joining with more than one table and as we see Partitioned Outer Joins, but ANSI syntax also comes with several limitations, major one is that it cannot be used in fast refreshable materialized views.
Related Posts:
- FIRST_VALUE and LAST_VALUE with Windowing Clause
- Analytic Functions: Rank, Dense_Rank, Row_Number
- LAG and LEAD - Analytic Function
- Getting Cumulative Sum (Running Total) Using Analytical Functions
- New Features for Developers in Oracle 11g
- Oracle Database 12c New Features for Developers
Good use of ansi sql
ReplyDeleteBeneficial tips & steps! I've gone through the functions and detailed process. Really its awesome & very beneficial for essential individuals!
ReplyDeleteThanks for sharing such a detailed and informative post! Oracle DBA Services
I've never seen this option used, thanks!
ReplyDeleteThis may be stating the obvious, but this would still not report anything if there were NO SALES at all for a given day. That's probably never going to happen in this sales example, but...
may be then you have to use a table with all the dates !!!
Deletecan u explain how this query is executing??
ReplyDelete