VIEW PUSHED PREDICATE - A Powerful Optimizer Transformation feature

With this post I am focusing on powerful Optimizer Transformation feature - VIEW PUSHED PREDICATE by sharing my experience of tuning a query of a real application. Let's see the query and explain plan first.

SELECT *
  FROM (SELECT source_id,
               source_cd,
               latitude,
               longitude,
               ROW_NUMBER ()
               OVER (PARTITION BY source_id, source_cd ORDER BY rnk)
                  AS row_num
          FROM (SELECT source_id,
                       source_cd,
                       latitude,
                       longitude,
                       1 rnk
                  FROM standard_geo tmp,
                       geodata_state_map a,
                       geo_source_city b,
                       (  SELECT state_province_abrv, country_iso, COUNT (1)
                            FROM geodata_state_map
                        GROUP BY state_province_abrv, country_iso
                          HAVING COUNT (1) = 1) c,
                       (  SELECT cc_iso,
                                 full_name_city,
                                 full_name,
                                 COUNT (1)
                            FROM geo_source_city
                        GROUP BY cc_iso, full_name_city, full_name
                          HAVING COUNT (1) = 1) d
                 WHERE     a.state_province_abrv = c.state_province_abrv
                       AND a.country_iso = c.country_iso
                       AND a.state_province_abrv = tmp.geo_state_name
                       AND a.country_iso = tmp.geo_country_iso
                       AND a.adm_module = b.adm_module
                       AND a.full_name = b.full_name
                       AND tmp.geo_city_name = b.full_name_city
                       AND tmp.geo_country_iso = b.cc_iso
                       AND b.cc_iso = d.cc_iso
                       AND b.full_name_city = d.full_name_city
                       AND b.full_name = d.full_name
                UNION
                SELECT source_id,
                       source_cd,
                       latitude,
                       longitude,
                       2 rnk
                  FROM standard_geo tmp,
                       geo_source_city b,
                       (  SELECT cc_iso, full_name_city, COUNT (1)
                            FROM geo_source_city
                        GROUP BY cc_iso, full_name_city
                          HAVING COUNT (1) = 1) d
                 WHERE     tmp.geo_city_name = b.full_name_city
                       AND tmp.geo_country_iso = b.cc_iso
                       AND b.cc_iso = d.cc_iso
                       AND b.full_name_city = d.full_name_city)) tab1
 WHERE tab1.row_num = 1;

Execution Plan
--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                           |     2 |   126 |       |  5164   (1)| 00:00:02 |
|*  1 |  VIEW                                |                           |     2 |   126 |       |  5164   (1)| 00:00:02 |
|*  2 |   WINDOW SORT PUSHED RANK            |                           |     2 |   106 |       |  5164   (1)| 00:00:02 |
|   3 |    VIEW                              |                           |     2 |   106 |       |  5163   (1)| 00:00:02 |
|   4 |     SORT UNIQUE                      |                           |     2 |  1485 |       |  5163   (1)| 00:00:02 |
|   5 |      UNION-ALL                       |                           |       |       |       |            |          |
|*  6 |       HASH JOIN                      |                           |     1 |  1317 |       |  5153   (1)| 00:00:02 |
|   7 |        NESTED LOOPS                  |                           |     1 |   855 |       |    10  (10)| 00:00:01 |
|   8 |         NESTED LOOPS                 |                           |     1 |   855 |       |    10  (10)| 00:00:01 |
|*  9 |          HASH JOIN                   |                           |     1 |   812 |       |     7  (15)| 00:00:01 |
|* 10 |           HASH JOIN                  |                           |     1 |   643 |       |     4   (0)| 00:00:01 |
|  11 |            TABLE ACCESS FULL         | standard_geo              |     1 |   155 |       |     2   (0)| 00:00:01 |
|  12 |            TABLE ACCESS FULL         | geodata_state_map         |   784 |   373K|       |     2   (0)| 00:00:01 |
|  13 |           VIEW                       |                           |   784 |   129K|       |     3  (34)| 00:00:01 |
|* 14 |            FILTER                    |                           |       |       |       |            |          |
|  15 |             HASH GROUP BY            |                           |   784 |   129K|       |     3  (34)| 00:00:01 |
|  16 |              TABLE ACCESS FULL       | geodata_state_map         |   784 |   129K|       |     2   (0)| 00:00:01 |
|* 17 |          INDEX RANGE SCAN            | geo_source_city_idx4      |     1 |       |       |     2   (0)| 00:00:01 |
|* 18 |         TABLE ACCESS BY INDEX ROWID  | geo_source_city           |     1 |    43 |       |     3   (0)| 00:00:01 |
|  19 |        VIEW                          |                           | 31962 |    14M|       |  5143   (1)| 00:00:02 |
|* 20 |         FILTER                       |                           |       |       |       |            |          |
|  21 |          HASH GROUP BY               |                           | 31962 |   811K|   111M|  5143   (1)| 00:00:02 |
|  22 |           TABLE ACCESS FULL          | geo_source_city           |  3238K|    80M|       |  1127   (0)| 00:00:01 |
|* 23 |       FILTER                         |                           |       |       |       |            |          |
|  24 |        HASH GROUP BY                 |                           |     1 |   168 |       |     9  (23)| 00:00:01 |
|  25 |         NESTED LOOPS                 |                           |     1 |   168 |       |     7   (0)| 00:00:01 |
|  26 |          NESTED LOOPS                |                           |     1 |   154 |       |     5   (0)| 00:00:01 |
|  27 |           TABLE ACCESS FULL          | standard_geo              |     1 |   115 |       |     2   (0)| 00:00:01 |
|  28 |           TABLE ACCESS BY INDEX ROWID| geo_source_city           |     1 |    39 |       |     3   (0)| 00:00:01 |
|* 29 |            INDEX RANGE SCAN          | geo_source_city_idx1      |     1 |       |       |     2   (0)| 00:00:01 |
|* 30 |          INDEX RANGE SCAN            | geo_source_city_idx1      |     1 |    14 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------------------


Looking at the the plan, we can simply identify "HASH GROUP BY" on "GEO_SOURCE_CITY" table is main culprit. "GEO_SOURCE_CITY" table was used twice in each query of UNION as.
SELECT ...
       ...
 FROM
       geo_source_city b,
       (  SELECT cc_iso, full_name_city, COUNT (1)
            FROM geo_source_city
        GROUP BY cc_iso, full_name_city
          HAVING COUNT (1) = 1) d
WHERE  b.cc_iso = d.cc_iso
       AND b.full_name_city = d.full_name_city

It was not difficult to rewrite the "group by and filter" on this table by using "analytic function", so that the work can be done by single instance of the table in each query as.
SELECT *
  FROM (SELECT cc_iso,
               full_name_city,
               full_name,
               adm_module,
               latitude,
               longitude,
               COUNT (1) OVER (PARTITION BY cc_iso, full_name, full_name_city) cnt
          FROM geo_source_city)
 WHERE cnt = 1


Lets make these changes in the query and check the execution plan.

SELECT source_id,
       source_cd,
       latitude,
       longitude,
       ROW_NUMBER () OVER (PARTITION BY source_id, source_cd ORDER BY rnk) AS row_num
  FROM (SELECT source_id,
               source_cd,
               latitude,
               longitude,
               1 rnk
          FROM standard_geo tmp,
               geodata_state_map a,
               (  SELECT state_province_abrv, country_iso, COUNT(*)
                    FROM geodata_state_map
                GROUP BY state_province_abrv, country_iso
                  HAVING COUNT (*) = 1) c,
               (SELECT *
                  FROM (SELECT cc_iso,
                               full_name_city,
                               full_name,
                               adm_module,
                               latitude,
                               longitude,
                               COUNT (1) OVER (PARTITION BY cc_iso, full_name, full_name_city) cnt
                          FROM geo_source_city)
                 WHERE cnt = 1) d
         WHERE     a.state_province_abrv = c.state_province_abrv
               AND a.country_iso = c.country_iso
               AND a.state_province_abrv = tmp.geo_state_name
               AND a.country_iso = tmp.geo_country_iso
               AND a.adm_module = d.adm_module
               AND a.full_name = d.full_name
               AND tmp.geo_city_name = d.full_name_city
             AND tmp.geo_country_iso = d.cc_iso
        UNION
        SELECT source_id,
               source_cd,
               latitude,
               longitude,
               2 rnk
          FROM standard_geo tmp,
               (SELECT *
                  FROM (SELECT cc_iso,
                               full_name_city,
                               latitude,
                               longitude,
                               COUNT (1) OVER (PARTITION BY cc_iso, full_name_city) cnt
                          FROM geo_source_city)
                 WHERE cnt = 1) d
         WHERE     tmp.geo_city_name = d.full_name_city
               AND tmp.geo_country_iso = d.cc_iso);

Execution Plan
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                           |     2 |   106 |    20  (20)| 00:00:01 |
|   1 |  WINDOW SORT                        |                           |     2 |   106 |    20  (20)| 00:00:01 |
|   2 |   VIEW                              |                           |     2 |   106 |    19  (16)| 00:00:01 |
|   3 |    SORT UNIQUE                      |                           |     2 |  1011 |    19  (16)| 00:00:01 |
|   4 |     UNION-ALL                       |                           |       |       |            |          |
|*  5 |      HASH JOIN                      |                           |     1 |   865 |    11  (10)| 00:00:01 |
|   6 |       NESTED LOOPS                  |                           |     1 |   696 |     8   (0)| 00:00:01 |
|*  7 |        HASH JOIN                    |                           |     1 |   643 |     4   (0)| 00:00:01 |
|   8 |         TABLE ACCESS FULL           | standard_geo              |     1 |   155 |     2   (0)| 00:00:01 |
|   9 |         TABLE ACCESS FULL           | geodata_state_map         |   784 |   373K|     2   (0)| 00:00:01 |
|* 10 |        VIEW PUSHED PREDICATE        |                           |     1 |    53 |     4   (0)| 00:00:01 |
|  11 |         WINDOW BUFFER               |                           |     1 |    43 |     4   (0)| 00:00:01 |
|  12 |          TABLE ACCESS BY INDEX ROWID| geo_source_city           |     1 |    43 |     4   (0)| 00:00:01 |
|* 13 |           INDEX RANGE SCAN          | geo_source_city_idx4      |     1 |       |     3   (0)| 00:00:01 |
|  14 |       VIEW                          |                           |   784 |   129K|     3  (34)| 00:00:01 |
|* 15 |        FILTER                       |                           |       |       |            |          |
|  16 |         HASH GROUP BY               |                           |   784 |   129K|     3  (34)| 00:00:01 |
|  17 |          TABLE ACCESS FULL          | geodata_state_map         |   784 |   129K|     2   (0)| 00:00:01 |
|  18 |      NESTED LOOPS                   |                           |     1 |   146 |     6   (0)| 00:00:01 |
|  19 |       TABLE ACCESS FULL             | standard_geo              |     1 |   103 |     2   (0)| 00:00:01 |
|* 20 |       VIEW PUSHED PREDICATE         |                           |     1 |    43 |     4   (0)| 00:00:01 |
|  21 |        WINDOW BUFFER                |                           |     1 |    27 |     4   (0)| 00:00:01 |
|  22 |         TABLE ACCESS BY INDEX ROWID | geo_source_city           |     1 |    27 |     4   (0)| 00:00:01 |
|* 23 |          INDEX RANGE SCAN           | geo_source_city_idx1      |     1 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------


Awesome!!! Cost of the query just reduced to 20 from 5164. I wasn't really hoping so much performance benefits. So what really happened? lets revisit the "Execution Plan". We can see that "VIEW PUSHED PREDICATE" is done by Oracle.

"VIEW PUSHED PREDICATE" is something magical which says that Oracle has pushed down the join conditions (predicates) from main query inside the view (temp query block) which enabled view to be correlated and evaluated for each row of the outer query block using a NESTED LOOP. By this transformation we achieved this tremendous performance gain.

Bottom Line: Simplify your queries so that Oracle can perform these kind of transformation and our application may achieve great performance.

I hope you have enjoyed and learnt some thing from this article. You can post your feedback in comment-box.

Read More on Tuning SQL
- SQL Tuning: Partition Pruning and HASH JOIN - Real Life Example
- Tune Complete Refresh of Materialized View by atomic_refresh=>false parameter of dbms_mview.refresh
- Why my distinct query is not using index?
- Avoiding unnecessary function calls to optimize SQL statements
- Why do I prefer COALESCE over NVL
- Query optimization tips for Oracle
Read More on Tuning PL/SQL
- Reduce database calls by posting Multiple Records
- Append String to CLOB in Optimized way
- PLSQL Tuning: Bulk Collect with Dynamic SQL
- PLSQL Tuning: Bind Variables and execute immediate
Read More on Analyzing SQL & PL/SQL
- How to Get Execution Plan and Statistics of SQL Query
- DBMS_PROFILER: Overview and How to Install
- TKPROF: Generating and analyzing trace file

3 comments:

  1. Nice Explanation Sir.
    One QQ, What version we were using for Current SQL??

    As pushing with in Group By View is allowed from 11g version, same does not work for 10g.
    did we trying pushing predicate through some hints or same was not applicable for current case due to some restriction???

    ReplyDelete
    Replies
    1. My version is Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production. There is an hint called PUSH_PRED for it. check https://docs.oracle.com/cd/B12037_01/server.101/b10752/hintsref.htm#6174

      Delete