Datatype is Important for Good Execution Plan and SQL Performance

In application programming, choosing the correct data-type is very important for the performance. It is also true with Oracle Database, a correct datatype for column in Oracle Database Table is very important for Good Execution Plan and SQL Performance. With this blog, I am trying to explain why it is important to store data in correct data type column on an Oracle Database Table.

Step 1: Creating a table, to store date values in VARCHAR2, NUMBER and DATE datatype.
SQL> CREATE TABLE MYTABLE
  2  (
  3  MYSTR VARCHAR2(100),
  4  MYNUM NUMBER,
  5  MYDT  DATE
  6  );
Table created.

SQL> INSERT INTO MYTABLE
  2  SELECT TO_CHAR(DT + (DBMS_RANDOM.value*3650),'YYYYMMDD') MYSTR,
  3         TO_NUMBER(TO_CHAR(DT+(DBMS_RANDOM.value*3650),'YYYYMMDD')) MYNUM,
  4     TRUNC(DT+(DBMS_RANDOM.value*3650)) MYDT
  5    FROM (SELECT TO_DATE('01-JAN-2010','DD-MM-YYYY') DT FROM DUAL) D,
  6         (SELECT ROWNUM FROM DUAL CONNECT BY LEVEL <= 999) A,
  7         (SELECT ROWNUM FROM DUAL CONNECT BY LEVEL <= 999) B
  8  ORDER BY DBMS_RANDOM.RANDOM;
998001 rows created.

SQL> commit;
Commit complete.


As you can see I have inserted 998,001 rows into MYTABLE. In MYSTR and MYNUM has date values in YYYYMMDD format.

Step 2: Create B-Tree indexes on all 3 columns and gather the stats of the table.
SQL> create index MYTABLE_MYSTR_idx on MYTABLE(MYSTR);
Index created.

SQL> create index MYTABLE_MYNUM_idx on MYTABLE(MYNUM);
Index created.

SQL> create index MYTABLE_MYDT_idx  on MYTABLE(MYDT);
Index created.

SQL> EXEC dbms_stats.gather_table_stats( NULL, 'MYTABLE');
PL/SQL procedure successfully completed.

Step 3: Now let's try to check the performance on each datatype for DATE type values by executing our SQL, which is only going to select 2 days of data.
SQL> SELECT * FROM MYTABLE
  2   WHERE MYSTR BETWEEN '20111231' AND '20120101';
520 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1015944200
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         | 97688 |  2194K|  1064   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| MYTABLE | 97688 |  2194K|  1064   (2)| 00:00:01 |
-----------------------------------------------------------------------------

SQL> SELECT * FROM MYTABLE
  2   WHERE MYNUM BETWEEN 20111231 AND 20120101;
556 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1015944200
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         | 97688 |  2194K|  1066   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| MYTABLE | 97688 |  2194K|  1066   (2)| 00:00:01 |
-----------------------------------------------------------------------------

SQL> SELECT * FROM MYTABLE
  2   WHERE MYDT BETWEEN TO_DATE('20111231','YYYYMMDD')
  3                  AND TO_DATE('20120101','YYYYMMDD');
569 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 684440235
--------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                  |   820 | 18860 |   798   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| MYTABLE          |   820 | 18860 |   798   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | MYTABLE_MYDT_IDX |   820 |       |     5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

As you can see, Oracle Database did Full Table Scan for NUMBER and VARCHAR2 columns, but for DATE datatype Oracle Database performed INDEX RANGE SCAN. For Date datatype, Oracle Database is aware that in range of 20111231 and 20120101 dates actually includes only 2 dates, so it was able to perform INDEX RANGE SCAN. In cases of other datatype, for example numbers, Oracle estimated near 9000 values between 20111231 and 20120101, so it went for FULL TABLE SCAN.

I hope you have enjoyed reading this article. You might like to look into following SQL Optimization Articles -
- Efficient way to UPDATE bulk of records in Oracle Database
- Steps to debug SQL Performance issue in Oracle Database
- Foreign Key in Oracle Data Warehouse - Best Practice
- Why Primary Key Foreign Key Relationship and Join Elimination
- SQL Tuning: Partition Pruning and HASH JOIN - Real Life Example
- VIEW PUSHED PREDICATE - A Powerful Optimizer Transformation feature
- Why my distinct query is not using index?
- Reduce database calls by posting Multiple Records from Application to Database
- Avoiding unnecessary function calls to optimize SQL statements
- How to Get Execution Plan and Statistics of SQL Query
- Basic Query optimization tips for Oracle

1 comment: