Efficient ways to bulk UPDATE records in Oracle Database

I have received many requests to write about the most efficient way to UPDATE a large number of records in Oracle Database. In this blog I am comparing various ways to UPDATE one table (ROSTER) from another table (EMPLOYEE). Both tables have 999999 records, and there is a PRIMARY KEY on EMPLOYEE_CODE in both tables.
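For reference, here is a minimal sketch of the table structure the tests below assume; the post does not show the actual DDL, so the column data types are my assumption:

-- Assumed structure only; the actual data types used in the test are not shown in the post
create table employee (
    employee_code  number        primary key,
    job            varchar2(50),
    position       varchar2(50),
    organisation   varchar2(50)
);

create table roster (
    employee_code  number        primary key,
    job            varchar2(50),
    position       varchar2(50),
    organisation   varchar2(50)
);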

In this blog I am covering the following methods to UPDATE data:
1. Update each record individually and COMMIT in FOR LOOP
2. Update each record individually in FOR LOOP but COMMIT after the loop
3. BULK UPDATE using BULK COLLECT and FORALL
4. DIRECT UPDATE SQL
5. MERGE STATEMENT
6. UPDATE using INLINE View Method

Let's execute all of them and check the performance.

1. Update each record individually and COMMIT in FOR LOOP
nimish@garg> begin
  2  for c in (select * from employee)
  3  loop
  4    update roster
  5    set
  6     job = c.job,
  7     position = c.position,
  8     organisation = c.organisation
  9    where
 10      employee_code = c.employee_code;
 11    commit;
 12  end loop;
 13  end;
 14  /

PL/SQL procedure successfully completed.

Elapsed: 00:03:52.36

2. Update each record individually in FOR LOOP but COMMIT after the loop
nimish@garg> begin
  2  for c in (select * from employee)
  3  loop
  4    update roster
  5    set
  6     job = c.job,
  7     position = c.position,
  8     organisation = c.organisation
  9    where
 10      employee_code = c.employee_code;
 11  end loop;  
 12  commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:39.53

3. BULK UPDATE using BULK COLLECT and FORALL
nimish@garg> declare
  2      type employee_t is table of employee%rowtype index by pls_integer;
  3      l_employee_data   employee_t;
  4   begin
  5      select *
  6        bulk collect into l_employee_data
  7        from employee;
  8
  9      forall indx in 1 .. l_employee_data.count
 10         update roster r
 11            set r.job = l_employee_data(indx).job,
 12                r.position = l_employee_data(indx).position,
 13                r.organisation = l_employee_data(indx).organisation
 14          where r.employee_code = l_employee_data(indx).employee_code;
 15
 16      commit;
 17  end;
 18  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:43.53

4. DIRECT UPDATE SQL
nimish@garg> begin
  2  update roster r
  3     set (job, position, organisation) = (
  4          select job, position, organisation from employee e 
  5           where e.employee_code = r.employee_code
  6        )
  7  where r.employee_code in (select employee_code from employee);
  8  commit;
  9  end;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:38.22

5. MERGE STATEMENT
nimish@garg> begin
  2  merge into roster r
  3  using employee e
  4  on (r.employee_code = e.employee_code)
  5  when matched then
  6  update set
  7     r.job = e.job,
  8     r.position = e.position,
  9     r.organisation = e.organisation;
 10
 11  commit;
 12  end;
 13  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:26.23

6. UPDATE using INLINE View Method
nimish@garg> begin
  2  update
  3  (
  4    select r.employee_code, r.job, r.position, r.organisation,
  5           e.job jval, e.position pval, e.organisation oval
  6      from roster r, employee e
  7     where r.employee_code = e.employee_code
  8  )
  9  set job = jval,
 10      position = pval,
 11      organisation = oval;
 12
 13  commit;
 14  end;
 15  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:26.97

Key Takeaways:
A common misconception among many PL/SQL developers is that BULK COLLECT + FORALL is always better when we are doing a bulk operation. I always say BULK COLLECT + FORALL is better than doing the operation row by row in a loop, but if we can express the logic as a flat UPDATE SQL, a direct UPDATE SQL will almost always beat BULK COLLECT + FORALL. I also suggest wrapping the SQL code in a PL/SQL API, as sketched below.
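As an illustration (this is not from the original post), the MERGE from method 5 could be exposed through a small PL/SQL API along these lines; the procedure name is arbitrary:

-- Illustrative wrapper only; the procedure name is made up for this sketch
create or replace procedure sync_roster_from_employee
is
begin
    merge into roster r
    using employee e
    on (r.employee_code = e.employee_code)
    when matched then
        update set r.job          = e.job,
                   r.position     = e.position,
                   r.organisation = e.organisation;
    -- the commit is left to the caller
end sync_roster_from_employee;
/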

I have not covered BULK COLLECT + FORALL with a LIMIT clause here, as the extra context switches would make it slightly slower (very marginally, depending on the LIMIT size) than a plain BULK COLLECT + FORALL; a sketch of that variant is shown below for reference.
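A rough sketch of the LIMIT variant, assuming an arbitrary LIMIT of 10000, would look like this (not timed for this post):

-- Sketch only; the LIMIT value 10000 is an arbitrary choice
declare
    cursor emp_cur is
        select * from employee;
    type employee_t is table of employee%rowtype index by pls_integer;
    l_employee_data  employee_t;
begin
    open emp_cur;
    loop
        fetch emp_cur bulk collect into l_employee_data limit 10000;
        exit when l_employee_data.count = 0;

        forall indx in 1 .. l_employee_data.count
            update roster r
               set r.job          = l_employee_data(indx).job,
                   r.position     = l_employee_data(indx).position,
                   r.organisation = l_employee_data(indx).organisation
             where r.employee_code = l_employee_data(indx).employee_code;
    end loop;
    close emp_cur;
    commit;
end;
/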

As we can see in the examples above, the "MERGE STATEMENT" and the "UPDATE using INLINE View Method" are usually the fastest ways to bulk update data, followed by a direct UPDATE statement. I usually prefer the MERGE statement over the other approaches.

Related Posts -
- Avoiding unnecessary function calls to optimize SQL statements
- Bulk Collect with Limit Clause and %NOTFOUND
- PLSQL Tuning: Bulk Collect with Dynamic SQL
- Execute Immediate Returning Bulk Collect - Dynamic DML Collection
- ORA-01555: snapshot too old
- ORA-01779: cannot modify a column which maps to a non key-preserved table

Comments:

  1. Try it with a parallel pipelined table function and e.g. forall ...

  2. Nice post on update. I have one scenario with a soft PK: in a table of 50 columns, some rows use PK(col1,col4,col8) and others use PK(col1,col5,col7,col11). The PK is defined in a metadata table. When I do I/U/D, I first build an EXECUTE IMMEDIATE statement and then run it for the staged records. It takes a lot of time. Let me know if you have any suggestions.

    1. Create a calculated column in your table to drive the PK, maybe a concatenated string or something similar as per the business requirement; see the sketch below.

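      For illustration only (the table and column names are placeholders, and this assumes one fixed combination of character key columns), an Oracle virtual column could hold such a concatenated key and be indexed:

      -- Placeholder names; adjust the expression per the metadata-driven key definition
      alter table your_table
        add (soft_pk varchar2(200)
             generated always as (col1 || '|' || col4 || '|' || col8) virtual);

      create index your_table_soft_pk_idx on your_table (soft_pk);
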
  3. Very nice blog. I've also used parallel PL/SQL tables for very high volume bulk operations that run at consistent intervals (like data warehousing) throughout the day, and I get really good performance from them. One issue that I seem to run into with MERGE is inconsistency in its behavior. I tend to have to gather statistics just to keep the plan stable. I guess the trade-off is that if you want more inline-type solutions, make sure you don't have stale statistics if you are working with VLDB objects.

  4. In your 3rd case, when the employee code doesn't match, how will you update the other rows? My requirement is that it should first update the rows that match, and then also handle the rows that don't match. Any ideas?

  5. Excellent information and thank you for sharing it with all.

  6. I want to update 8.5 million records in one column and I'm using the FORALL approach, but even then it's taking hours. Can you provide a solution so that I can update the records in that table in less than 5 minutes?

    1. Did you try the merge statement?

  7. Thank you so much for this information. It really helped me.

  8. I have an UPDATE statement that touches 90 million records, and it has been running for the past 3 hours without completing.

    Kindly suggest a better approach to make this faster.

    UPDATE doc_migration rdm
       SET (rdm.documentum_latest_version_doc_id,
            rdm.document_version) =
           (SELECT r_object_id,
                   r_version_label
              FROM doc_latest_objectid dlo
             WHERE rdm.doc_id = dlo.i_chronicle_id)
     WHERE EXISTS (
           SELECT i_chronicle_id
             FROM doc_latest_objectid dl
            WHERE rdm.doc_id = dl.i_chronicle_id
           );

  9. Hello experts, I am trying to update one column in a huge fact table by getting data from the source. It is taking a huge amount of time because I don't have any business keys associated with it.

  10. DECLARE
          CURSOR costed_id_curs IS
              SELECT *
                FROM costed_event_src t1
               WHERE t1.event_date >= TO_DATE('01-JUL-2023', 'dd-mon-yyyy')
                 AND t1.event_date <  TO_DATE('01-AUG-2023', 'dd-mon-yyyy');

          v_counter INTEGER := 0;
      BEGIN
          FOR src IN costed_id_curs LOOP
              UPDATE costed_event_tgt t2
                 SET t2.fixed_charge = src.fixed_charge
               WHERE t2.account_num   = src.account_num
                 AND t2.event_seq     = src.event_seq
                 AND t2.event_source  = src.event_source
                 AND t2.event_type_id = src.event_type_id
                 AND t2.event_date    = src.event_date;

              v_counter := v_counter + 1;
              IF MOD(v_counter, 1000) = 0 THEN
                  COMMIT;
              END IF;
          END LOOP;
          COMMIT;
          dbms_output.put_line('total number of rows committed: ' || v_counter);
      END;



  11. Any help would be appreciated.
