 

Oracle update production database

Scenario:

  • I have a huge .csv file (millions of lines).
  • With sqlldr (SQL*Loader) I have to create a temporary table with all the data from the CSV.
  • After this I have to do some processing on the temporary table (update some columns to uppercase, etc.).
  • After processing, I have to take every row from the temporary table, make some additional checks, and insert those rows into another table (which is heavily used in production).

How do you suggest I do all this processing so that I won't affect the overall performance of the production environment?

(NOTE: I am not supposed to pre-process the .csv beforehand.)

Any suggestion will be highly appreciated!

Asked by Andrei Ciobanu

2 Answers

I know you've said you want to use SQL*Loader, but you might want to look at using an external table instead, as it might make things easier. You could declare your external table as something like:

create table EXTERNAL_HR_DATA (
    EMPNO    NUMBER(4),
    ENAME    VARCHAR2(10),
    JOB      VARCHAR2(9),
    MGR      NUMBER(4),
    HIREDATE DATE,
    SAL      NUMBER(7,2),
    COMM     NUMBER(7,2),
    DEPTNO   NUMBER(2))
    Organization external
        (type oracle_loader
         default directory testdir
         access parameters (records delimited by newline
                            fields terminated by ',')
         location ('emp_ext.csv'))
    reject limit 1000;
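
The DDL above assumes a directory object called testdir already exists and points at the folder holding emp_ext.csv. If it doesn't, a rough sketch of creating it (run as a suitably privileged user; the path and grantee below are placeholders, not from the question) would be:

-- the path and the hr_load user are placeholders; adjust for your environment
create or replace directory testdir as '/data/loads';
grant read, write on directory testdir to hr_load;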

This would allow you to read (but not change) the data in your file using standard SELECT statements. You could then SELECT from the external table and INSERT the data into your 'temp' table directly, doing at least some of the editing during the INSERT:

INSERT INTO TEMP_HR_DATA
  SELECT EMPNO,
         UPPER(TRIM(ENAME)),
         UPPER(TRIM(JOB)),
         MGR,
         HIREDATE,
         SAL,
         COMM,
         DEPTNO
    FROM EXTERNAL_HR_DATA;
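
For the last step the question describes (extra checks, then an insert into the heavily used production table), one common pattern is a single set-based INSERT with the checks folded into the WHERE clause. A minimal sketch, assuming a production table called PROD_EMP and a duplicate check on EMPNO (both assumptions, not from the question):

-- hypothetical final step: PROD_EMP and the EMPNO existence check are assumptions
INSERT INTO PROD_EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
  SELECT t.EMPNO, t.ENAME, t.JOB, t.MGR, t.HIREDATE, t.SAL, t.COMM, t.DEPTNO
    FROM TEMP_HR_DATA t
   WHERE NOT EXISTS (SELECT 1
                       FROM PROD_EMP p
                      WHERE p.EMPNO = t.EMPNO);
COMMIT;

If the production table is busy, committing in smaller batches (for example a PL/SQL loop with BULK COLLECT ... LIMIT) is a common way to keep undo and lock pressure down.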

Share and enjoy.


Check to see if your database has enough disk space, and isn't too strained on its RAM/CPU.
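
If you want to eyeball free space from inside the database, DBA_FREE_SPACE gives a quick per-tablespace figure (it needs the relevant dictionary privileges); a minimal sketch:

-- rough free-space check; requires access to the DBA_FREE_SPACE view
SELECT tablespace_name,
       ROUND(SUM(bytes) / 1024 / 1024) AS free_mb
  FROM dba_free_space
 GROUP BY tablespace_name;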

If that's OK: just do it. A million lines isn't spectacular. Loading the file into a work table doesn't sound like something that would normally affect production performance. You could do the UPPER() in your sqlldr control file (which saves you an update on the work table); see the sketch below. Maybe there is more post-processing that can be done while loading?
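
As a rough illustration of doing the uppercasing during the load, SQL*Loader lets you apply SQL expressions per column in the control file. The file name, table name, column list and date mask below are assumptions (borrowed from the other answer), not something given in the question:

-- hypothetical control file; names and the date mask are assumptions
LOAD DATA
INFILE 'emp_ext.csv'
APPEND
INTO TABLE TEMP_HR_DATA
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(EMPNO,
 ENAME    "UPPER(TRIM(:ENAME))",
 JOB      "UPPER(TRIM(:JOB))",
 MGR,
 HIREDATE DATE "DD-MON-YYYY",
 SAL,
 COMM,
 DEPTNO)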

An external table (as suggested in the other answer) works fine as well, but imho has no advantage other than saving some disk space, while it adds some extra hassle to configure (create a directory object, grant access, transfer the file to the db server).

Answered by Martin Schapendonk


