Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to prevent duplicate data on copy csv postgresql

This is more of a conceptual question because I'm planning how best to achieve our goals here.

I have a postgresql/postgis table with 5 columns. I'll be inserting/appending data into the database from a csv file every 10 minutes or so via the copy command. There will likely be some duplicate rows of data, so I'd like to copy the data from the csv file to the postgresql table but prevent any duplicate entries from getting into the table from the csv file. There are three columns, where if they are all equal, that will mean the entry is a duplicate. They are "latitude", "longitude" and "time". Should I make a composite key from all three columns? If I do that, will it just throw an error upon trying to copy the csv file into the database? I'm going to be copying the csv file automatically so I would want it to go ahead and copy the rest of the file that aren't duplicates and not copy the duplicates. Is there a way to do this?

Also, I of course want it to look for duplicates in the most efficient way. I don't need to look through the whole table (which will be quite large) for duplicates...just the past 20 minutes or so via the timestamp on the row. And I've indexed the db with the time column.

Thanks for any help!

like image 361
user1610717 Avatar asked Jan 27 '26 12:01

user1610717


1 Answers

Upsert

The Answer by Linoff is correct but can simplified a bit by Postgres 9.5 new ”UPSERT“ feature (a.k.a. MERGE). That new feature is implemented in Postgres as INSERT ON CONFLICT syntax.

Rather than explicitly check for violation of the unique index, we can let the ON CONFLICT clause detect the violation. Then we DO NOTHING, meaning we abandon the effort to INSERT without bothering to attempt an UPDATE. So if we cannot insert, we just move on to next row.

We get the same results as Linoff’s code but lose the WHERE clause.

INSERT INTO bigtable(col1, … )
    SELECT col1, …
    FROM stagingtable st
ON CONFLICT idx_bigtable_col1_col2_col
DO NOTHING
;
like image 156
Basil Bourque Avatar answered Jan 30 '26 05:01

Basil Bourque



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!