How to get high performance with a large transaction (PostgreSQL)

I have about 2 million rows of data to insert into PostgreSQL, but the insert performance is poor. Can I get better performance by splitting the large transaction into smaller ones (which I would rather not do), or is there another, wiser solution?

Mr rain asked Sep 01 '25 04:09

1 Answer

Possible methods to improve performance:

  1. Use the COPY command (see the SQL sketch after this list).
  2. Try to decrease the isolation level for the transaction if your data can deal with the consequences.
  3. Tweak the PostgreSQL server configuration. The default memory limits are very low and will cause disk thrashing even on a server with gigabytes of free memory (see the configuration sketch after this list).
  4. Turn off disk barriers (e.g. the nobarrier flag for the ext4 file system) and/or fsync on the PostgreSQL server. Warning: this is usually unsafe but will improve your performance a lot.
  5. Drop all the indexes on your table before inserting the data. Some indexes require quite a lot of work to keep up to date while rows are added. PostgreSQL may be able to create the indexes much faster at the end instead of continuously updating them in parallel with the insertion process. Unfortunately, there's no simple way to "save" the current indexes and apply only the missing index updates later.
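
For example, a bulk load combining points 1, 2 and 5 could look roughly like this. The table name, column names, index names and file path are placeholders for illustration; note that COPY ... FROM 'file' reads a file on the database server (psql's \copy reads a client-side file instead):

    -- Point 5: drop the indexes first, recreate them after the load.
    DROP INDEX IF EXISTS big_table_col1_idx;
    DROP INDEX IF EXISTS big_table_col2_idx;

    BEGIN;
    -- Point 2: READ COMMITTED is already PostgreSQL's default and lowest
    -- effective level, so this mainly matters if the level was raised elsewhere.
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

    -- Point 1: one COPY instead of ~2 million individual INSERT statements.
    COPY big_table (col1, col2) FROM '/tmp/data.csv' WITH (FORMAT csv);
    COMMIT;

    -- Rebuild the indexes in a single pass over the loaded table.
    CREATE INDEX big_table_col1_idx ON big_table (col1);
    CREATE INDEX big_table_col2_idx ON big_table (col2);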

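For points 3 and 4, the settings below are one possible starting point rather than tuned values for your hardware; the sizes are illustrative, ALTER SYSTEM needs superuser rights and PostgreSQL 9.4 or newer, and nobarrier is a mount option set in /etc/fstab, not a PostgreSQL setting:

    -- Point 3: raise the very conservative default memory limits.
    ALTER SYSTEM SET maintenance_work_mem = '1GB';  -- speeds up CREATE INDEX
    ALTER SYSTEM SET work_mem = '64MB';
    ALTER SYSTEM SET shared_buffers = '2GB';        -- takes effect after a restart
    ALTER SYSTEM SET checkpoint_completion_target = 0.9;

    -- Point 4: fast but unsafe; a crash during the load can corrupt the cluster.
    ALTER SYSTEM SET fsync = off;
    -- Safer middle ground: only the most recent commits can be lost on a crash.
    ALTER SYSTEM SET synchronous_commit = off;

    SELECT pg_reload_conf();  -- apply the settings that do not need a restart
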
Splitting the insert job into a series of smaller transactions will help only if you have to retry transactions because of data dependency issues with parallel transactions. If the transaction succeeds on the first try, splitting it into several smaller transactions run in sequence will only decrease your performance, as illustrated below.
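
The extra cost of splitting comes mostly from the additional COMMITs, each of which normally waits for a WAL flush to disk. A rough sketch of the trade-off, again with made-up object and file names:

    -- Split load: one COMMIT (and WAL flush) per chunk.
    BEGIN;
    COPY big_table FROM '/tmp/part_001.csv' WITH (FORMAT csv);
    COMMIT;
    BEGIN;
    COPY big_table FROM '/tmp/part_002.csv' WITH (FORMAT csv);
    COMMIT;
    -- ... one BEGIN/COPY/COMMIT per chunk

    -- Single transaction: the same work, but only one COMMIT at the end.
    -- Prefer this unless conflicts with concurrent transactions force retries.
    BEGIN;
    COPY big_table FROM '/tmp/data.csv' WITH (FORMAT csv);
    COMMIT;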

Mikko Rantalainen answered Sep 03 '25 01:09