
How to increase query efficiency for a large volume of data in a PostgreSQL database?

I have a PostgreSQL database with 1.2 billion rows and an application that queries a million rows at a time, with an option to query larger intervals. At first, querying tables of one to ten million rows was fast;
now that I'm paging through the large table with an OFFSET, generating the ResultSet takes a long time.

   // ...
   // Note: the PostgreSQL JDBC driver honors setFetchSize only when
   // autocommit is disabled on the connection; otherwise the whole
   // result set is fetched at once.
   stmt.setFetchSize(100000);
   ResultSet rs = stmt.executeQuery(
       "SELECT mmsi, report_timestamp, position_geom, ST_X(position_geom) AS Long, "
     + "ST_Y(position_geom) AS Lat FROM reports4 WHERE position_geom IS NOT NULL "
     + "ORDER BY report_timestamp ASC LIMIT " + limit + " OFFSET " + set);

So the ORDER BY is probably killing my execution time, but having the rows ordered makes things easier later on. Is there a more efficient way to query the rows in intervals?

guy_sensei asked Nov 28 '25
1 Answer

For this query:

SELECT mmsi, report_timestamp, position_geom,
       ST_X(position_geom) AS Long, ST_Y(position_geom) AS Lat
FROM reports4
WHERE position_geom IS NOT NULL
ORDER BY report_timestamp ASC;

You should be able to use an index on an expression:

CREATE INDEX idx_reports4_position_ts ON reports4((position_geom IS NOT NULL), report_timestamp);

This index should be used directly for the query.
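The index helps, but paging with OFFSET still forces PostgreSQL to walk and discard all the skipped rows on every request. A common alternative is keyset (seek) pagination: remember the last `report_timestamp` of the previous page and ask for rows strictly after it, which lets the index jump straight to the start of the next page. Below is a minimal sketch in the question's own Java style; the `nextPageSql` helper and its parameter are illustrative, only the table and column names come from the question, and connection handling is omitted.

```java
// Keyset pagination sketch: replaces "LIMIT n OFFSET m" with a seek on
// the ordered column. Assumes the (position_geom IS NOT NULL,
// report_timestamp) index suggested above.
public class KeysetPagination {

    // Builds the query for one page. Bind the last report_timestamp of
    // the previous page to the single "?" placeholder; for the first
    // page, bind a timestamp earlier than any row in the table.
    static String nextPageSql(int pageSize) {
        return "SELECT mmsi, report_timestamp, position_geom, "
             + "ST_X(position_geom) AS Long, ST_Y(position_geom) AS Lat "
             + "FROM reports4 "
             + "WHERE position_geom IS NOT NULL AND report_timestamp > ? "
             + "ORDER BY report_timestamp ASC "
             + "LIMIT " + pageSize;
    }

    public static void main(String[] args) {
        System.out.println(nextPageSql(1000000));
    }
}
```

One caveat: if `report_timestamp` is not unique, rows sharing the boundary timestamp can be skipped between pages; appending a tie-breaker column such as `mmsi` to both the ORDER BY and the seek predicate (a "row value" comparison) makes the pagination exact.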

Gordon Linoff answered Dec 01 '25


