We have a very large table with a total number of rows around ~4 billion
and everyday throughput is around ~20-25 million
.
Below is the Sample Table definition.
table_name (
id bigint,
user_id bigint,
status smallint,
date timestamp,
.
.
some more columns
);
NOTE: status is an enum and can have 6-7 possible values.
Below is the query that we want to optimise:
SELECT *
FROM table_name
WHERE user_id = $user_id
AND status in (1, 2, 3, 6, 7, 10)
ORDER BY date
DESC LIMIT 20;
Initially, we were having index1
on table_name (user_id, status)
.
Since this index was not giving optimal performance. We thought of including the date in the index as well.
Now, we have tried creating a bunch of different indexes on the table, but the explain plan always picks up the initial index i.e: index1.
Below are the indexes that we tried:
index2: table_name (user_id, status, date)
index3: table_name (user_id, status, date) where (status = ANY (ARRAY [1, 2, 3, 6, 7, 10]))
index4: table_name (user_id, status) where (status = ANY (ARRAY [1, 2, 3, 6, 7, 10]));
Below is the explain analyse output:
Limit (cost=27029.07..27029.12 rows=20 width=251) (actual time=32.466..32.470 rows=20 loops=1)
-> Sort (cost=27029.07..27049.17 rows=8041 width=251) (actual time=32.465..32.467 rows=20 loops=1)
Sort Key: id DESC
Sort Method: top-N heapsort Memory: 38kB
-> Index Scan using index1 on table_name wt (cost=0.58..26815.10 rows=8041 width=251) (actual time=0.027..26.991 rows=37362 loops=1)
Index Cond: ((user_id = 111111) AND (status = ANY ('{1,3,2,7,6,10,8}'::integer[])))
Planning Time: 0.320 ms
Execution Time: 32.498 ms
Our database postgres version is: 12.7
and we run vaccuming
regularly.
We want to understand why other indexes are not being used for our query.
Also given our use case, Can there be a better way of creating index so that we can serve the query in acceptable response time?
CREATE INDEX table_name_desc_index ON table_name (userid,_date DESC NULLS LAST);
Then try following:
SELECT *
FROM table_name
inner join (values (1),(2),(3),(6),(7),(10)) val(v) ON (table_name.status = v )
WHERE user_id = $user_id
ORDER BY date
DESC LIMIT 20;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With