We have a database that is currently running in AWS RDS on postgresql 9.5.4 and we're trying to upgrade it to run 9.6.6. We are experiencing strange performance degradation after the upgrade, even after (we think) successfully copying over all of the postgres settings into the RDS parameter group, and the below queries seem to be a "smoking gun", albeit one that we don't really understand.
On our 9.5.4 instance, the below queries all run fast (as you'd expect, given that the uuid and account_id columns are indexed):
production=> \timing
Timing is on.
production=> SELECT uuid FROM address WHERE account_id = 'Demo' LIMIT 1;
uuid
--------------------------------------
4c52c1fb-a344-4ea4-90f8-2f7f9b2cdce0
(1 row)
Time: 3.015 ms
production=> SELECT uuid FROM address WHERE uuid IN ('4c52c1fb-a344-4ea4-90f8-2f7f9b2cdce0');
uuid
--------------------------------------
4c52c1fb-a344-4ea4-90f8-2f7f9b2cdce0
(1 row)
Time: 0.886 ms
production=> SELECT uuid FROM address WHERE uuid IN (SELECT uuid FROM address WHERE account_id = 'Demo' LIMIT 1);
uuid
--------------------------------------
4c52c1fb-a344-4ea4-90f8-2f7f9b2cdce0
(1 row)
Time: 2.431 ms
Once we upgrade that DB to 9.6.6, the first two queries continue to be fast, but the last one becomes really slow:
production=> \timing
Timing is on.
production=> SELECT uuid FROM address WHERE account_id = 'Demo' LIMIT 1;
uuid
--------------------------------------
747b4b38-81f3-487e-8202-06c964e7e9f8
(1 row)
Time: 0.732 ms
production=> SELECT uuid FROM address WHERE uuid IN ('747b4b38-81f3-487e-8202-06c964e7e9f8');
uuid
--------------------------------------
747b4b38-81f3-487e-8202-06c964e7e9f8
(1 row)
Time: 0.715 ms
production=> SELECT uuid FROM address WHERE uuid IN (SELECT uuid FROM address WHERE account_id = 'Demo' LIMIT 1);
uuid
--------------------------------------
747b4b38-81f3-487e-8202-06c964e7e9f8
(1 row)
Time: 6676.759 ms
On the 9.6.6 box, the query planner doesn't hint at anything much (at least, that I can see):
production=> EXPLAIN SELECT uuid FROM address WHERE uuid IN (SELECT uuid FROM address WHERE account_id = 'Demo' LIMIT 1);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=5.23..13.27 rows=1 width=16)
-> HashAggregate (cost=4.67..4.68 rows=1 width=16)
Group Key: address_1.uuid
-> Limit (cost=0.56..4.66 rows=1 width=16)
-> Index Scan using address_account_id on address address_1 (cost=0.56..725.46 rows=177 width=16)
Index Cond: ((account_id)::text = 'Demo'::text)
-> Index Only Scan using address_pkey1 on address (cost=0.56..8.58 rows=1 width=16)
Index Cond: (uuid = address_1.uuid)
(8 rows)
Also, running standard pgbench tests on both boxes actually shows that the 9.6.6 box outperforms the 9.5.4 box in terms of transactions per second, so I don't think there is some weird hardware thing going on there.
Curious if anyone has any thoughts on where the strange drop in performance on that third query might be coming from?
Turned out this was because after the 9.5 -> 9.6 upgrade, you need to ANALYZE the entire DB to get the query planner humming again.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With