This is my table schema:
    Column    |          Type          |                       Modifiers
--------------+------------------------+-------------------------------------------------------
 id           | integer                | not null default nextval('message_id_seq'::regclass)
 date_created | bigint                 |
 content      | text                   |
 user_name    | character varying(128) |
 user_id      | character varying(128) |
 user_type    | character varying(8)   |
 user_ip      | character varying(128) |
 user_avatar  | character varying(128) |
 chatbox_id   | integer                | not null
Indexes:
    "message_pkey" PRIMARY KEY, btree (id)
    "idx_message_chatbox_id" btree (chatbox_id)
    "indx_date_created" btree (date_created)
Foreign-key constraints:
    "message_chatbox_id_fkey" FOREIGN KEY (chatbox_id) REFERENCES chatboxes(id) ON UPDATE CASCADE ON DELETE CASCADE
This is the query:
SELECT *
FROM message
WHERE chatbox_id=$1
ORDER BY date_created
OFFSET 0
LIMIT 20;
($1 will be replaced by the actual ID)
It runs pretty well, but once the table reaches 3.7 million records, all SELECT queries start consuming a lot of CPU and RAM, and then the whole system goes down. I have to temporarily back up all the current messages and truncate the table. I am not sure what is going on, because everything is fine with about 2 million records.
I am using PostgreSQL Server 9.1.5 with default options.
Update: the output of EXPLAIN ANALYZE
Limit (cost=0.00..6.50 rows=20 width=99) (actual time=0.107..0.295 rows=20 loops=1)
-> Index Scan Backward using indx_date_created on message (cost=0.00..3458.77 rows=10646 width=99) (actual time=0.105..0.287 rows=20 loops=1)
Filter: (chatbox_id = 25065)
Total runtime: 0.376 ms
(4 rows)
Update: server specification
Intel Xeon 5620 8x2.40GHz+HT
12GB DDR3 1333 ECC
SSD Intel X25-E Extreme 64GB
Final solution
Finally I can go above 3 million messages. I had to optimize the PostgreSQL configuration as wildplasser suggested and also create a new index as A.H. suggested.
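The exact settings aren't recorded in the post; as a rough sketch, the postgresql.conf parameters most commonly raised from the 9.1 defaults on a machine with 12GB of RAM look like this (illustrative values only, not the poster's actual configuration):
# postgresql.conf - illustrative values for ~12GB RAM, not the poster's actual settings
shared_buffers = 2GB           # 9.1 default is only a few tens of MB
work_mem = 32MB                # per-sort memory; the 1MB default pushes big sorts to disk
effective_cache_size = 8GB     # planner hint about the OS cache; default is 128MB
maintenance_work_mem = 512MB   # speeds up ANALYZE and index builds; default is 16MB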
You could try to give PostgreSQL a better index for that query. I propose something like this:
create index invent_suitable_name on message(chatbox_id, date_created);
or
create index invent_suitable_name on message(chatbox_id, date_created desc);
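Since the query orders by date_created ascending, the first form matches the sort order directly; the desc variant also works for this query because PostgreSQL can scan a btree index backward, as the plan above already does with indx_date_created.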
Try adding an index on (chatbox_id, date_created). For this particular query it will give you maximum performance.
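A minimal way to apply that suggestion and verify it helps (the index name here is made up, and 25065 is just the chatbox id from the plan above):
create index idx_message_chatbox_id_date_created
    on message (chatbox_id, date_created);

explain analyze
select * from message
where chatbox_id = 25065
order by date_created
offset 0
limit 20;
-- the plan should now show an Index Scan on the new index with
-- Index Cond: (chatbox_id = 25065) and no separate Filter or Sort step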
For the case when postgres "starts consuming a lot of CPU and RAM", try to get more details. It could be a bug (with the default configuration postgres normally doesn't consume much RAM).
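One way to gather those details on 9.1 (the BUFFERS option of EXPLAIN has been available since 9.0; substitute a chatbox id that is actually slow for 25065):
explain (analyze, buffers)
select * from message
where chatbox_id = 25065
order by date_created
offset 0
limit 20;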
UPD: My guess for the reason of the bad performance:
At some point the table became too big for a full scan to collect accurate statistics. After another ANALYZE, PostgreSQL got bad statistics for the table and, as a result, a bad plan consisting of:
1. an index scan on chatbox_id;
2. a sort of the returned records by date_created.
Because of the default configs and the large number of records returned in step 1, postgres was forced to do the sorting in files on disk. As a result - bad performance.
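If stale statistics really are the culprit, one possible check (a sketch, not something this answer prescribes) is to raise the statistics target for the filtered column and re-analyze, so the planner's row estimates stay accurate as the table grows:
alter table message alter column chatbox_id set statistics 1000;  -- default target is 100
analyze message;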
UPD2: EXPLAIN ANALYZE shows a runtime of 0.376 ms and a good plan. Can you give details about a case with bad performance?