PostgreSQL performance tuning with table partitions

Question

I am solving an performance issue on PostgreSQL 9.6 dbo based system. Intro:

12yo system, similar to banking system, with most queried primary table called transactions.

CREATE TABLE jrn.transactions (
     ID BIGSERIAL,
     type_id VARCHAR(200),
     account_id INT NOT NULL,
     date_issued DATE,
     date_accounted DATE,
     amount NUMERIC,
     ..
)

In the table transactions we store all transactions within a bank account. Field type_id determines the type of a transaction. Servers also as C# EntityFramework Discriminator column. Values are like:

card_payment, cash_withdrawl, cash_in, ...

14 types of transaction are known.

In generally, there are 4 types of queries (no. 3 and .4 are by far most frequent):

select single transaction like: SELECT * FROM jrn.transactions WHERE id = 3748734
select single transaction with JOIN to other transaction like: SELECT * FROM jrn.transactions AS m INNER JOIN jrn.transactions AS r ON m.refund_id = r.id WHERE m.id = 3748734
select 0-100, 100-200, .. transactions of given type like: SELECT * FROM jrn.transactions WHERE account_id = 43784 AND type_id = 'card_payment' LIMIT 100
several aggregate queries, like: SELECT SUM(amount), MIN(date_issued), MAX(date_issued) FROM jrn.transactions WHERE account_id = 3748734 AND date_issued >= '2017-01-01'

In last few month we had unexpected row count growth, now 120M.

We are thinking of table partitioning, following to PostgreSQL doc: https://www.postgresql.org/docs/10/static/ddl-partitioning.html

Options:

partition table by type_id into 14 partitions
add column year and partition table by year (or year_month) into 12 (or 144) partitions.

I am now restoring data into out test environment, I am going to test both options.

What do you consider the most appropriate partitioning rule for such situation? Any other options?

Thanks for any feedback / advice etc.

Laurenz Albe · Accepted Answer

Partitioning won't be very helpful with these queries, since they won't perform a sequential scan, unless you forgot an index.

The only good reason I see for partitioning would be if you want to delete old rows efficiently; then partitioning by date would be best.

Based on your queries, you should have these indexes (apart from the primary key index):

CREATE INDEX ON jrn.transactions (account_id, date_issued);
CREATE INDEX ON jrn.transactions (refund_id);

The following index might be a good idea if you can sacrifice some insert performance to make the third query as fast as possible (you might want to test):

CREATE INDEX ON jrn.transactions (account_id, type_id);

PostgreSQL performance tuning with table partitions

Tags:

postgresql

Luke1988

1 Answers

Laurenz Albe

Recent Activity

Donate For Us

PostgreSQL performance tuning with table partitions

Tags:

postgresql

Luke1988

1 Answers

Laurenz Albe

Related questions

Recent Activity

Donate For Us