Let's say you have any insert query as below -
insert into abc select * from def dd where not exists (select 1 from abc aa where aa.id = dd.id);
Vs
insert into abc select * from def dd on conflict(id) do nothing;
which one is faster?
Both tables are really large and the idea is to not insert if there is a conflict with the primary key
Which is faster is not really relevant. Actually, both should be very comparable, because both will be looking up the value using a unique index.
More important is that NOT EXISTS is not semantically correct.  Race conditions could cause two queries to both fail the NOT EXISTS . . . and then both would try to insert the same value.
So, for safety, I strongly advise using the ON CONFLICT formulation.  It ensures that the database ensure the consistency of the data.
Which is faster often depends on specifics. A blanket answer is not usually possible. And while they are subtly different they are close enough that they may be substitutes in a given circumstance.
The first is an antijoin while the second is a simple select with instructions to continue on conflict. This means that the planner may be able to use information to effectively run it faster. So in general, NOT EXISTS should perform better.
But this is not always the case. If the two tables are big enough, you might have a more expensive join strategy and that could be a problem.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With