I have a 1:1 relationship between two tables. I want to find all the rows in table A that don't have a corresponding row in table B. I use this query:
SELECT id FROM tableA WHERE id NOT IN (SELECT id FROM tableB) ORDER BY id desc
id is the primary key in both tables. Apart from the primary key indices, I also have an index on tableA(id desc).
Using H2 (Java embedded database), this results in a full table scan of tableB. I want to avoid a full table scan.
How can I rewrite this query to run quickly? What index should I use?
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches zero records in the right table, the join still returns a row in the result, but with NULL in each column from the right table. Filtering on that NULL keeps only the rows of tableA that have no match in tableB:
select tableA.id from tableA left outer join tableB on (tableA.id = tableB.id) where tableB.id is null order by tableA.id desc
If your db knows how to do index intersections, this will only touch the primary key index.
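To make the NULL-filtering step concrete, here is a small sketch of the left join approach on made-up sample data. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for H2, so it only illustrates the result set and says nothing about H2's query plan. The inserted ids are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for H2 here (an assumption for illustration).
# Table names follow the question; the sample ids are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
    INSERT INTO tableA (id) VALUES (1), (2), (3), (4);
    INSERT INTO tableB (id) VALUES (2), (4);
""")

# Unfiltered LEFT JOIN: every tableA row appears, and rows without a
# match in tableB carry NULL (None in Python) in the tableB column.
joined = conn.execute("""
    SELECT tableA.id, tableB.id
    FROM tableA
    LEFT OUTER JOIN tableB ON tableA.id = tableB.id
    ORDER BY tableA.id DESC
""").fetchall()

# Keeping only the NULL rows yields the ids that are missing from tableB.
missing = [row[0] for row in conn.execute("""
    SELECT tableA.id
    FROM tableA
    LEFT OUTER JOIN tableB ON tableA.id = tableB.id
    WHERE tableB.id IS NULL
    ORDER BY tableA.id DESC
""")]

print(joined)   # [(4, 4), (3, None), (2, 2), (1, None)]
print(missing)  # [3, 1]
```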
You can also use NOT EXISTS, since it is sometimes faster than a left join. You'd have to benchmark both to figure out which one you want to use.
select id from tableA a where not exists (select 1 from tableB b where b.id = a.id)
To show that exists can be more efficient than a left join, here are the execution plans of these queries in SQL Server 2008:
left join - total subtree cost: 1.09724:

exists - total subtree cost: 1.07421:
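Since the answer recommends benchmarking, here is a rough sketch of how the three variants could be compared. It again assumes Python's sqlite3 with an in-memory database as a stand-in, so the timings will not carry over to H2 or SQL Server; the same harness shape would need to be run against the real engine. The row counts are arbitrary, and an ORDER BY has been added to the NOT EXISTS query so the results can be compared directly.

```python
import sqlite3
import time

# In-memory SQLite as a stand-in engine (assumption); row counts are arbitrary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
""")
conn.executemany("INSERT INTO tableA (id) VALUES (?)",
                 ((i,) for i in range(1, 20001)))
# Leave every tenth id out of tableB so there is something to find.
conn.executemany("INSERT INTO tableB (id) VALUES (?)",
                 ((i,) for i in range(1, 20001) if i % 10 != 0))

QUERIES = {
    "not in": "SELECT id FROM tableA WHERE id NOT IN (SELECT id FROM tableB) "
              "ORDER BY id DESC",
    "left join": "SELECT tableA.id FROM tableA "
                 "LEFT OUTER JOIN tableB ON tableA.id = tableB.id "
                 "WHERE tableB.id IS NULL ORDER BY tableA.id DESC",
    "not exists": "SELECT id FROM tableA a WHERE NOT EXISTS "
                  "(SELECT 1 FROM tableB b WHERE b.id = a.id) "
                  "ORDER BY id DESC",
}

results = {}
timings = {}
for name, sql in QUERIES.items():
    start = time.perf_counter()
    results[name] = [row[0] for row in conn.execute(sql)]
    timings[name] = time.perf_counter() - start
    print(f"{name}: {len(results[name])} rows in {timings[name] * 1000:.2f} ms")

# All three formulations should return the same ids in the same order.
assert results["not in"] == results["left join"] == results["not exists"]
```

A single timed run like this is noisy; repeating each query several times and looking at the engine's own plan output gives a more reliable picture.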
