When I run the two queries below, I see a serious difference in the query plans. Why is that?
select * from table1
where id = 'dummy' or id in (select id from table2 where id = 'dummy')
Query plan
Seq Scan on table1 (cost=8.30..49611.63 rows=254478 width=820) (actual time=535.477..557.431 rows=1 loops=1)
Filter: (((code)::text = 'dummy'::text) OR (hashed SubPlan 1))
Rows Removed by Filter: 510467
SubPlan 1
-> Index Scan using idx on table2 (cost=0.29..8.30 rows=1 width=8) (actual time=0.009..0.012 rows=0 loops=1)
Index Cond: ((id)::text = 'dummy'::text)
Planning Time: 0.165 ms
Execution Time: 557.517 ms
select * from table1
where id = 'dummy'
union
select * from table1
where id in (select id from table2 where id = 'dummy')
Query plan
Unique (cost=25.22..25.42 rows=2 width=5818) (actual time=0.045..0.047 rows=1 loops=1)
-> Sort (cost=25.22..25.23 rows=2 width=5818) (actual time=0.045..0.046 rows=1 loops=1)
Sort Method: quicksort Memory: 25kB
-> Append (cost=0.42..25.21 rows=2 width=5818) (actual time=0.016..0.026 rows=1 loops=1)
-> Index Scan using id on table1 (cost=0.42..8.44 rows=1 width=820) (actual time=0.015..0.016 rows=1 loops=1)
Index Cond: ((id)::text = 'dummy'::text)
-> Nested Loop (cost=0.71..16.74 rows=1 width=820) (actual time=0.009..0.009 rows=0 loops=1)
-> Index Scan using idx on table2 (cost=0.29..8.30 rows=1 width=8) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: ((id)::text = 'dummy'::text)
-> Index Scan using pkey on table1 (cost=0.42..8.44 rows=1 width=820) (never executed)
Index Cond: (id = table2.id)
Planning Time: 0.753 ms
Execution Time: 0.131 ms
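So the main difference you can see is that the planner estimates 254478 rows for the first query (and uses a sequential scan that takes about 557 ms) but estimates just 2 rows for the second (which uses index scans and takes about 0.13 ms). Both actually return 1 row. Why is that?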
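For reference, plans in this format come from PostgreSQL's EXPLAIN ANALYZE. In each plan node, the first parenthesized group holds the planner's estimates (cost, rows, width) and the second holds what happened at run time (actual time, rows, loops). A minimal sketch, assuming the same table1 and table2 as in the queries above:

```sql
-- EXPLAIN ANALYZE really executes the statement, so the timings are measured,
-- not predicted. Compare "rows=" in the first group (estimate) with "rows="
-- in the second group (actual) for each node.
explain analyze
select * from table1
where id = 'dummy' or id in (select id from table2 where id = 'dummy');

explain analyze
select * from table1
where id = 'dummy'
union
select * from table1
where id in (select id from table2 where id = 'dummy');
```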
Please run another test. Run both of these queries, in which I have qualified every column with its table name. Do they give the same results as your original queries?
select * from table1
where table1.id = 'dummy' or
table1.id in (select table2.id from table2 where table2.id = 'dummy')
select * from table1
where table1.id = 'dummy'
union
select * from table1
where table1.id in (select table2.id from table2 where table2.id = 'dummy')
I don't think you are sharing your actual code with us, because as written your code makes little sense: the sub-query returns only ids that equal 'dummy', so you will just get a list of 'dummy' values, possibly multiple times.
Note: the comments below turned out not to be true, since the suggested changes had no impact on the results. The order of operations was working as expected.
What result do you get when you do this:
select * from table1
where (id = 'dummy') or id in (select id from table2 where id = 'dummy')
The reason your query was giving more results is that it was selecting records from table1 where id equals 'dummy' or id = id, so the query in the original post gives you all the records. The OR was being applied to the first expression instead of separating the two expressions.
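One way to check whether the parentheses or the UNION rewrite change the result set is to run all three forms against a small toy data set. The sketch below is only an illustration: demo_table1, demo_table2, their columns, and the sample rows are made up for this example and are not the poster's real schema.

```sql
-- Hypothetical stand-ins for table1 and table2 (the schema is an assumption).
create temporary table demo_table1 (id varchar(20) primary key, payload text);
create temporary table demo_table2 (id varchar(20) primary key);

insert into demo_table1 values ('dummy', 'a'), ('other', 'b'), ('third', 'c');
insert into demo_table2 values ('other');

-- Count the rows returned by each form of the query.
select
  (select count(*) from demo_table1
    where id = 'dummy'
       or id in (select id from demo_table2 where id = 'dummy')) as or_rows,
  (select count(*) from demo_table1
    where (id = 'dummy')
       or id in (select id from demo_table2 where id = 'dummy')) as parenthesized_rows,
  (select count(*) from (
      select * from demo_table1 where id = 'dummy'
      union
      select * from demo_table1
      where id in (select id from demo_table2 where id = 'dummy')
   ) as u) as union_rows;
```

On this toy data all three counts are 1: only the 'dummy' row matches, and the extra parentheses make no difference because there is a single OR with no AND to compete with. The forms agree on which rows they return and differ only in the plan the planner picks.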