I have two tables
Table X: millions or records
|-----|-----|-----|-----|
| a | b | c | d |
|-----|-----|-----|-----|
Table Y: only a few records
|-----|-----|
| e | f |
|-----|-----|
X.d allows me to join both tables on X.d = Y.e
I have the following indices:
One of our application was executing the following query, which took ages to run:
SELECT *
FROM X
INNER JOIN Y ON X.d = Y.e
WHERE
X.a in (1, 2, 3)
AND X.b IS NULL
AND X.c in (4, 5 ,6)
AND X.d in (7, 8, 9)
After changing the INNER JOIN to a LEFT JOIN, the query was extremely fast:
SELECT *
FROM X
LEFT JOIN Y ON X.d = Y.e
WHERE
X.a in (1, 2, 3)
AND X.b IS NULL
AND X.c in (4, 5 ,6)
AND X.d in (7, 8, 9)
Looking at explain plans for these queries, first query is doing a full scan when the second is only doing an Index Scan (range) on my compound index.
I saw other posts on SO but they had different scenarios.
Why such a diffence in the plans ?
The reason for the different plans is that LEFT JOIN will force the join order of your tables to match the order they appear in your query. Without the left join, the optimizer will choose the join order for you, and in this case it will choose the very small table first. (You can see this in your explain by looking at the order the tables are listed.) Once your join order is switched, the index for X changes to KEY d which must have a much larger data set than the compound key.
To fix this, change your select to SELECT STRAIGHT_JOIN *. This is preferred over USE INDEX so that the optimizer can still choose the best key for table X... You might find a better compound key than a,b,c,d, or if your data in X changes dramatically, one of your other keys may be better after a point.
I have to point out, that you normally can't just switch to a LEFT JOIN. The data returned will usually be different!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With