I was having a problem with a query the other day. It took about 10 seconds on a large dataset. The query looked something like this:
SELECT a.* from Document as a
LEFT JOIN Waybill as b on a.waybill = b.id
WHERE a.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
OR b.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
This ran very slowly. I then changed it to this:
SELECT a.* from Document as a
LEFT JOIN Waybill as b on a.waybill = b.id
WHERE a.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
UNION ALL
SELECT a.* from Document as a
LEFT JOIN Waybill as b on a.waybill = b.id
WHERE b.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
This took about 0.01 seconds, although the two queries produce basically the same result! I looked through the official MySQL documentation and found an interesting remark:
Indices lose their speed advantage when using them in OR-situations (4.1.10):
SELECT * FROM a WHERE index1 = 'foo' UNION SELECT * FROM a WHERE index2 = 'bar';
is much faster than
SELECT * FROM a WHERE index1 = 'foo' OR index2 = 'bar';
So, my question has 3 parts:
OR is not per se bad. As with almost any other construct in SQL, it might or might not be a good idea.
You have found a problem with the optimizer, and one that is common to many databases. When your OR conditions are from different tables, it is very difficult for the optimizer to take advantage of indexes.
Your improved solution works because each subquery can take advantage of indexes.
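One way to check this on your own data is to compare the execution plans with EXPLAIN. This is only a sketch: it assumes single-column indexes on both enterpriseGuid columns, and the index names are made up for illustration (your schema may already have equivalents).

```sql
-- Hypothetical index names; skip these if equivalent indexes already exist.
CREATE INDEX idx_document_enterprise ON Document (enterpriseGuid);
CREATE INDEX idx_waybill_enterprise ON Waybill (enterpriseGuid);

-- Plan for the original OR version.
EXPLAIN
SELECT a.*
FROM Document AS a
LEFT JOIN Waybill AS b ON a.waybill = b.id
WHERE a.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
   OR b.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d';

-- Plan for one branch of the UNION ALL version.
EXPLAIN
SELECT a.*
FROM Document AS a
LEFT JOIN Waybill AS b ON a.waybill = b.id
WHERE b.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d';
```

In the EXPLAIN output, look at the type and key columns. A type of ALL with a NULL key means a full table scan, while ref together with an index name means an index lookup. The exact plans depend on your MySQL version and data distribution, so treat this as a diagnostic rather than a prediction.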
You might find that the following works better than the first version but worse than the second:
SELECT d.*
FROM Document d
WHERE d.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d' OR
      EXISTS (SELECT 1
              FROM Waybill b
              WHERE d.waybill = b.id AND
                    b.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
             );
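One caveat on the UNION ALL rewrite: it is not strictly equivalent to the OR query. A Document row whose own enterpriseGuid matches and whose Waybill also matches is returned once by the OR query but twice by UNION ALL. Switching to UNION removes the duplicates at the cost of a deduplication step. Another option, sketched below, keeps UNION ALL and excludes from the second branch any row the first branch already returned. It assumes Waybill.id is unique (for example the primary key), so the join never multiplies Document rows; under that assumption the LEFT JOIN in the first branch can be dropped, and the second branch can be a plain JOIN because its WHERE clause discards unmatched rows anyway.

```sql
-- Branch 1: documents that match on their own enterpriseGuid.
SELECT a.*
FROM Document AS a
WHERE a.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
UNION ALL
-- Branch 2: documents that match only through their waybill.
-- <=> is MySQL's NULL-safe equality, so rows with a NULL
-- a.enterpriseGuid are still kept by the NOT (...) filter.
SELECT a.*
FROM Document AS a
JOIN Waybill AS b ON a.waybill = b.id
WHERE b.enterpriseGuid = '763a3ac3-a3c7-4379-9735-2a4a96e87e5d'
  AND NOT (a.enterpriseGuid <=> '763a3ac3-a3c7-4379-9735-2a4a96e87e5d');
```

Whether this beats plain UNION depends on how many rows each branch returns, so it is worth timing both on real data.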