I have a ~90 MB database consisting mostly of message attachments, including a BLOB column content that stores the binary attachment data.
I assume it is not wise to create an index over BLOBs, so no indexes are involved apart from the autoindex.
To find empty attachments, I compared the following queries:
SELECT message_id FROM attachments WHERE content IS NULL;
and
SELECT message_id FROM attachments WHERE length(content) = 0;
which return the same rows in my use case.
Why does the first one take 250ms and the second one only 1-2ms (both on an SSD)? What is the reason behind that? Is there a hidden length index or something? Any insight appreciated.
Additional info
The EXPLAIN QUERY PLAN in both cases is
0|0|0|SCAN TABLE attachments
The negation IS NOT NULL vs. length() != 0 results in the same performance difference 250ms vs. 2ms.
WHERE content IS NULL AND length(content) = 0; takes 250ms and WHERE length(content) = 0 AND content IS NULL; takes 2ms.

These are simply different queries: LENGTH is a scalar function which returns (see here)
(i) NULL if the input is NULL
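(ii) 0 if the input is a string of zero length (or a value convertible to one); for a BLOB, length() returns the number of bytes, so an empty BLOB also gives 0.
Therefore the condition length(content) = 0 is true when content is an empty string or empty BLOB, and not true when content is NULL (because a comparison with NULL yields NULL, which a WHERE clause does not treat as true).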
Based on this, I guess that your table contains several NULL fields and only a few which actually contain a value. This is supported also by your second additional info, where you say that IS NOT NULL shows a comparable performance.
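The difference in semantics can be checked with a small sketch using Python's built-in sqlite3 module. The table layout and the three sample rows are assumptions made up for illustration (one NULL, one empty BLOB, one two-byte BLOB); they are not taken from your database, and the sketch says nothing about timing.

```python
import sqlite3

# Hypothetical in-memory table; column names follow the question,
# the rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachments (message_id INTEGER PRIMARY KEY, content BLOB)"
)
conn.executemany(
    "INSERT INTO attachments (message_id, content) VALUES (?, ?)",
    [
        (1, None),         # NULL content
        (2, b""),          # empty BLOB
        (3, b"\x00\x01"),  # two-byte BLOB
    ],
)


def ids(where):
    """Return the message_ids matching the given WHERE condition."""
    sql = f"SELECT message_id FROM attachments WHERE {where} ORDER BY message_id"
    return [row[0] for row in conn.execute(sql)]


# length() per row: NULL for NULL input, byte count for BLOBs.
lengths = [
    row[0]
    for row in conn.execute(
        "SELECT length(content) FROM attachments ORDER BY message_id"
    )
]

null_ids = ids("content IS NULL")
empty_ids = ids("length(content) = 0")
not_null_ids = ids("content IS NOT NULL")
non_empty_ids = ids("length(content) != 0")

print(lengths, null_ids, empty_ids, not_null_ids, non_empty_ids)
```

With these sample rows, the NULL row matches only IS NULL and the empty BLOB matches only length(content) = 0, so the two conditions select different rows whenever the table contains both kinds of values.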