Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

H2 database: slow query although index is used

Using the H2 1.3.176.

1) table definition:

CREATE TABLE TEST(ID BIGINT PRIMARY KEY, ACCOUNT BIGINT, TXID BIGINT); 

2) inserting values into the table:

INSERT INTO TEST SELECT X, RAND()*100, X FROM SYSTEM_RANGE(1, 1000000)

3) creating an index to use for my query:

CREATE Unique INDEX IDX_TEST_ACCOUNT_TXID ON `test` (account, txId DESC);

4) doing the following query:

explain analyze
select txid from test where account=22 AND txid<9999999 order by txid desc limit 25

I get the following execution plan:

SELECT
    TXID
FROM PUBLIC.TEST
    /* PUBLIC.IDX_TEST_ACCOUNT_TXID: ACCOUNT = 22
        AND TXID < 9999999
     */
    /* scanCount: 9867 */
WHERE (ACCOUNT = 22)
    AND (TXID < 9999999)
ORDER BY 1 DESC
LIMIT 25
/*
TEST.IDX_TEST_ACCOUNT_TXID read: 103
*/

Question: why does H2 need to scan through the entire index? I was expecting the scan count to be 25 since the txid in the index should be in descending order already so once H2 is in the account=22 branch of the index it should be able to just read the next 25 entries. This will lead to slow queries if there are millions of entries in the table. Even if H2 has to search for the first matching entry within the index I would expect this to be an O(log(N)) algorithm and not a scan. If I do the same thing without the column account (means that the table just contains id and txid) then a descending index on txid will indeed result in a scan count of 25 (using the query "select txid from test where txid<9999999 order by txid desc"). Why is the additional column ruining the execution plan? Maybe I don't understand how the index works. Is there a better way to define an index for my query?

like image 256
user3528637 Avatar asked Oct 21 '25 11:10

user3528637


1 Answers

I stepped through the h2 source code and found out what is going wrong:

During preparing the execution of the query, h2 tries to determine whether it can use the index for sorting and limiting the result set. Since the first index column (account) is not in the order by clause, h2 thinks it cannot use the index. This results in h2 scanning through the entire index fetching all rows and then sorting and limiting the result set. This is surprising since the condition on account is an "equal" condition so h2 should realize that it can indeed use the index for sorting and limiting the result set. The solution is to provide the account column in the order by clause. Thus the query should be:

select txid from test where account=22 AND txid<9999999 order by account, txid desc limit 25

and i get the expected execution plan

SELECT
    TXID
FROM PUBLIC.TEST
    /* PUBLIC.IDX_TEST_ACCOUNT_TXID: ACCOUNT = 22
        AND TXID < 9999999
     */
    /* scanCount: 25 */
WHERE (ACCOUNT = 22)
    AND (TXID < 9999999)
ORDER BY =ACCOUNT, 1 DESC
LIMIT 25
/* index sorted */

which has a scan count of only 25 :)

like image 151
user3528637 Avatar answered Oct 23 '25 14:10

user3528637



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!