How are duplicate keys handled in InnoDB's implementation of B+ tree for it's indexes.
For example, if there is a table with 1 million rows having a column with cardinality of 10. If we create an index on this column, how will the resulting B+ tree would look like?
Will it just have 10 keys and the value of each key is the list of primary keys which belong to that key (if yes, in what structure? Linked list?) or will it have 1M keys (if yes, then B+ tree would have to be handled differently)?
In some sense, an InnoDB BTree has no duplicates. This is because the columns of the PRIMARY KEY are appended to the columns specified for a secondary key. That leads to a fully-ordered list.
When you lookup via a secondary key (or the initial part of a key), the query will drill down the BTree to find the first row in the index matching what you gave, then scan forward to get any others. To get the rest of the columns, it takes the PRIMARY KEY columns to do a second BTree lookup.
The Optimizer will rarely use an index with "low cardinality". For example, a yes/no or true/false or male/female column should not be indexed. The Optimizer would find it faster to simply scan the table rather than bounce back and forth between the index and (via the PK columns) the main BTree.
The cutoff for when to use the index versus punting is somewhere around 20%, depending on the phase of the moon.
The case you propose is a bad one for a B+ tree. A cardinality of 10 means only 10 of the 1 million values are unique. Actually it is not only bad for a B+ tree, it is a bad index in general. Based on this index you will on average be left with a subset of approx. 100,000 values, which you either have to look through or use another value to filter further.
Concerning the structure of the resulting tree there are some things to keep in mind here:
- Inserts may require splits if the leaf node is full
- Occasionally the split of a leaf node necessitates split of the next higher node
- In worst case scenarios the split may cascade all the way up to the root node
https://www.percona.com/files/presentations/percona-live/london-2011/PLUK2011-b-
- Leaf nodes are linked together as doubly linked list
- […]
- Entire tree may be scanned without visiting the higher nodes at all
https://www.percona.com/files/presentations/percona-live/london-2011/PLUK2011-b-
If you insert a lot of data with keys which more or less belong all to the same equivalence class, I would expect a tree, which will not help a lot. The 10 keys might be present solely in the root node, and all data deeper in the tree will just be unsorted (because there is nothing left to sort it).
Due to the fact that the leafs are double-linked lists you are basically left with what I've written in the beginning: You have to traverse a big subset of the values. Concerning the given index this had to be expected and the B+ tree might doing well given the circumstances (a list is ok for just going through all data).
Actually this goes one abstraction deeper: The leafs are double-linked, but there are multiple values in each leaf (data or link to PK). Nevertheless these are in a list too, so if you just traverse everything it makes not much of a difference.
Please see that you can also investigate what MySQL is really building. There are tools to inspect the built index data structures, see for example
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With