Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

B-Tree enhancement - order(k) function to display key ranking in sorted order

I have encountered a question in my Data Structures homework that neither me neither anyone of my colleagues could figure out, we even have no idea of where to start!

The question states that we should suggest an enhancement to B-Tree; A function order(k) - where k is a key in the B-Tree - that would display in O(log n) the key's place in the sorted order of all the keys in the B-Tree. We also need to show that the "enhancement" does not affect the complexity of the regular abstract functions of the B-Tree. We can use O(n) extra space, where n is the number of keys in the B-Tree.

Further explanation: Take for example a B-Tree that has the keys A B C D E F G H I J K L M N.

  • order(A) result should be "1".
  • order(N) result should be "14".
  • order(I) result should be "9".

What I have figured out so far:

  1. Given that we are allowed to use an O(n) extra space, and that the B-Tree regular space is O(n), we should -probably- use an extra B-Tree for the help.

  2. The fact that they mentioned that we should show that the enhancement does not affect the complexity of the regular B-Tree functions, at some point we have to manipulate the regular abstract B-Tree functions in some way, in a way that does not affect their regular complexity.

  3. The fact that we have to order(k) in O(log n) suggests that we should go through the B-Tree in a height-based way, not node by node.

  4. Somewhere, probably, we have to check if the given k in order(k) actually exists in the B-Tree, I suggest the regular abstract search function of the B-Tree.

like image 533
TheNotMe Avatar asked Dec 06 '25 02:12

TheNotMe


1 Answers

At each key, you should store extra data that records how many keys are under that node (including in the node itself).

To maintain this, the insert(k) function would have to travel back up through all of the ancestors of the new key, k, and increment their values. This would make insert O(log n) + O(log n), which is still O(log n), and thus does not affect the complexity. The delete(k) would have to do the same thing, except decrement the values. Balancing operations would also have to take this into account.

Then, order(k) would travel down the tree to k: each time it travels to a node, it should add the count of how many keys are to the left side, to the total, and return this sum.

EDIT: I changed the ambiguity of "node" between node and key, as these are different in a B-tree (a node can contain multiple keys). However, the algorithm should generalize to most tree-data-structures.

This is the algorithm for the B-tree:

#In python-ish (untested psuedocode)
#root is the root of the tree
#Each node is expected to have an array named "keys",
# which contains the keys in the node.
#Each node is expected to have an array named "child_nodes",
# which contains the children of the node, if the node has children.
#If a node has children, this should be true: len(child_nodes) == len(keys) + 1

def inorder(q):
  order_count = 0
  current_node = root

  while True:
    #if q is after all keys in the node, then we will go to the last child node
    next_child_node_i = len(current_node.keys)

    #now see if q is in between any of the nodes
    #for each key-index in the keys array (ie. if the node contains 3 keys,
    # keyi will be in range [0-2] .)
    for keyi in range(len(current_node.keys)):
      #retrieve the value of the key, so we can do comparison
      current_key = current_node.keys[keyi]

      if current_key < q:
        #We are trying to find which child node to go down to next,
        # for now we will choose the child directly to the left of this key,
        #But we continue to look through the rest of the keys, to find which
        # two keys q lies in between.

        #before we continue, we should count this key in the order too:

        #if this is not a leaf node,
        if len(current_node.children) != 0:
          #retrieve the the recorded child count of the sub-tree
          order_count += current_node.children[keyi].recorded_descendant_key_count

        #add one for the key in this node that we are skipping.
        order_count += 1

        continue

      if q < current_key:
        #We found a key in the current node that is greater than q.
        #Thus we continue to the next level between this and the previous key.
        next_child_node_i = keyi
        break

      #we finally found q,
      if q == current_key:
        #now we just return the count
        return order_count

    #once we are here, we know which keys q lies between
    # (or if it belongs at the beginning or end), and thus which child to travel down to.

    #If this is a leaf node (it has no children),
    # then q was not found.
    if len(current_node.child_nodes) == 0:
      #Possible behaviors: throw exception, or just return the place in the order
      # where q *would* go, like so:
      return order

    #Travel down a level
    current_node = current_node.child_nodes[next_child_node_i]
like image 92
Realz Slaw Avatar answered Dec 08 '25 06:12

Realz Slaw



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!