This has bothered me for a long time. I want an iterator over a binary tree (or similar nested structure) that's efficient, simple, and pythonic. For example for usages like these:
for value in values(root):
do_something_with_value(value)
print(sum(values(root)))
Here root
is the tree's root node, and nodes have a .value
, .left
and .right
. And values
is supposed to give me the desired iterator/iterable over the tree's values.
Let n be the number of nodes in the tree and h be the height of the tree.
def values(root):
if root:
yield from values(root.left)
yield root.value
yield from values(root.right)
Simple, pythonic, lazy, and takes only O(h) space. This should be it. But... it's stacked generators and every single value gets yielded through the entire stack of generators. So the whole iteration takes O(n2) time in the worst case and O(n log n) time even in the best case.
def values(root):
stack = []
while root or stack:
while root:
stack.append(root)
root = root.left
root = stack.pop()
yield root.value
root = root.right
Iterative with a stack of nodes. Only takes O(h) space and O(n) time, but it's much more complicated than the above totally natural recursion.
def values(root):
result = []
def collect_values(root):
if root:
collect_values(root.left)
result.append(root.value)
collect_values(root.right)
collect_values(root)
return result
This collects all values in a semi-global list. Natural recursion and O(n) time, but sadly O(n) space and isn't lazy.
Instead of a semi-global list I thought maybe I could abuse a semi-global generator. As a sort of pipe from inside the recursion directly to the outside. The recursion would send
values into it, and the outside could fetch them. Something like this:
def values(root):
pipe = magic_generator_pipe()
def iterate(root):
if root:
iterate(root.left)
pipe.send(root.value)
iterate(root.right)
iterate(root)
yield from pipe
But I can't get it to work and I'm not sure it's possible.
Something with threading
or asyncio
? One more idea I have is that the values
function starts a new thread. This thread recursively iterates the tree and communicates the values to the main thread in the values
function, which yields them to the original outside caller. And they lock each other appropriately. But I don't have much experience with this.
Is there a way to achieve all I want?
So essentially I want something like Attempt 1, but fast. Because I've used recursive yield from
for several problems and always feel bad about the time complexity.
To clarify: With "too complicated" I really meant the iterative algorithm (it's not that complicated, but compared to the natural recursion it is). A solution that's "complicated" only in a "technical" way (like with an extra thread or @chepner's trampoline idea) would still be interesting. I just insist on the natural recursion (or something similarly algorithmically simple) and the other three goals.
My approach may not satisfy you since it's an inversion of what you are doing, namely using a callback. You pass to a function that is able to iterate the structure a callback method to be invoked for each value encountered. In this example, function walk
is called with callback
function. In this case the structure is a tree that is walked and for each node value, the callback function is invoked.
def walk(root, callback):
def tree_walk(node):
if node['left']:
tree_walk(node['left'])
callback(node['value'])
if node['right']:
tree_walk(node['right'])
tree_walk(root)
node4 = {'value': 4, 'left': None, 'right': None}
node3 = {'value': 3, 'left': node4, 'right': None}
node2 = {'value': 2, 'left': None, 'right': None}
node1 = {'value': 1, 'left': node2, 'right': node3}
def do_something_with_value(value):
print('value:', value)
walk(node1, do_something_with_value)
Prints:
value: 2
value: 1
value: 4
value: 3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With