Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the max value of node in a tree

I have a problem.

I have to implement a function maxT in Haskell which returns the maximum value of a node from a binary tree.

data Tree a = Leaf a | Node (Tree a) a (Tree a)

This is given. What should I do next?

maxT :: (Tree Integer) -> Integer
maxT (Leaf a) = a
maxT (Node l a r) = max a (max (maxT l) (maxT r))

Is this right?

like image 291
user2379123 Avatar asked Jan 22 '26 06:01

user2379123


1 Answers

Let's see how hard this is to prove correct. Why? Because it's a great way to analyze programs for errors. Especially recursive ones. We'll technically use induction, but it isn't so complex. The key is to realize that maxT t must always be the largest value in the tree t—this declaration, "maxT t must always be the largest value in the tree t" is called an invariant and we'll try to prove it.

First, let's assume t is a Leaf. In this case, you've defined maxT (Leaf a) = a and since there are literally no other values in this tree, a must be the largest. Thus, maxT upholds our invariant when passed a Leaf. This is the "base case".


Now we'll consider what happens when we let t = Node (Leaf a) b (Leaf c) for some Integers a, b, and c. This is a height-1 tree and forms what you might call an "example case" for induction. Let's try out maxT and see if the invariant holds.

maxT t 
===
maxT (Node (Leaf a) b (Leaf c))
===
max b (max (maxT (Leaf a)) (maxT (Leaf c)))

at this point we'll use our base-case step and say that since the only applications of maxT in this expression are on Leafs then each one must uphold our invariant. This is kind of dumb, but that's because it's just an example case. We'll see the more general pattern later.

For now, let's evaluate our maxT (Leaf _) bits knowing that the result is the maximal value in each particular left- or right-subtree.

===
max b (max a c)

Now, I don't much want to dive into the definition of max, but based on its name I'm happy to assume that max a b returns value that is maximal between a and b. We could pick our way through the details here, but it's clear that max b (max a c) has been given all the relevant information about our Node for computing the maximum of the entire height-1 tree. I'd call this a successful proof that maxT works for both height-0 and height-1 trees (Leafs and Nodes containing only Leafs).

The next step is to generalize this example case.


So let's apply that same pattern generalizing on the height of the tree. We'll ask what happens if we fix some number, n, and assume that maxT t upholds our invariant for all t of height n or less. This is a little bizarre—we have only shown this works for n = 0 and n = 1. It'll be clear why this works a little later.

So what does that assumption do for us? Well, let's take any two Trees of height n or less (call them l and r), any integer x, and combine them to form a new tree t = Node x l r. What happens when we do maxT t?

maxT t
===
maxT (Node x l r)
===
max x (max (maxT l) (maxT r))

and we know, per our assumption, that maxT l and maxT r uphold our invariant. Then the chain of maxes continues to uphold our invariant now for a tree t that's height-(n+1). Furthermore (and this is really important) our process of assembling new Trees is general—we can make any height-(n+1) tree in this method. This means that maxT works for any height-(n+1) tree.


Induction time! We now know that if we pick an n and believe (for some reason) that maxT works for any height-n tree, then it immediately must work for any height-(n+1) tree. Let's pick n = 0. We know by the "base case" that maxT works for Leafs, so suddenly we know that maxT works for height-1 trees. This was our "example case". Now, given that knowledge, we can immediately see maxT works for height-2 trees. And then height-3 trees. And then height-4. And so on and on and on.

This completes a proof* that maxT is correct.

*I have to leave a few caveats. We didn't really do the fiddly details to show that the max chains work out, though it makes sense. I also didn't really prove that the induction step works—what if there were more ways to make a height-(n+1) tree than just using Node on height-n or lesser trees? The more powerful way is to "break apart" a height-n tree, but that's a little harder to see, I think. Finally, we would want to really think hard about what happens if we send in maxT (Leaf undefined) or other pathological values like that. These arise in Haskell because it's a (turing-complete) computer language instead of pure math. Honestly, these little bits don't change a whole lot for your situation, though.

like image 125
J. Abrahamson Avatar answered Jan 24 '26 20:01

J. Abrahamson



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!