Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SICP Exercise 1.16 ... what does "invariant quantity" hint mean?

I see there are a few other questions around this exercise but none specifically are asking what is meant within the hint... "define the state transition in such a way that the product abn is unchanged from state to state".

They also mention that this idea of using an "invariant quantity" is a powerful idea with respect to "iterative algorithms". By the way, this problem calls for the design of a "logarithmic" exponent algorithm that has a space complexity of O(1).

Mainly I just have no idea what is meant by this hint and am pretty confused. Can anyone give me a nudge in what is meant by this? The only thing I can really find about "invariant quantities" are described using examples in physics which only makes this concept more opaque.

Exercise description in full:

Exercise 1.16: Design a procedure that evolves an iterative exponentiation process that uses successive squaring and uses a logarithmic number of steps, as does fast-expt. (Hint: Using the observation that (bn/2)2 = (b2)n/2, keep, along with the exponent n and the base b, an additional state variable a, and define the state transformation in such a way that the product abn is unchanged from state to state.

At the beginning of the process a is taken to be 1, and the answer is given by the value of a at the end of the process. In general, the technique of defining an invariant quantity that remains unchanged from state to state is a powerful way to think about the design of iterative algorithms.)

Thanks in advance.

like image 747
mzbaran Avatar asked Oct 24 '25 17:10

mzbaran


2 Answers

This is not about any "logarithmic exponentiation", which is a very vague and confusing terminology.

As the quote you provided says, it is about exponentiation function that takes logarithmic number of steps to get its final result.

So we want to develop a function, exp(b,n), which would take O(log n) time to finish.

The numbers involved are all integers.

So how do we calculate bn? We notice that

b^n = b * b * b * b *... * b 
      `------n times------/        O(n) time process

and, when n is even,

b^n = b * b * b * b * ... * b *  
      `------n/2 times-----/   
      b * b * b * b * ... * b   
      `------n/2 times-----/   
    = (b^(n/2))^2                ; viewed from the side
    = (b^2)^(n/2)                ; viewed from above

and when n is odd,

b^n = b * b * b * b * ... * b *  
      `------n/2 times-----/   
      b * b * b * b * ... * b *
      `------n/2 times-----/   
      b
      `--1 time--/
    = (b^(n/2))^2  *  b
    = (b^2)^(n/2)  *  b

Note that both expression fall into same category if we write the first one as (b^2)^(n/2) * 1.

Also note that the equation

  b^n = (b   )^(n  )  *  { a where a = 1   }
      = (b ^2)^(n/2)  *  { a where a = ... }
      = (b'  )^(n' )  *  { a' }

means that to calculate b^n * a is the same as to calculate b'^n' * a' with the changed values of { b' = b^2 ; n' = n/2 ; a' = {if {n is even} then {a} else {a * b}} }.

So we don't actually compute either of the sides of that equation. Instead we keep the triples { b, n, a } at each step and transform them according to that rule

 { b, n, a } --> { b', n', a' } --> ...

with the initial values of b and n as given to us, and the first a equal to 1; and know that the final result calculated from any of the triples would be the same if we'd actually calculated it somehow. We still don't know how exactly we'd do that; just that it would be the same. That's the "invariant" part.

So what all this is good for? Well, since the chain { n --> n/2 --> ... } will certainly reach a point where n == 1, we know that at that point we can break the chain since

 c^1 * d  ==  c * d

and this one simple multiplication of these two numbers will produce the same result as the initial formula.

Which is the final result of the function.

And that's the meaning of the hint: maintain the state of the computation as a (here) triple of numbers (or just three variables named b, n, a), and implement your computation as a chain of state-transforming steps. When the chain is broken according to some test (here, n == 1), we've reached our destination and can calculate the final result according to some simple rule (here, c * d).

This gives us a nice and powerful methodology for problem solving.

Oh, and we also know that the length of that chain of changing states is O(log n), since we halve that n at each step.

like image 174
Will Ness Avatar answered Oct 28 '25 04:10

Will Ness


when fast-exp execute, the trace looks something like this

b^14 = (b^2)^7 ; where: b' = b^2; a = 1, n = 7
b^14 = (b^2)^6 * b^2 ; where: b' = b^2; a = b^2, n = 6
b^14 = ((b^2)^2)^3 *b^2 ; where: b' = b^2^2 ; a = b^2, n = 3
b^14 = ((b^2)^2)^2 *b^2 *(b^2)^2 ; where: b' = b^2^2; a = b^2*(b^2)^2, n = 2

notice that all these statements have the general form of b^n = b'^n' * a', and this does not change

like image 39
MetalMentos Avatar answered Oct 28 '25 04:10

MetalMentos