Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Trouble with successor & predecessor in Mogensen's binary encoding

I would like to add binary numbers to my untyped lambda calculus library, but I'm stuck with the succ and pred functions. I am using a representation outlined in a paper by T. Mogensen and while most of the functions defined there work, succ and pred are returning erroneous results.

I'm pretty sure I got the representation right:

dec bin   De Bruijn       classic
0   0     λλλ3            λa.λb.λc.a
1   1     λλλ13           λa.λb.λc.c a
2   10    λλλ2(13)        λa.λb.λc.b (c a)
3   11    λλλ1(13)        λa.λb.λc.c (c a)
4   100   λλλ2(2(13))     λa.λb.λc.b (b (c a))
5   101   λλλ1(2(13))     λa.λb.λc.c (b (c a))
6   110   λλλ2(1(13))     λa.λb.λc.b (c (c a))
7   111   λλλ1(1(13))     λa.λb.λc.c (c (c a))
8   1000  λλλ2(2(2(13)))  λa.λb.λc.b (b (b (c a)))

The tuples and projections also look fine:

tuple         De Bruijn               classic
[T, F]        λ1(λλ2)(λλ1)            λa.a (λb.λc.b) (λb.λc.c)
[T, F, F]     λ1(λλ2)(λλ1)(λλ1)       λa.a (λb.λc.b) (λb.λc.c) (λb.λc.c)
[T, F, F, T]  λ1(λλ2)(λλ1)(λλ1)(λλ2)  λa.a (λb.λc.b) (λb.λc.c) (λb.λc.c) (λb.λc.b)

πkn  De Bruijn  classic
π12  λ1(λλ2)    λa.a (λb.λc.b)
π22  λ1(λλ1)    λa.a (λb.λc.c)

Shifting up with a 0-bit (shl0) and 1-bit (shl1) works well in tests:

SHL0 ≡ λnbzo.z (n b z o) = λ λ λ λ 2 (4 3 2 1)
SHL1 ≡ λnbzo.o (n b z o) = λ λ λ λ 1 (4 3 2 1)

But succ and pred that depend on the terms listed above don't:

SUCC ≡ λn.π22 (n Z A B) ≡ λ π22 (1 Z A B) where
Z ≡ [ZERO, ONE] // encoded like in the first piece of code
A ≡ λp.p (λnm.[SHL0 n, SHL1 n]) ≡ λ 1 (λ λ [SHL0 2, SHL1 2])
B ≡ λp.p (λnm.[SHL1 n, SHL0 m]) ≡ λ 1 (λ λ [SHL1 2, SHL0 1])

PRED ≡ λn.π22 (n Z A B) ≡ λ π22 (1 Z A B) where
Z ≡ [ZERO, ZERO] // encoded like in the first piece of code
A ≡ λp.p (λnm.[SHL0 n, SHL1 m]) ≡ λ 1 (λ λ [SHL0 2, SHL1 1])
B ≡ λp.p (λnm.[SHL1 n, SHL0 n]) ≡ λ 1 (λ λ [SHL1 2, SHL0 2])

Example results:

succ 0 = λa.λb.λc.c a / λλλ13 ok
succ 1 = λa.λb.λc.b (b c) / λλλ2(21) wrong, expected λλλ2(13)
succ 2 = λa.λb.λc.c (b (c (λd.λe.λf.e (b d e f)) (λd.λe.λf.f (b d e f)))) / λλλ1(2(1(λλλ2(5321))(λλλ1(5321)))) wrong, expected λλλ1(13)
succ 3 = λa.λb.λc.b (b c) / λλλ2(21) wrong, expected λλλ2(2(13))

pred 1 = λa.λb.λc.b a / λλλ23 wrong-ish, expected λλλ3; it's just a leading zero, but it's stated that those should only be caused by inputs that are powers of 2
pred 2 = λa.λb.λc.c (b c) / λλλ1(21) wrong, expected λλλ13
pred 3 = λa.λb.λc.b (b a) / λλλ2(23) wrong, expected λλλ2(13)
pred 4 = λa.λb.λc.c (b c) / λλλ1(21) wrong, expected λλλ1(13)

My term evaluator was tested against hundreds of terms, so I'm pretty confident in it; I'm suspecting that either I'm misreading something or something is misprinted. Am I missing something?

like image 447
ljedrz Avatar asked Oct 28 '25 13:10

ljedrz


2 Answers

So, as ljedrz mentionned, we manage to get Morgensen numerals working in a separate chat. In this answer, I'll just describe shortly how it works in general.

The question was: « I'm suspecting that either I'm misreading something or something is misprinted. Am I missing something? »


tl;dr: it turns out some tricky stuff related to evaluation order caused the problem. The Morgensen numerals as presented in the question do work.


Longer answer: how does succ work ?

N.B.: in the following b_n is always assumed to be 1, as in the original paper.

The idea behind Morgensen numerals is to have a number n = b_n ... b_2 b_1 encoded as \z.\x_0.\x_1. x_{b_1} ( x_{b_2} (... ( x_{b_n} z ) ...) ). This is pretty unreadable, but it becomes clearer if it is stated that way:

A number n is a term which expects 3 arguments, and when applied, returns x_{b_1} ( x_{b_2} (... ( x_{b_n} z ) ...) )

Well, that's still unclear. If we look deeper, we see that a number n apply recursively either x_0 or x_1, starting from a term z. Notice that the recursive call is made "from left to right", i.e., if I have a number b_n b_{n-1} ... b_2 b_1, then the recursive calls are evaluated in this order:

  • first b_n z, let it be i_{n-1}
  • then b_{n-1} i_{n-1}, let it be i_{n-2}
  • ...
  • and finally i_1 b_1

(Well, the evaluation strategy determines the exact evaluation order, to I think it is easy to think it evaluates like that)


Relation with fold function on lists

Actually, when I realised that, it just made me think about the fold_left function of a bit list: say you have a list of bits l = [b_n; ... ; b_2; b_1], then you can do the following:

fold_left (fun prev_acc -> fun b -> if b = 0 then x_0 prev_acc else x_1 prev_acc) z l

let f be

fun prev_acc -> fun b -> if b = 0 then x_0 prev_acc else x_1 prev_acc

which returns (according to Ocaml doc)

f (f (... (f z b_n) ...) b_2) b_1

which evaluates to:

  • f z b_n evaluates to x_{b_n} z, that is i_{n-1} as above.
  • ...
  • f i_{1} b_1, as above.

Conclusion, you can absolutely think of Morgensen numerals as a fold_left on a list (or fold_right, depending on how you imagine the list).


Getting the succ of a number

Getting the succ of a number is getting n+1. The binary increment as a nice property:

if m = bn ... bi bj bk ... b1 with bj being the first 0 (i.e. bk = ... = b1 = 1), then m + 1 = bn ... bi 1 0 ... 0

This can be illustrated:

bn ... bi  0  1  1  1  1  1

if I add 1, then I got (by detailling all steps):

bn ... bi  0  1  1  1  1  1
                         +1
--------------------------
bn ... bi  0  1  1  1  1  0
                      +1     < I have a carry here, which gets propagated
...
--------------------------
bn ... bi  0  0  0  0  0  0
          +1                 < The carry ends up here
--------------------------
bn ... bi  1  0  0  0  0  0  < This is our result of doing a +1.

A good remark is to notice that (bn ... bi 0 1 ... 1) + 1 is (bn ... bi 0) + 1 appended to 0 ... 0, and more generally, it also applies for any bj: (bn ... bi bj 1 ... 1) + 1 is (bn ... bi bj) + 1 appended to 0 ... 0.

This seems pretty good, only one problem, the carry is propagated from right to left (LSB to MSB), while Morgensen numerals are from MSB to LSB.

To solve this last problem, we can be speculative: suppose I have a number b_n ... bi bj bk ... b1, and I want to have its successor. I will have to compute it reccursively, but only from MSB to LSB.

That is, if I am at "step bj", I can only work with the subsequence bn ... bi and bj itself.

This allows us, for instance, to compute the succ of bn ... bi. Now comes the speculative part:

  • I know that, if after bj, there are only 1, then after the previous remark, then the successor is ((bn ... bi bj) + 1)::(0 ... 0)
  • However, if there is a 0 in bk ... b1, then the bits (bn ... bi bj) remains unchanged.

So the idea is to return both possibilities at each bit, in a tuple. Informally, the function passed to fold_left looks like that:

fun tuple_msb -> fun bj -> 
(original_msb, incr_msb)

where (1) tuple_msb is a tuple containing (bn ... bi, (bn ... bi) + 1); and where (2) original_msb and incr_msb are computed depending on bj. Indeed:

  • if bj is 0, then (bn ... bi bj) + 1 = (bn ... bi 0) + 1 = (bn ... bi 1)
  • if bj is 1, then (bn ... bi bj) + 1 = (bn ... bi 1) + 1 = ((bn ... bi) + 1)::0.

That is, the complete function to pass to fold_left is the following:

(* We keep the original sequence on the left of the tuple, the incremented `bn ... bi bj` on the right *)
fun tuple_msb -> fun bj ->
if bj = 0 then
  (tuple_msb._1 :: 0, tuple_msb._1 :: 1) 
else 
  (tuple_msb._1 :: 1, tuple_msb._2 :: 0) 

And the base case (i.e. the starting element is the tuple (0, 1))

And from here, it is easy to go back to Morgensen unreadable terms (there is a small hidden shortcut here about the order of the arguments, but it really doesn't matter):

We can identify fun tuple_msb -> (tuple_msb._1 :: 0, tuple._1 :: 1) as being x_0 and fun tuple_msb -> (tuple_msb._1 :: 1, tuple_msb._2 :: 0) as being x_1 according to the notation we had at the begining for x_0 and x_1, and the base case (i.e. z at the begining is (0, 1)).

To get the final successor, we have to get the right part of the returned tuple, hence the final

let succ n =
  let ret_tuple = n z x_0 x_1 in
  ret_tuple._2

or in lambda terms:

succ' = λn. π22 (n z x_0 x_1)

with all π22, z, x_0 and x_1 accordingly defined.

Our succ' is a bit different from the proposed succ, i.e. x_0 is not exactly A and x_1 is not exactly B, but this last step is easy and left for the interested reader ;-)

like image 181
Bromind Avatar answered Oct 30 '25 09:10

Bromind


When Bromind informed me that their slightly different definition of succ works in Ocaml+utop I ported it and tested with my library. It failed as well, though, so I started analyzing all the terms involved only to see that everything looked pretty much the same as in my implementation. Convinced that both my evaluator is solid and those definitions are valid, I kept looking until I found the actual cause.

Not to spoil the fun too soon, though, I'll add that I actually contacted prof. Mogensen about this and he was kind enough to provide me with high-level evaluation steps for the expression succ 1:

succ 1
 = (λn.π² (n Z A B)) |1|
 = (λn.π² (n Z A B)) (λz01.1 z)
 = π² ((λz01.1 z) Z A B)
 = π² (B Z)
 = π² ((λp.p (λnm.[↑1 n, ↑0 m])) [|0|, |1|])
 = π² ([|0|, |1|] (λnm.[↑1 n, ↑0 m]))
 = π² ((λx.x |0| |1|) (λnm.[↑1 n, ↑0 m]))
 = π² ((λnm.[↑1 n, ↑0 m]) |0| |1|)
 = π² [↑1 |0|, ↑0 |1|]
 = ↑0 |1|
 = (λn.λz01.0 (n z 0 1)) |1|
 = λz01.0 (|1| z 0 1)
 = λz01.0 ((λz01.1 z) z 0 1)
 = λz01.0 (1 z)
 = |2|

This solution is 100% valid, but it is too high-level to compare with my implementation, so I have decided to detect the exact point where the definition fails with "low-level" β-reductions.

Below is the initial expression (succ 1) in De Bruijn index notation:

(λ(λ1(λλ1))(1(λ1(λλλ3)(λλλ13))(λ1(λλλ1((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)))))(λλλ13)

and its reduction steps (normal order; others won't work either):

(λ(λ1(λλ1))(1(λ1(λλλ3)(λλλ13))(λ1(λλλ1((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)))))(λλλ13)
 ^          ^                                                                                                       ^^^^^^^
            ↳ becomes λλλ13

(λ1(λλ1))((λλλ13)(λ1(λλλ3)(λλλ13))(λ1(λλλ1((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1))))
 ^^      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ↳ becomes (λλλ13)(λ1(λλλ3)(λλλ13))(λ1(λλλ1((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)))

(λλλ13)(λ1(λλλ3)(λλλ13))(λ1(λλλ1((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)))(λλ1)
 ^   ^ ^^^^^^^^^^^^^^^^^
     ↳ becomes λ1(λλλ3)(λλλ13)

(λλ1(λ1(λλλ3)(λλλ13)))(λ1(λλλ1((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)))(λλ1)
 ^                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - is dropped

(λ1(λ1(λλλ3)(λλλ13)))(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)))(λλ1)
 ^^                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ↳ becomes λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1))

(λ1(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)))(λ1(λλλ3)(λλλ13))(λλ1)
 ^^                                       ^^^^^^^^^^^^^^^^^
  ↳ becomes λ1(λλλ3)(λλλ13)

(λ1(λλλ3)(λλλ13))(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1))(λλ1)
 ^^              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ↳ becomes λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1)

(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1))(λλλ3)(λλλ13)(λλ1)
 ^                                    ^^^^^^ - is dropped

(λλ1((λλλλ1(4321))2)((λλλλ2(4321))1))(λλλ13)(λλ1)
 ^                ^                  ^^^^^^^
                  ↳ becomes λλλ13

(λ1((λλλλ1(4321))(λλλ13))((λλλλ2(4321))1))(λλ1)
 ^^                                    ^
  ↳ becomes (λλ1)                      ↳ becomes (λλ1)

(λλ1)((λλλλ1(4321))(λλλ13))((λλλλ2(4321))(λλ1))
 ^   ^^^^^^^^^^^^^^^^^^^^^^ - is dropped

(λ1)((λλλλ2(4321))(λλ1))
 ^^ ^^^^^^^^^^^^^^^^^^^^
  ↳ becomes (λλλλ2(4321))(λλ1)

(λλλλ2(4321))(λλ1)
 ^     ^     ^^^^^
       ↳ becomes λλ1

λλλ2((λλ1)321)
      ^   ^ - is dropped

λλλ2((λ1)21)
      ^^ ^
       ↳ becomes 2

λλλ2(21) // fail

It's a subtle one: both A and B need to have their tuple in an unnormalized/destructured form in order for the succ and pred definition to work with pure β-reduction evaluators. I noticed this because my implementation used normalized tuples while Bromind's didn't.

Let's see the correct definition:

(λ(λ1(λλ1))(1(λ1(λλλ3)(λλλ13))(λ1(λλ(λλλ132)((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)))))(λλλ13)

and its reduction steps (normal order again):

(λ(λ1(λλ1))(1(λ1(λλλ3)(λλλ13))(λ1(λλ(λλλ132)((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)))))(λλλ13)
 ^          ^                                                                                                                   ^^^^^^^
            ↳ becomes λλλ13

(λ1(λλ1))((λλλ13)(λ1(λλλ3)(λλλ13))(λ1(λλ(λλλ132)((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1))))
 ^^      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ↳ becomes (λλλ13)(λ1(λλλ3)(λλλ13))(λ1(λλ(λλλ132)((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)))

(λλλ13)(λ1(λλλ3)(λλλ13))(λ1(λλ(λλλ132)((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)))(λλ1)
 ^   ^ ^^^^^^^^^^^^^^^^^
     ↳ becomes λ1(λλλ3)(λλλ13)

(λλ1(λ1(λλλ3)(λλλ13)))(λ1(λλ(λλλ132)((λλλλ2(4321))2)((λλλλ1(4321))2)))(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)))(λλ1)
 ^                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - is dropped

(λ1(λ1(λλλ3)(λλλ13)))(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)))(λλ1)
 ^^                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ↳ becomes λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1))

(λ1(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)))(λ1(λλλ3)(λλλ13))(λλ1)
 ^^                                             ^^^^^^^^^^^^^^^^^
  ↳ becomes λ1(λλλ3)(λλλ13)

(λ1(λλλ3)(λλλ13))(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1))(λλ1)
 ^^              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ↳ becomes λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1)

(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1))(λλλ3)(λλλ13)(λλ1)
 ^                       ^                  ^^^^^^
                         ↳ becomes λλλ3

(λ(λλλ132)((λλλλ1(4321))(λλλ3))((λλλλ2(4321))1))(λλλ13)(λλ1)
 ^                                           ^  ^^^^^^^
                                             ↳ becomes λλλ13

(λλλ132)((λλλλ1(4321))(λλλ3))((λλλλ2(4321))(λλλ13))(λλ1)
 ^   ^  ^^^^^^^^^^^^^^^^^^^^^
     ↳ becomes (λλλλ1(4321))(λλλ3)

(λλ1((λλλλ1(4321))(λλλ3))2)((λλλλ2(4321))(λλλ13))(λλ1)
 ^                       ^ ^^^^^^^^^^^^^^^^^^^^^^
                         ↳ becomes (λλλλ2(4321))(λλλ13)

(λ1((λλλλ1(4321))(λλλ3))((λλλλ2(4321))(λλλ13)))(λλ1)
 ^^                                            ^^^^^
  ↳ becomes λλ1

(λλ1)((λλλλ1(4321))(λλλ3))((λλλλ2(4321))(λλλ13))
 ^   ^^^^^^^^^^^^^^^^^^^^^ - is dropped

(λ1)((λλλλ2(4321))(λλλ13))
 ^^ ^^^^^^^^^^^^^^^^^^^^^^
  ↳ becomes (λλλλ2(4321))(λλλ13)

(λλλλ2(4321))(λλλ13)
 ^     ^     ^^^^^^^
       ↳ becomes λλλ13

λλλ2((λλλ13)321)
      ^   ^ ^
          ↳ becomes the right-hand side 3 and gets an index upgrade due to extra abstractions

λλλ2((λλ15)21)
      ^  ^
         ↳ gets its index downgraded

λλλ2((λ14)1)
      ^^  ^
       ↳ becomes the right-hand side 1; 4 gets an index downgrade

λλλ2(13) // aww yiss

The crucial difference can be seen below:

(λλλ1((λλλλ1(4321))2)((λλλλ2(4321))1))(λλλ3)(λλλ13)(λλ1) // normalized tuple
 ^                                    ^^^^^^ - is dropped

(λλ(λλλ132)((λλλλ1(4321))2)((λλλλ2(4321))1))(λλλ3)(λλλ13)(λλ1) // unnormalized tuple
 ^                       ^                  ^^^^^^
                         ↳ becomes λλλ3

If A or B contains a normalized tuple, the zero from the Z tuple is dropped and out of the game, so the crucial step π² ((λnm.[↑1 n, ↑0 m]) |0| |1|) => π² [↑1 |0|, ↑0 |1|] can't fully happen.

Below is the difference between the erroneous (for pure β-reduction evaluators) and correct full (including Z, A, B) definition of succ:

λ PI22 (1 [ZERO, ONE] (λ 1 (λ λ        [SHL0 2,  SHL1 2])) (λ 1 (λ λ        [SHL1 2,  SHL0 1]))) // wrong
λ PI22 (1 [ZERO, ONE] (λ 1 (λ λ TUPLE2 (SHL0 2) (SHL1 2))) (λ 1 (λ λ TUPLE2 (SHL1 2) (SHL0 1)))) // right
    where TUPLE2 ≡ λ λ λ 1 3 2
like image 33
ljedrz Avatar answered Oct 30 '25 10:10

ljedrz