The Haskell language specification states that it is a non-strict language, but nothing about the evaluation strategy (like when and how an expression is evaluated, and to what level). It does mention the word "evaluate" several times when talking about pattern matching.
I have read a wonderful tutorial about lazy evaluation and weak head normal form, but it is just an implementation strategy of some compiler, which I should not depend on when writing codes.
I come from a strict language background and I just don't feel right if I don't understand how my codes are executed. I wonder why the language specification does not define the evaluation strategy.
I hope someone can enlighten me. Thanks!
I would argue that trying to care about evaluation order is counter productive in Haskell. Not only is the language designed to make evaluation order irrelevant but the evaluation order can jump all over the place in strange and confusing ways. Additionally, the implementation has substantial freedom to execute things differently[1] or to vastly restructure your program[2] if the end result is still the same so different parts of the program might be evaluated in different ways.
The only real restriction is what evaluation strategies you can't use. For example, you can't always use strict evaluation because that would cause valid programs to crash or enter infinite loops.
const 17 undefined
take 10 (repeat 17)
That said, if you really care, one valid strategy you could possibly use to implement all of Haskell is lazy evaluation with thunks. Each value is represented in a box that either contains the value or a thunk subroutine that can be used to compute the value when you finally need to use it.
So when you write
let xs = 1 : []
You would be kind of doing this:
x --> {thunk}
If you never inspect the contents of x then the thunk stays unevaluated. However, if you ever do some pattern matching on the thunk then you need to evaluate the thunk to see what branch you take:
case xs of
  [] -> ...
  (y:ys) -> ...
Now, x's thunk is evaluated and the resulting value is stored in the box in case you ever need it. This is to avoid needing to recompute the thunk. Note that in the following diagram I'm using Cons to stand for the : list constructor
x ---> {Cons {thunk} {thunk}}
               ^       ^
               |       |
               y       ys
Of course, just the presence of the pattern math isn't enough you need first be forced to evaluate that pattern match in the first place. Ultimately, this boils down to your main function needing to print a value or something like that.
Another thing to point out is that we didn't immediately evaluate the contents of the Cons constructor when we first evaluate it. You can check this by running a program that doesn't use the contents of the list:
length [undefined, undefined]
Of course, when we actually use y for something then its corresponding thunk gets evaluated.
Additionally, you can use bang patterns to mark constructor fields as strict so they get immediately evaluated when the constructor gets evaluated (I don't remember if you need to turn an extension for this though)
data LazyBox   = LazyBox Int
data StrictBox = StrictBox !Int
case (LazyBox undefined) of (LazyBox _) -> 17   -- Gives 17
case (StrictBox undefined) of (StrictBox _) -> 17 -- *** Exception: Prelude.undefined
[1]: One important optimization that GHC does is using strict evaluation in the sections that the strictness analyzers determines to be strict.
[2]: One of the most radical examples would be deforestation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With