Let's say we have a type called d:
type d = D of int * int
And we want to do some pattern matching over it. Is it better to do it this way:
let dcmp = function
| D (x, y) when x > y -> 1
| D (x, y) when x < y -> -1
| _ -> 0
or
let dcmp = function
| D (x, y) ->
if x > y then 1 else if x < y then -1 else 0
In general, is it better to match patterns with many "when" cases, or to match one pattern and then put an "if-then-else" in it?
And where can I get more information about such matters, like good practices in OCaml, syntactic sugar and such?
Both approaches have their pros and cons, so each should be used according to the context.
A when clause is easier to understand than an if expression because it has only one branch, so you can digest one branch at a time. This comes at a price: when we analyze a clause in order to understand its path condition, we have to analyze all the branches before it (and negate them). For example, compare your variant with the following definition, which is equivalent:
let dcmp = function
| D (x, y) when x > y -> 1
| D (x, y) when x = y -> 0
| _ -> -1
Of course, the same is true for the if/then/else construct; it is just harder to accidentally rearrange branches (e.g., during refactoring) in an if/then/else expression and completely change its logic.
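As a quick sanity check of the equivalence claimed above, the two guarded definitions can be compared on a handful of inputs. This is only a sketch: the names dcmp_lt, dcmp_eq, samples and agree are made up for illustration, and a few samples are of course not a proof.

```ocaml
type d = D of int * int

(* Variant from the question: the catch-all clause means "x = y". *)
let dcmp_lt = function
  | D (x, y) when x > y -> 1
  | D (x, y) when x < y -> -1
  | _ -> 0

(* Variant from the answer: the catch-all clause means "x < y", i.e. the
   negation of both guards that precede it. *)
let dcmp_eq = function
  | D (x, y) when x > y -> 1
  | D (x, y) when x = y -> 0
  | _ -> -1

(* Hypothetical sample inputs covering all three outcomes. *)
let samples = [ D (0, 0); D (1, 2); D (2, 1); D (-3, -3); D (min_int, max_int) ]

(* True when both variants return the same result on every sample. *)
let agree = List.for_all (fun d -> dcmp_lt d = dcmp_eq d) samples
```

To see what the last clause of either variant means, you have to read the guards above it, which is the cost described in the text.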
In addition, when guards may prevent the compiler from performing decision tree optimizations (1) and confuse (2) the refutation mechanism.
Given this, the only advantage of using when instead of if in this particular example is that the when syntax looks more appealing: it is neatly lined up, and it is easier for the human brain to find the conditions and their corresponding values, i.e., it looks more like a truth table. However, if we write
let dcmp (D (x,y)) =
if x = y then 0 else
if x > y then 1 else -1
we can achieve the same level of readability.
To summarize, it is better to reserve when guards for cases where it is impossible or nearly impossible to express the same code with if/then/else. To improve readability, it is better to factor your logic into helper functions with readable names. For example, with dcmp the best solution is to use neither if nor when, e.g.,
let dcmp (D (x,y)) = compare x y
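One situation where a guard is hard to replace can be sketched as follows, using a made-up function classify over int lists (not part of the original question). When the guard fails, matching falls through to the remaining clauses; an if/then/else inside the first clause could only imitate that by repeating those clauses in its else branch.

```ocaml
(* Hypothetical example: describe an int list. *)
let classify = function
  (* If the guard is false, matching continues with the clauses below,
     so non-negative heads are still split into singleton and longer. *)
  | x :: _ when x < 0 -> "starts negative"
  | [ _ ] -> "singleton"
  | _ :: _ :: _ -> "longer"
  | [] -> "empty"
```

For example, classify [-1; 2] gives "starts negative", while classify [3] falls through the guard and gives "singleton".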
1) In this particular case the compiler will generate the same code for when and if/then/else. But in more general cases, guards may prevent the match compiler from generating efficient code, especially when branches are disjoint. In our case, the compiler just noticed that we're repeating the same pattern, coalesced the clauses into a single branch, and turned them back into an if/then/else expression. E.g., here is the cmm output of the function with the when guards,
(if (> x y) 3 (if (< x y) -1 1))
which is exactly the same code as generated by the if/then/else version of the dcmp function.
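If the constants 3, -1 and 1 in the cmm output look odd, that is because cmm works on OCaml's tagged representation of integers, where an int n is stored as 2n + 1. A minimal sketch reproduces the three constants (the helper name tagged is hypothetical, not a compiler function):

```ocaml
(* Tagged form of an OCaml int: shift left by one bit and set the low bit,
   which equals 2 * n + 1. *)
let tagged n = (n lsl 1) lor 1

(* The three results of dcmp, 1, -1 and 0, in the order in which they
   appear in the cmm expression. *)
let constants = List.map tagged [ 1; -1; 0 ]
```

So the source-level results 1, -1 and 0 show up in cmm as 3, -1 and 1 respectively.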
2) Not to the point where it will fail to notice a missing branch, of course, but to the point where it will report missing branches less precisely or ask you to add unnecessary branches.
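A small sketch of the "unnecessary branches" case, using a hypothetical variant named dcmp_guarded: the three guards below cover every pair of ints, but the exhaustiveness check does not reason about guard conditions, so without the final clause the compiler reports the match as non-exhaustive.

```ocaml
type d = D of int * int

let dcmp_guarded = function
  | D (x, y) when x > y -> 1
  | D (x, y) when x < y -> -1
  | D (x, y) when x = y -> 0
  (* Unreachable for ints, added only to satisfy the exhaustiveness check. *)
  | D _ -> assert false
```

The if/then/else version needs no such extra clause, because its single pattern D (x, y) is trivially exhaustive.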