It is actually just a curiosity, but here's how I came up with it. I could have wondered how to write a sensible Enum instance for a type larger than Int, say (Int, Int). But actually I was wondering: why does Enum's minimal definition assume that the possible values of Int are enough to enumerate the type for which I want to write the Enum instance?
At fromEnum I do read that

Convert to an Int. It is implementation-dependent what fromEnum returns when applied to a value that is too large to fit in an Int.
which I interpret as saying that I can indeed define an Enum instance for a type that doesn't fit in an Int. I just have to accept that I'll have to return nonsense from fromEnum for some values of my type.
But how could succ and pred possibly work with such a broken type, given their default implementations use toEnum and fromEnum?
succ = toEnum . (+ 1) . fromEnum
pred = toEnum . (subtract 1) . fromEnum
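To see concretely why these defaults can't be trusted beyond Int, here is a minimal sketch (the Big newtype is hypothetical, not from the question): the default succ would round-trip through Int and mangle large values, so a correct instance has to override it.

```haskell
newtype Big = Big Integer deriving (Eq, Show)

instance Enum Big where
  -- Nonsense for values outside the Int range, as the documentation
  -- for fromEnum explicitly permits.
  fromEnum (Big n) = fromInteger n
  toEnum n         = Big (toInteger n)
  -- Overridden: the default `toEnum . (+ 1) . fromEnum` would first
  -- truncate n to an Int, losing information for large values.
  succ (Big n) = Big (n + 1)
  pred (Big n) = Big (n - 1)
```

With these overrides, succ (Big (2 ^ 70)) is Big (2 ^ 70 + 1), whereas the default definition would have truncated 2 ^ 70 to an Int first.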
Or maybe I should define succ and pred manually so that they don't make use of toEnum/fromEnum? But if that's the case, then why aren't succ and pred a minimal definition?
Yes, I have read the comment from dfeuer here, and the answer, but the question remains as to why I need to define fromEnum/toEnum.
I was using Word32, but then it took me quite a while to understand the reason for a bug in my program. Essentially, a good chunk of my logic was assuming that the value of a Word32 is not 0.

Therefore, I thought of using a more appropriate data type implementing appropriate classes. I came across Integer.Positive, but that's only BoundedBelow, not Bounded, whereas Word32 is Bounded, so I decided to write a data type myself and came up with a module like this:
{-# LANGUAGE GeneralizedNewtypeDeriving #-} -- needed to derive Enum for a newtype
module Positive32 (Positive32, positive32, getNum) where

import Data.Word (Word32)

newtype Positive32 = Positive32 { getNum :: Word32 } deriving (Enum, Eq, Ord, Show)

instance Bounded Positive32 where
  minBound = Positive32 1
  maxBound = Positive32 (maxBound :: Word32)

positive32 :: Word32 -> Positive32
positive32 0 = error "Attempt to create `Positive32` from `0`"
positive32 i = Positive32 i
so that
λ> minBound :: Positive32
Positive32 {getNum = 1}
λ> positive32 2
Positive32 {getNum = 2}
λ> positive32 0
Positive32 {getNum = *** Exception: Attempt to create `Positive32` from `0`
CallStack (from HasCallStack):
error, called at ......
However, I soon realized that, even though I have not exported Positive32's constructor, I can still create Positive32 0 via its derived Enum instance:
λ> pred $ minBound :: Positive32
Positive32 {getNum = 0}
Hence I tried to write the Enum instance myself, thinking it would have a minimal definition of succ :: a -> a and pred :: a -> a, but in reality it requires toEnum :: Int -> a and fromEnum :: a -> Int.
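For illustration, here is one way such a hand-written instance could look. This is a sketch of my own (not code from any library), and it overrides succ and pred directly so the defaults never route through toEnum/fromEnum at the boundaries:

```haskell
import Data.Word (Word32)

newtype Positive32 = Positive32 { getNum :: Word32 } deriving (Eq, Ord, Show)

instance Enum Positive32 where
  fromEnum (Positive32 w) = fromIntegral w
  toEnum n
    | n >= 1 && toInteger n <= toInteger (maxBound :: Word32) =
        Positive32 (fromIntegral n)
    | otherwise = error "toEnum: out of range for Positive32"
  -- Overridden so the invariant (never 0) is preserved at the edges:
  succ (Positive32 w)
    | w == maxBound = error "succ: maxBound of Positive32"
    | otherwise     = Positive32 (w + 1)
  pred (Positive32 1) = error "pred: minBound of Positive32"
  pred (Positive32 w) = Positive32 (w - 1)
```

With this, pred (Positive32 1) raises an error instead of fabricating Positive32 0, though (as discussed below) the other range methods would still need attention.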
Now, in my specific use case, that poses no problem, because Word32 fits in Int, so I can't possibly overflow Int with Word32, but I'm still left puzzled with the question of why I can't have an Enum for types that have more possible values than Int.
I will wax a little philosophical here, as you're asking a question about the motivations of people writing a spec about three decades ago, which, from what I know of human memory, may not even be accurately known to the people who were involved themselves.
But: I don't think, at the time Haskell was being defined, there was a really good picture of whether it would catch on at all or merely be a flash-in-the-pan academic curiosity. There are a number of things in the design that are, therefore, pragmatic decisions made for simplicity of the compiler writers and/or the folks writing essentially toy applications. Certainly nothing was being done at the scale of modern, industrial software engineering. I won't talk about all the decisions that I think differ between those two regimes, but certainly typeclass philosophy is a bit different.
At small scales, large type classes (i.e. classes that encompass multiple concepts and have many methods) are very convenient, in the sense of low boilerplate. Type signatures can be more compact; instance implementations have lighter overhead; more library functions in terms of those classes can be provided; functionality can be more easily shared between method implementations; and programmers don't need to keep track of as many concepts.
Of course, we now know there are also costs to this. When you have a wide variety of data types floating around in your application or in publicly available libraries, you find that the larger your typeclass is, the fewer types can instantiate it. At small scales, this matters less, because you can simply export the functionality your type does support and leave it at that. But at large scales, modularity demands that some behaviors be typeclass-polymorphic, and so types that cannot implement those classes lose out on a lot of already-written functionality.
We've seen this play out repeatedly over the history of Haskell. The introduction of Applicative as a stepping stone between Functor and Monad is one example; splitting Monoid out to have a Semigroup superclass is another; losing the Eq superclass on Num also goes on the list. More generally, I think the closer to the core libraries you are, the more important it is in today's designs that typeclasses be split into as many orthogonal pieces as you can. (Of course, you have to know what orthogonal pieces are useful! It's not at all clear that Applicative could reasonably have been identified as a useful building block back when making IO a monad was a fresh, exciting insight.)
And so we come to Enum. In practice, Enum serves a variety of roles, including as a de-facto part of the FFI, where toEnum and fromEnum are often used for marshalling to C; iterating over numeric ranges; listing the inhabitants of your favorite type; serving as the target of multiple forms of syntactic sugar; and likely more besides. With a modern design, these concerns would likely each be split out into its own class with its own concerns and requirements. But at a small scale, most types that support one of those support essentially all of them, and so it's convenient to lump them together and share implementation work and programmer mindspace.
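As a quick illustration of the FFI-marshalling role (a sketch with a made-up Level type, not a real binding): the derived fromEnum yields the constructor's tag, which lines up with a C enum that uses default numbering.

```haskell
import Foreign.C.Types (CInt)

-- Hypothetical enumeration mirroring `enum level { DEBUG, INFO, WARN, ERR };`
data Level = Debug | Info | Warn | Err
  deriving (Eq, Show, Enum)

levelToC :: Level -> CInt
levelToC = fromIntegral . fromEnum   -- Debug -> 0, Info -> 1, ...

levelFromC :: CInt -> Level
levelFromC = toEnum . fromIntegral   -- partial: errors on out-of-range tags
```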
And if you think Enum is bad, wait until you start thinking carefully about the numeric hierarchy...
A set of methods being listed as a minimal definition means that the other methods all have default definitions in terms of the minimal set. But the other methods are still part of the typeclass, not separate functions; you certainly can provide your own definitions. Indeed, it's reasonably common to see a redundant method provided as a typeclass method specifically on the grounds that a (non-minimal) instance may want to override it. For example, mconcat is documented as:
Fold a list using the monoid.

For most types, the default definition for mconcat will be used, but the function is included in the class definition so that an optimized version can be provided for specific types.
But another thing that can happen is that the default definitions are based on assumptions about the behaviour of the minimal set. If you implement the minimal methods in a way that violates those assumptions1, then the default definitions of the other methods will not work and you'll have to actually implement them yourself. A documented "minimal definition" for a class is not claiming that you will never need to implement the other methods.
(Ideally these assumptions would be clearly spelled out in the class's documentation; this is just one more way that Enum is not an ideal class.)
succ and pred can't be a minimal definition for Enum, because you can't implement toEnum and fromEnum in terms of just succ and pred (okay, you could try if you also had minBound and maxBound, but it would be awfully inefficient to figure out that fromEnum $ Positive32 4000000000 must be 4000000000 by counting how many times you had to apply succ to get there from minBound... and it would probably produce 3999999999, since your minBound is supposed to be Positive32 1). Whereas you can provide default definitions of succ and pred in terms of fromEnum and toEnum, if you make some assumptions.
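To make the inefficiency concrete, here is a sketch (my own hypothetical helper, not part of any library) of recovering fromEnum from succ alone by counting steps up from minBound. It is linear in the value's position, which is hopeless for something like Positive32 4000000000:

```haskell
-- Counts how many applications of the given successor function it takes
-- to reach x from minBound. Note this numbers minBound as 0, which is
-- exactly the off-by-one problem for a type whose minBound maps to 1.
fromEnumViaSucc :: (Bounded a, Eq a) => (a -> a) -> a -> Int
fromEnumViaSucc next x = go 0 minBound
  where
    go n y
      | y == x    = n
      | otherwise = go (n + 1) (next y)
```

For Char, whose minBound happens to map to 0, it agrees with fromEnum: fromEnumViaSucc succ 'a' is 97.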
1 You normally should not violate any typeclass laws in your instance, and for some classes what the default definitions assume about the minimal set is only that the typeclass laws will hold. In that case any lawful minimal instance will be fine. In other cases the default definitions assume more than the typeclass laws, so there are valid instances where the minimal set is not correct without also replacing the defaults for the other methods. Enum is a fairly lawless class (its docs don't specify any requirements at all unless the type is also Bounded, and even then they don't require a whole lot) and the default definitions make some quite strict assumptions, so it falls into the latter category.
In this case the assumption that Enum's default definitions rely on is essentially that if you convert to Int with fromEnum, do the equivalent operations using the Enum Int instance, and convert the results back with toEnum, you'll get the same thing you would have got if you implemented them more directly. It's not just succ and pred; for example, this is the default definition of enumFromThenTo (which is used for the [x1, x2 .. xn] syntax):
enumFromThenTo x1 x2 y = map toEnum [fromEnum x1, fromEnum x2 .. fromEnum y]
This assumption can seem convenient when it holds. It means that in [x1, x2 .. xn], the delta between x1 and x2 can be calculated using Int subtraction, which can then be repeatedly added to x1 with Int addition and compared to xn using Int ordering. Without those handy Int operations (if enumFromThenTo could only be implemented in terms of succ and pred), any delta range would essentially have to step through the entire type one step at a time instead of in jumps of the delta, which would be much less efficient.
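On Int itself the desugaring and the default trivially agree, and the delta is computed just once:

```haskell
-- [0, 5 .. 17] desugars to enumFromThenTo 0 5 17: the delta 5 - 0 is
-- computed once with Int subtraction, then added repeatedly until the
-- next element would pass the stop value 17.
demo :: [Int]
demo = [0, 5 .. 17]
```

which yields [0,5,10,15].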
But this assumption is really hard to strictly meet. Even something as simple as your Positive32 being a newtype where Positive32 0 is intended to be invalid breaks the default method assumptions. It even makes the defaults violate one of the few laws Enum has; pred minBound is required to be a runtime error, so you must override the default pred with one that checks that it isn't doing pred (Positive32 1). But then you'll also find that [Positive32 5, Positive32 4 ..] won't correctly stop at Positive32 1 (because it uses enumFromThenTo @Int, which has no idea about the minBound of your custom type). So you have to override that method too, and so on. (It also won't respect your maxBound on platforms where Int is larger than 32 bits, and when Int is only 32 bits it won't be able to properly enumerate ranges above the maxBound @Int.)
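A related failure is visible even with the question's newtype-derived instance (reproduced here as a sketch): a downward range sails right past Positive32 1 and produces the supposedly impossible Positive32 0, because the delegated Word32 enumeration only stops at Word32's own minBound.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.Word (Word32)

newtype Positive32 = Positive32 { getNum :: Word32 }
  deriving (Enum, Eq, Ord, Show)

-- Ends at Positive32 0, not Positive32 1: the derived instance knows
-- nothing about our intended lower bound.
demo :: [Positive32]
demo = [Positive32 5, Positive32 4 ..]
```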
In fact these defaults are pretty useless. They look like they're intended to simplify derived instances for enumeration types (whose constructors have no fields), since the compiler assigns them a nice contiguous range of Int where succ and pred can just be +1 and subtract 1. But of course these types are almost always Bounded and (much) smaller than Int, so the default enumFrom and similar wouldn't even work for them! And indeed, if you dump the derived instances for something simple like:
data Foo = A | B | C | D | E | F
  deriving (Eq, Show, Bounded, Enum)
You'll see that it doesn't make use of all the default method implementations. It generates constants like:
$maxtag_Foo_a1Fu :: GHC.Types.Int
$maxtag_Foo_a1Fu = GHC.Types.I# 5#
so that it can make use of them in generated non-default enumFrom and enumFromThen. It does use the default enumFromTo and enumFromThenTo, since the user-supplied stop value must be something that's in range, so there's no danger of stepping outside the valid range provided the Int values corresponding to the type are contiguous. If you use non-contiguous or out-of-order Ints you can get very bogus results, for example:
data Foo = A | B
  deriving (Eq, Show)

instance Enum Foo where
  fromEnum x = case x of
    A -> 1
    B -> 10
  toEnum x = case x of
    1 -> A
    10 -> B
    n -> error $ show n <> " is out of range"

main :: IO ()
main = do
  -- prints 10, without error
  print $ length [A .. B]
Essentially, just about the only type for which you could actually write a minimal instance defining only fromEnum and toEnum is... Int. Anything larger or in a different order will have range syntax that sometimes errors out or produces incorrect results. For anything smaller and Bounded, the defaults will break the laws. (Technically, if your type doesn't implement Bounded despite being smaller than Int, then it's "okay" that [x..] eventually produces an error when it steps out of range, since it's only required to stop cleanly if the type is Bounded... and you could even argue that all the dodgy behaviour is allowed if your type is not Bounded, since there are no laws at all for Enum then, but ugh.) Even Enum Word can't use the minimal instance, despite Word being exactly the same size as Int (it's isomorphic to Int, but the isomorphism isn't fromEnum/toEnum), because if it enumerated via Int it would have problems with the upper half of Word that is past maxBound @Int.
If your type is smaller than Int but maps to a contiguous range, then you can at least use some of the defaults, but you still need to implement most of the methods. If your type's mapping to Int is out of order or has holes, then you're completely out of luck.
But there is nothing stopping you from implementing more than just fromEnum and toEnum to make your type support range syntax properly. As we've seen, the compiler even needs to do this itself just to properly generate instances for simple enumeration types. If your type is larger than Int, then fromEnum will necessarily be partial (or fail to round-trip with toEnum); but if your type is smaller than Int, then toEnum is necessarily partial, so if that's acceptable then a partial fromEnum that you simply never use should be just as acceptable. Either way, the few typeclass laws Enum has don't forbid this situation, so you certainly can have an Enum instance for a type larger than Int.
To be opinionated: Enum is simply not a class designed to the standards we would use today (for the reasons that Daniel Wagner's answer discusses; it wasn't necessarily a bad choice in the historical context it was actually made in, but the result is bad today). I tend to think it is best approached purely as a weird form you need to fill out in order to get range syntax working, rather than as actually representing the abstract concept of enumerability.2
With correctly defined Enum instances, all the range syntax works well and predictably (well, apart from some dubious behaviour with floating-point numbers). Any behaviour that isn't observable through the range syntax I usually treat as a meaningless implementation detail that is accidentally exposed, which I try to avoid depending on. Thus I don't care whether fromEnum is partial because Int isn't big enough; converting to Int is just an implementation detail of not-insanely-inefficient range syntax. If converting to Int doesn't actually help you do that, just write the least-bad implementation of toEnum and fromEnum you can think of, and implement the other methods in terms of something else. The usual way to achieve "correctly defined" Enum instances is deriving Enum where that is applicable; if it isn't, you will almost always have to override more than just the claimed minimal methods, unfortunately.
2 If we were designing from scratch a class to support the range syntax today, I think we probably wouldn't expose toEnum and fromEnum at all; we'd instead try to specify the minimal requirements for computing and applying deltas, probably requiring an associated type family (or multi-parameter class) allowing the type used for the delta to be different in each instance. Such a class simply couldn't have been written when Enum was designed, as the compiler features did not exist.
And I think we would use a separate class for the bare minimum of enumerability, which is succ and pred (or even succ on its own), and find a way of avoiding them being partial. I can imagine types where you can always generate the next element but can't efficiently jump through a range in fixed increments, so perhaps if you just provide succ then [x..] would work, or even [x..y] if you also have Eq, but not [x, y .. z].
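A rough sketch of what that bare-minimum class might look like (all names invented here; Maybe is used to avoid the partiality of succ):

```haskell
-- A hypothetical single-method class: "give me the next element, if
-- there is one". Total, unlike Enum's succ.
class Step a where
  step :: a -> Maybe a

-- The analogue of [x ..] needs only Step:
stepFrom :: Step a => a -> [a]
stepFrom x = x : maybe [] stepFrom (step x)

-- The analogue of [x .. y] additionally needs Eq. It walks one element
-- at a time, and never terminates early if y is not reachable from x.
stepFromTo :: (Step a, Eq a) => a -> a -> [a]
stepFromTo x y
  | x == y    = [x]
  | otherwise = x : maybe [] (`stepFromTo` y) (step x)

instance Step Int where
  step n
    | n == maxBound = Nothing
    | otherwise     = Just (n + 1)
```

There is no way to support [x, y .. z] from this interface without stepping through every element, which is precisely why a delta-based class would be a separate concern.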
I haven't really thought it through very hard, but I doubt it would end up looking very much like Enum.