This page here: https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/storage/heap-objects suggests that every object in GHC has a header.
So lets say I have:
data X = X1 Int# | X2 Int# Int#
My understanding is that adding X to a data type effectively takes three words in the case of X1, and four words in the case of X2 (assuming it's evaluated). One for the pointer to either X1 or X2 (which will be tagged with what it is), the header of X1 or X2, and then one or two words.
What is the purpose of the header? It's my understanding that Haskell erases all types at runtime and doesn't do runtime reflection (unless one opts into things like Typeable).
If a function is polymorphic, say f :: a -> a, then f can't actually do anything with a other than copy it around. It can't actually inspect a, so if it needs to copy it around it should just be able to copy the pointer to it.
Now if we've got g :: Num a => a -> a, then we've got a type class dictionary that knows, at compile time, the layout of a. For example, if we do g (42 :: Int), then we know at compile time to make a Num Int dictionary and pass it to g along with 42 :: Int. That Num Int dictionary knows the layout of Int, so again, it won't need to inspect it's header for that information at runtime.
So what is the purpose of the constructor header? How is it used and in what circumstances is it necessary and actually inspected by real code at runtime?
Edit
@chi and @William below has pointed out that if a type has multiple constructors, then a header will be required to differentiate them at runtime. This is a fair point, but then poses the questions:
Most of the key points have been covered in the comments, but the GHC runtime uses constructor headers to support both lazy evaluation (without duplicate evaluation of shared objects) and to support garbage collection.
The lazy evaluation is the harder one to understand, so let's start with that...
Say we have:
data X = X1 Int# | X2 Int# Int#
and suppose we have a function:
foo :: X -> Int
foo (X1 a) = I# a
foo (X2 a b) = I# (a #+ b)
The code for foo needs to handle tagged pointers for objects already evaluated to WHNF, but it also needs to handle thunks. The way GHC handles thunks is by laying them out in memory as:
x_closure:
.quad x_info ; pointer to code to evaluate the thunk
.quad ... ; additional payload, if any, for the code at x_info
and generating code for foo something like:
foo(x_ptr):=
if (*x_ptr & 7) == 0 then x_ptr := (*x_ptr)(x_ptr)
if (*x_ptr & 7 == 1) then ...handle X1...
elif (*x_ptr & 7 == 2) then ...handle X2...
Here, x_ptr := (*x_ptr)(x_ptr) runs the code pointed to by the first word of the thunk object (x_info, in the example above), passing it the pointer as an input argument and updating the pointer based on the returned result. The argument and result conventions are less important than the key fact that, to evaluate a thunk, GHC calls code at the address given by the first word of the thunk.
Now, this foo works fine for fully evaluated X objects, regardless of whether or not they are laid out with a constructor header:
; actual layout used by GHC
x1_closure:
.quad X1_con_info
.quad 42
or without:
; optimized layout proposed by @Clinton
x1_closure:
.quad 42
provided that all pointers are always properly tagged.
The problem is the one already pointed out by @chi in a comment. If we have an object that started life as a untagged pointer to a thunk, when it eventually gets evaluated and the pointer is "upgraded" to a tagged pointer, there may be many copies of the original untagged pointer floating around in various objects. To avoid evaluating thunks multiple times, GHC evaluates the thunk in place, so all those untagged pointers are pointing at the right block of memory, but they are still untagged, so if someone calls foo on such a pointer, it will try to jump to the first word of the "thunk", which is a disaster if the object has been upgraded in place with an optimized layout:
; optimized layout
x1_closure:
.quad 42
On the other hand, it all works seamlessly if the GHC layout is used:
; actual layout used by GHC
x1_closure:
.quad X1_con_info
.quad 42
and the code blocks associated with X1_con_info and X2_con_info just temporarily add the tag to the untagged pointer:
X1_con_info(x_ptr) :=
return(x_ptr+1)
X2_con_info(x_ptr) :=
return(x_ptr+2)
to allow foo to run successfully. Note that this doesn't actually tag the source copy of this untagged pointer, but the garbage collector can do that later -- for each untagged pointer it encounters while traversing objects in memory, it can check the header pointer (e.g., X1_con_info) and check the associated info block, which will show whether or not that header pointer references a constructor, and -- if so -- its appropriate tag.
Now, if you look very carefully at what I just described, you'll see that it would be possible to identify objects that started life in WHNF and have never been referenced by an untagged pointer. If you arranged for all the pointers to point at the payload, with the header coming the word before the payload, I guess you could optimize away headers for this subset of objects, secure in the knowledge that there are no untagged pointers anywhere that could result in an attempt to jump to a non-existent header address.
That would work, I think, but in realistic Haskell code, there aren't many objects that start life in WHNF, so it wouldn't be much of a savings. Also, it would cause problems with garbage collection...
Which brings us to... erm... garbage collection. Sure "types are erased", but not as far as the garbage collector is concerned. It needs to traverse objects without knowledge of the types involved, identifying which fields are pointers and which are, say, unboxed integers. This is easy if every object starts with a header:
; actual layout used by GHC
x1_closure:
.quad X1_con_info
.quad 42
since the GC can consult the block of memory just before the code at X1_con_info (i.e., the actual info block) to get a total field count and identification of pointers and non-pointers.
Without these headers, the only way to accomplish garbage collection would be to actually generate a custom garbage collector with erased types, basically a recursive collection of garbage collection "functions", one function for every type in the application.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With