
Copying GHC ByteArray# to Ptr

Tags: haskell, ghc

I am trying to write the following function:

memcpyByteArrayToPtr :: 
     ByteArray# -- ^ source
  -> Int -- ^ start
  -> Int -- ^ length
  -> Ptr a -- ^ destination
  -> IO ()

The behavior should be to internally use memcpy to copy the contents of a ByteArray# to the Ptr. There are two techniques I have seen for doing something like this, but it's difficult for me to reason about their safety.

The first is found in the memory package. There is an auxiliary function withPtr defined as:

data Bytes = Bytes (MutableByteArray# RealWorld)

withPtr :: Bytes -> (Ptr p -> IO a) -> IO a
withPtr b@(Bytes mba) f = do
    a <- f (Ptr (byteArrayContents# (unsafeCoerce# mba)))
    touchBytes b
    return a

But, I'm pretty sure that this is only safe because the only way to construct Bytes is by using a smart constructor that calls newAlignedPinnedByteArray#. An answer given to a similar question and the docs for byteArrayContents# indicate that it is only safe when dealing with pinned ByteArray#s. In my situation, I'm dealing with the ByteArray#s that the text library uses internally, and they are not pinned, so I believe this would be unsafe.
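To make the pinned/unpinned distinction concrete, here is a minimal sketch that allocates a small array both ways and asks the runtime whether it is pinned. It assumes GHC 8.2 or later, where `isMutableByteArrayPinned#` is exported from `GHC.Exts`; the helper name `allocIsPinned` is made up for illustration.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- | Hypothetical helper: allocate @n@ bytes, pinned or unpinned, and report
-- whether the runtime considers the resulting array pinned.
-- Note: large arrays may be reported as pinned even when allocated with
-- newByteArray#, so keep @n@ small when experimenting.
allocIsPinned :: Bool -> Int -> IO Bool
allocIsPinned wantPinned (I# n) = IO $ \s0 ->
  case alloc s0 of
    (# s1, mba #) -> (# s1, isTrue# (isMutableByteArrayPinned# mba) #)
  where
    alloc s
      | wantPinned = newPinnedByteArray# n s
      | otherwise  = newByteArray# n s
```

Calling `allocIsPinned True 16` and `allocIsPinned False 16` should show the difference; only in the pinned case is the address from byteArrayContents# guaranteed to stay put.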

The second possibility I've stumbled across is in text itself. At the bottom of the Data.Text.Array source code, there is an FFI function memcpyI:

foreign import ccall unsafe "_hs_text_memcpy" memcpyI
  :: MutableByteArray# s -> CSize -> ByteArray# -> CSize -> CSize -> IO ()

This is backed by the following C code:

void _hs_text_memcpy(void *dest, size_t doff, const void *src, size_t soff, size_t n)
{
  memcpy(dest + (doff<<1), src + (soff<<1), n<<1);
}

Because it's a part of text, I trust that this is safe. It looks dangerous because it's getting a memory location from an unpinned ByteArray#, the very thing that the byteArrayContents# documentation warns against. I suspect that it's OK because the FFI call is marked as unsafe, which I think prevents the GC from moving the ByteArray# during the FFI call.

That's the research I've done so far. My best guess is that I can just copy what's been done in text. The big difference would be that, instead of passing in MutableByteArray# and ByteArray# as the two pointers, I would be passing in ByteArray# and Ptr a (or maybe Addr#, I'm not sure which of those you typically use with the FFI).
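For illustration, a minimal sketch of that FFI approach could look like the following. It binds C's memcpy directly with an unsafe import, relying on UnliftedFFITypes to pass the ByteArray# as a pointer to its payload. Because there is no C wrapper to add an offset, this version only copies a prefix starting at byte 0; a start offset would need a small C shim like the one in text. The names `c_memcpy`, `copyPrefixToPtr`, `BA`, `sample`, and `demo` are made up for this sketch.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples, UnliftedFFITypes #-}

import Data.Word (Word8)
import Foreign.C.Types (CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr)
import GHC.Exts
import GHC.IO (IO (..))

-- The import must be "unsafe": the array is unpinned, so the call must not
-- give the garbage collector a chance to run while C holds the address.
foreign import ccall unsafe "string.h memcpy"
  c_memcpy :: Ptr a -> ByteArray# -> CSize -> IO (Ptr a)

-- | Hypothetical helper: copy the first @n@ bytes of the array to @dst@.
-- No bounds checking is performed.
copyPrefixToPtr :: ByteArray# -> Int -> Ptr a -> IO ()
copyPrefixToPtr src n dst = () <$ c_memcpy dst src (fromIntegral n)

-- Boxed wrapper so the unlifted array can be returned from IO.
data BA = BA ByteArray#

-- | Build an unpinned 8-byte array: four bytes of 1 followed by four bytes of 2.
sample :: IO BA
sample = IO $ \s0 -> case newByteArray# 8# s0 of
  (# s1, mba #) -> case setByteArray# mba 0# 4# 1# s1 of
    s2 -> case setByteArray# mba 4# 4# 2# s2 of
      s3 -> case unsafeFreezeByteArray# mba s3 of
        (# s4, ba #) -> (# s4, BA ba #)

-- | Copy all eight bytes into a temporary buffer and read them back.
demo :: IO [Word8]
demo = do
  BA ba <- sample
  allocaBytes 8 $ \p -> do
    copyPrefixToPtr ba 8 p
    peekArray 8 (p :: Ptr Word8)
```

The destination is an ordinary Ptr here, since that is what allocaBytes and friends hand out; Addr# is just the unboxed value inside a Ptr.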

Is what I have suggested safe? Is there a better way that would allow me to avoid using the ffi? Is there something in base that does this? Feel free to correct any incorrect assumptions I've made, and thanks for any suggestions or guidance.

Andrew Thaddeus Martin asked Sep 06 '25 03:09



1 Answer

copyByteArrayToAddr# :: ByteArray# -> Int# -> Addr# -> Int# -> State# s -> State# s

looks like the right primop. You just need to be sure not to try to copy it into memory it occupies. So you should probably be safe with

copyByteArrayToPtr :: ByteArray# -> Int -> Ptr a -> Int -> ST s ()
copyByteArrayToPtr ba (I# x) (Ptr p) (I# y) = ST $ \ s ->
  (# copyByteArrayToAddr# ba x p y s, () #)

Unfortunately, the documentation gives me no clue what each Int# is supposed to mean, but I imagine you can figure that out through trial and segfault.
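As a starting point for that experiment, here is a self-contained sketch with the argument order from the question, in IO rather than ST. It assumes, based on my reading of the GHC.Exts documentation rather than anything confirmed above, that the first Int# is the source offset in bytes and the second is the number of bytes to copy. The helpers `BA`, `sample`, and `demo` are made up for the example.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import GHC.Exts
import GHC.IO (IO (..))

-- | Copy @len@ bytes, starting at byte offset @off@ of the source array,
-- to the destination pointer. Assumption: offset and length are in bytes.
-- Nothing is bounds-checked, and the destination must not overlap the source.
memcpyByteArrayToPtr :: ByteArray# -> Int -> Int -> Ptr a -> IO ()
memcpyByteArrayToPtr src (I# off) (I# len) (Ptr dst) = IO $ \s ->
  (# copyByteArrayToAddr# src off dst len s, () #)

-- Boxed wrapper so the unlifted array can be returned from IO.
data BA = BA ByteArray#

-- | Build an unpinned 8-byte array: four bytes of 1 followed by four bytes of 2.
sample :: IO BA
sample = IO $ \s0 -> case newByteArray# 8# s0 of
  (# s1, mba #) -> case setByteArray# mba 0# 4# 1# s1 of
    s2 -> case setByteArray# mba 4# 4# 2# s2 of
      s3 -> case unsafeFreezeByteArray# mba s3 of
        (# s4, ba #) -> (# s4, BA ba #)

-- | Copy the four bytes at offsets 2..5 into a temporary buffer and read them back.
demo :: IO [Word8]
demo = do
  BA ba <- sample
  allocaBytes 4 $ \p -> do
    memcpyByteArrayToPtr ba 2 4 p
    peekArray 4 (p :: Ptr Word8)
```

If the assumption about the arguments holds, `demo` straddles the boundary between the 1s and the 2s, which makes a swapped offset and length easy to spot.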

dfeuer answered Sep 07 '25 20:09
