[Haskell-cafe] Storables and Ptrs

Mon Dec 6 19:58:54 CET 2010

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/6/10 13:22 , Antoine Latter wrote:
> On Mon, Dec 6, 2010 at 12:03 PM, Tyler Pirtle <teeler at gmail.com> wrote:
>> On Sun, Dec 5, 2010 at 9:46 PM, Antoine Latter <aslatter at gmail.com> wrote:
>>> On Sun, Dec 5, 2010 at 10:45 PM, Tyler Pirtle <teeler at gmail.com> wrote:
>>>> Hi cafe,
>>>>
>>>> I'm just getting into Foreign.Storable and friends and I'm confused
>>>> about the Storable class. For GHC, there are instances of Storable for
>>>> all kinds of basic types (Bool, Int, etc.) - but I can't find the
>>>> actual declaration of those instances.
>>>>
>>>> I'm confused that it seems that all Storable instances operate on a
>>>> Ptr, yet none of these types allow access to an underlying Ptr. I
>>>> noticed that it's possible via Foreign.Marshal.Utils to call 'new' and
>>>> get a datatype wrapped by a Ptr, but this isn't memory managed - I'd
>>>> have to explicitly free it? Is that my only choice?
>>>
>>> The Storable class defines how to copy a particular Haskell type to or
>>> from a raw memory buffer - specifically represented by the Ptr type.
>>> It is most commonly used when interacting with non-Haskell (or
>>> 'Foreign') code, which is why a lot of the tools look like they
>>> require manual memory management (because foreign-owned resources must
>>> often be managed separately anyway).
>>>
>>> Not all of the means of creating a Ptr require manual memory
>>> management - the 'alloca' family of Haskell functions allocates a
>>> buffer and then frees it automatically once the passed-in callback
>>> returns (although 'continuation' or 'action' would be the more
>>> Haskell-y way to refer to the idea):
>>>
>>> alloca :: Storable a => (Ptr a -> IO b) -> IO b
>>>
>>> This can be used to call into C code expecting pointer input or output
>>> types to great effect:
>>>
>>> wrapperAroundForeignCode :: InputType -> IO OutputType
>>> wrapperAroundForeignCode input =
>>>  alloca $ \inPtr ->
>>>  alloca $ \outPtr -> do
>>>    poke inPtr input
>>>    c_call inPtr outPtr
>>>    peek outPtr
>>>
>>> The functions 'peek' and 'poke' are from the Storable class, and I
>>> used the 'alloca' function to allocate temporary storage for the
>>> pointers I pass into C-land.
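For anyone who wants to try the pattern above without a C library at hand, here is a minimal sketch. The names doubleInto and wrapperAroundDouble are hypothetical; doubleInto is a pure-Haskell stand-in for c_call, assumed to read one CInt and write one CInt. A real binding would be a 'foreign import ccall' declaration instead.

```haskell
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for a foreign function: reads the value behind
-- inPtr and writes twice that value behind outPtr.
doubleInto :: Ptr CInt -> Ptr CInt -> IO ()
doubleInto inPtr outPtr = do
  x <- peek inPtr
  poke outPtr (2 * x)

-- Both buffers are allocated by alloca and exist only while the
-- corresponding lambda is running; no explicit free is needed.
wrapperAroundDouble :: CInt -> IO CInt
wrapperAroundDouble input =
  alloca $ \inPtr ->
  alloca $ \outPtr -> do
    poke inPtr input          -- marshal the Haskell value into the buffer
    doubleInto inPtr outPtr   -- the "foreign" call works on raw pointers
    peek outPtr               -- unmarshal the result back into Haskell
```

Loading this in GHCi and evaluating wrapperAroundDouble with some CInt argument exercises the whole round trip. The pointers must not escape the lambdas, since the memory is released as soon as they return.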
>>>
>>> Is there a particular problem you're trying to solve? We might be able
>>> to offer more specific advice. The Storable and Foreign operations may
>>> not even be the best solution to what you're trying to do.
>>>
>>
>>
>> Hey Antoine,
>>
>> Thanks for the clarity, it's very helpful. There is in fact a particular
>> problem I'm trying to solve - persisting data structures. I'm a huge
>> fan of Data.Vector.Storable.MMap, and I'm interested in other things
>> like it - but I realize that the whole thing is built on and around
>> Storables (read == peek, write == poke, etc.), because I'm trying to
>> write the raw structures themselves to disk (via mmap).
>>
>> I am aware of Data.Binary, but I feel that this kind of serialization
>> for the application I'm building would be too cumbersome considering the
>> number of objects I'm dealing with (on the order of hundreds-of-millions
>> to billions), especially considering that the application I'm building
>> has some very nice pure-ish semantics (an append-only list). I'd
>> like the application to be able to simply load a file and interact with
>> that memory - not have to load the file and then deserialize everything.
>>
>> If you have any suggestions here, or if anyone has any general feelings
>> about the design or implementation of Data.Vector.Storable.MMap I'd be
>> very interested in hearing them. Or about any ideas involving persisting
>> native data structures in an append-only fashion, too. ;)
>>
> 
> If you took the approach of Data.Vector.Storable.MMap, every time you
> read an element out of the array you would be un-marshalling the
> object from a pointer into a Haskell type - in effect, making a copy.
> There are probably ways to do this for ByteStrings to make this copy
> free, but that's about it.
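
To make the "in effect, making a copy" point concrete, here is a rough sketch. The Sample record and its 8-byte layout (two Int32 fields) are assumptions made up for illustration, and copyDemo uses a temporary allocaArray buffer in place of a real mmap'd region.

```haskell
import Data.Int (Int32)
import Foreign.Marshal.Array (allocaArray)
import Foreign.Storable

-- Hypothetical fixed-size record; the layout below is an assumption.
data Sample = Sample { sampleId :: Int32, sampleValue :: Int32 }
  deriving (Eq, Show)

instance Storable Sample where
  sizeOf _    = 8
  alignment _ = 4
  -- peek builds a fresh Haskell value from the bytes behind the pointer
  peek p = do
    i <- peekByteOff p 0
    v <- peekByteOff p 4
    pure (Sample i v)
  -- poke writes the fields back out at the same offsets
  poke p (Sample i v) = do
    pokeByteOff p 0 i
    pokeByteOff p 4 v

-- Read an element, overwrite the buffer, read again. The first value
-- read is unaffected by the later write because it is a copy.
copyDemo :: IO (Sample, Sample)
copyDemo =
  allocaArray 2 $ \buf -> do
    pokeElemOff buf 0 (Sample 1 10)
    pokeElemOff buf 1 (Sample 2 20)
    old <- peekElemOff buf 1
    pokeElemOff buf 1 (Sample 2 99)
    new <- peekElemOff buf 1
    pure (old, new)
```

With this instance, every element access through a storable vector goes through peek, so each read pays for constructing a Sample on the Haskell heap. For small fixed-size records that cost may well be acceptable; it is worth measuring before ruling the approach out.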

IIRC bytestring-mmap uses pinned bytestrings; might be easier/faster to use
that directly if the vector package is troublesome.  You'd want to use the
bytestring internals module for the equivalent of peek/poke.
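
A minimal sketch of what peeking into a ByteString's buffer can look like. Note that it uses unsafeUseAsCStringLen from Data.ByteString.Unsafe rather than the internals module mentioned above, purely because that is the smaller API surface; peekByteAt is a hypothetical helper name, and the ByteString here is an ordinary one, not one obtained from bytestring-mmap.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Data.Word (Word8)
import Foreign.Storable (peekByteOff)

-- Read the byte at offset i straight from the ByteString's underlying
-- buffer, without copying the string. Returns Nothing when i is out of
-- range, so the raw pointer is never read past the end.
peekByteAt :: B.ByteString -> Int -> IO (Maybe Word8)
peekByteAt bs i =
  unsafeUseAsCStringLen bs $ \(ptr, len) ->
    if i < 0 || i >= len
      then pure Nothing
      else Just <$> peekByteOff ptr i
```

The same idea extends to any Storable type by peeking at a multiple of its sizeOf, provided the offset respects that type's alignment. Writing through the pointer (the poke side) is a different matter, since it would break the ByteString's immutability guarantee for anything else holding a reference to it.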

- -- 
brandon s. allbery     [linux,solaris,freebsd,perl]      allbery at kf8nh.com
system administrator  [openafs,heimdal,too many hats]  allbery at ece.cmu.edu
electrical and computer engineering, carnegie mellon university      KF8NH
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkz9Mm0ACgkQIn7hlCsL25UgKgCgqV/BIXRDm5BVEPBzNllpVVD9
QsYAoJMU7kvHWxoAmb2eYV9b5tll9U0d
=p1GN
-----END PGP SIGNATURE-----