Let's get this finished

Sat Jan 6 21:15:21 EST 2001

qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) wrote,

> Sat, 06 Jan 2001 00:24:14 +0100, Sven Panne <Sven.Panne at informatik.uni-muenchen.de> pisze:
> 
> > For performance there's always #ifdef (well, at least if we
> > consider piping Haskell sources through cpp as "standard"/H98).
> 
> hsc2hs used instead of cpp provides #ifdef too, and avoids lexical
> analysis of Haskell source as C source.
> 
> >     !!!!! The current MarshalUtils uses
> >     !!!!!        peekArray :: Storable a => Ptr a -> Int -> IO [a]
> >     !!!!! but the signature below looks more consistent with peekArray0:
> >     peekArray        :: Storable a =>         Int -> Ptr a -> IO [a]
> >     peekArray0       :: (Storable a, Eq a) => a   -> Ptr a -> IO [a]
> 
> OK.
> 
> >    !!!!! Charset conversions are a diff{erent,icult} story and should
> >    !!!!! be handled by a separate module, so only the well-known
> >    !!!!! ignore-the-upper-half variants are given here.
> 
> I think the conversion story should proceed thus:
> 
> - For now the implementation will not do any conversion.
> 
> - At some point of time all IO and C string handling will switch
>   to do the conversion, keeping the same interface for the case of
>   the default encoding, and providing additional means for using
>   different encodings.
> 
> It is important do introduce the conversion in all places at once,
> otherwise reading a filename containing non-ASCII characters from
> a file and using it to open another file will fail.
> 
> Unfortunately it means that parts of the FFI functionality will have
> to go to ghc's lib/std, to properly handle strings and I/O in standard
> libraries. The more of convenience functions will be kept in hslibs,
> the less fun will be to fix standard libraries.

This is an aspect of GHC's implementation and shouldn't
concern us for the design of the system-independent FFI.
Moreover, the compiler already depends on -package lang
anyway.

> The interface of string handling functions must allow that move
> in future without incompatible changes. We can't have separate
> mallocCString and pokeCString after the conversion switch, because
> the size depends on the contents, because a conversion can change
> the length. So we should not provide them now either.
> 
> When someone really wants to use mallocCString and pokeCString now
> (knowing that there is a little point of doing that in the case of
> conversions), he can use mallocArray0 and pokeArray0, after casting
> characters of the string to [CChar].

To be honest, I don't like this.  It is nice having the
interface such that we can switch to using conversions at
some point, but I still want to be able to conveniently deal
with 8bit characters (because this is what many C libraries
use).  So, I want a fast and convenient interface to 8bit
strings *in addition* to the interface that can deal with
conversions.  In particular this means that I don't want to
deal with CChar in the Haskell interface only to circumvent
conversion. 

> > -- MarshalUtils ------------------------------------------------------
> > 
> >     fromBool   :: Num a => Bool -> a
> >     fromBool = fromIntegral . fromEnum
> 
> I think that
>     fromBool :: Num a => Bool -> a
>     fromBool False = 0
>     fromBool True  = 1
> is more clear (and easier to compile efficiently) :-)

:-)

> >     !!!!! Do we really need this?
> >     indexPtr         :: Storable a => Ptr a -> Int -> Ptr a
> 
> I used it once in Libgr, 3 times in QString and 5 times in
> conversions. I would definitely keep it. Perhaps the name movePtr
> would be better.

How about `advancePtr'?  But I am wondering whether this
shouldn't go into MarshalArray?  It is used for array
access, isn't it?

> Let me repeat a proposal of two functions for MarshalUtils
> (better names are welcome):
> 
>     sequenceCont :: [(a -> res) -> res] -> ([a] -> res) -> res
>     sequenceCont []     cont = cont []
>     sequenceCont (f:fs) cont = f (\x -> sequenceCont fs (\xs -> cont (x:xs)))
> 
>     mapCont :: (a -> (b -> res) -> res) -> [a] -> ([b] -> res) -> res
>     mapCont f = sequenceCont . map f
> 
> Example:
>     mapCont withCString a_list_of_Haskell_strings $
>         \a_list_of_C_string_pointers ->
> 
> Without them one has to write an explicit recursive function each
> time for passing an array of things-to-be-allocated-on-the-stack to
> a C function.

`mapCont' makes a lot of sense.  I am less sure about
`sequenceCont'.  I imagine that you can use it to write
something like

  sequenceCont [withCString aString, withObject anObject] $ \[s, o] ->
    foreignFoo s o

instead of

  withCString aString  $ \s ->
  withObject  anObject $ \o ->
  foreignFoo s o

but is that really useful?

How about calling `mapCont` simply `withMany'.  You would
have

  withMany withCString a_list_of_Haskell_strings $
    \a_list_of_C_string_pointers ->

Or `withList'?

Cheers,
Manuel