[Haskell-cafe] Bytestrings vs String? parameters within package names?

Marc Weber marco-oweber at gmx.de
Tue Feb 3 04:50:19 EST 2009


On Mon, Feb 02, 2009 at 10:41:57PM -0500, wren ng thornton wrote:
>  Marc Weber wrote:
> > Should there be two versions?
> > hslogger-bytestring and hslogger-string?
> 
>  I'd just stick with one (with a module for hiding the conversions, as 
>  desired). Duplicating the code introduces too much room for maintenance and 
>  compatibility issues.
> 
>  That's the big thing. The more people that use ByteStrings the less need 
>  there is to convert when combining libraries. That said, ByteStrings aren't 
>  a panacea; lists and laziness are very useful.

Hi wren,

In the second paragraph you agree that there will be less conversion when
everyone uses only one type of string.

You're also right about encoding.
About laziness you're partially right: there is also Data.ByteString.Lazy,
which is basically a list of (non-lazy) ByteString chunks.
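
To make the chunk structure concrete, here is a minimal sketch using the
bytestring package. The chunk contents are made up for illustration only.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

-- A lazy ByteString assembled from two strict chunks (illustrative data).
message :: BL.ByteString
message = BL.fromChunks [BC.pack "hello, ", BC.pack "world"]

main :: IO ()
main = do
  -- toChunks exposes the underlying strict pieces again.
  print (map BC.unpack (BL.toChunks message))
  -- To the caller it still behaves like one string.
  putStrLn (BLC.unpack message)
```

Only the spine of chunks is lazy; each chunk itself is a strict ByteString.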

> Duplicating the code introduces too much room for maintenance and 
>  compatibility issues.

I didn't mean duplicating the whole library. I was thinking about a
cabal flag

the cabal file:


  flag bytestring
    Default: False
    Description: enable this to use ByteStrings everywhere instead of
                 Strings

  [... now libs and executables: ...]

    if flag(bytestring)
      cpp-options: -DUSE_BYTESTRINGS


An example module

{-# LANGUAGE CPP #-}
module Example where
-- exactly one of the following macros is expected to be defined
#ifdef USE_STRINGS
  import Data.List as S
#endif
#ifdef USE_BYTESTRINGS
  import Data.ByteString as S
#endif
#ifdef USE_LAZY_BYTESTRINGS
  import Data.ByteString.Lazy as S
#endif
#ifdef USE_UNICODE_BYTESTRING_LIKE_STRINGS
  -- two bytes per char or more?
  -- they can also be lazy like Strings; however, one array element can
  -- have more than one byte
  import Data.Vector as S
#endif

Of course all four modules

  import Data.List as S
  import Data.ByteString as S
  import Data.ByteString.Lazy as S
  import Data.Vector as S

must expose the same API.
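
As a rough sketch of what "the same API" means in practice: the client code
below only touches names that each backend provides. Str and toStr are
hypothetical helpers, and I use the Char8 variants of the ByteString modules
here so that all backends work on Chars. Without any macro defined it falls
back to plain Strings.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Exactly one branch is compiled. Str and toStr are hypothetical helpers.
#if defined(USE_BYTESTRINGS)
import qualified Data.ByteString.Char8 as S
type Str = S.ByteString
toStr :: String -> Str
toStr = S.pack
#elif defined(USE_LAZY_BYTESTRINGS)
import qualified Data.ByteString.Lazy.Char8 as S
type Str = S.ByteString
toStr :: String -> Str
toStr = S.pack
#else
import qualified Data.List as S
type Str = String
toStr :: String -> Str
toStr = id
#endif

-- Client code only uses names that every backend provides.
hasWarnPrefix :: Str -> Bool
hasWarnPrefix = S.isPrefixOf (toStr "WARN")

main :: IO ()
main = print (hasWarnPrefix (toStr "WARN: disk full"))
```

Note that the overlap is not perfect: for example, length returns Int64 in
the lazy module and Int in the others.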


Of course cluttering up all files with those ifdefs isn't a nice option
either. But one could move this selection into the cabal file by
depending on one of these (not yet existing) packages:

string-string
string-bytestring
string-utf8-bytestring
string-bytestring-lazy

Then you could replace one implementation with another, recompile, and
see whether the results differ.
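
One place where results really can differ is encoding. A small sketch,
assuming the ByteString backend is filled via Data.ByteString.Char8.pack
(which keeps only the low 8 bits of every Char):

```haskell
import qualified Data.ByteString.Char8 as BC

-- Round-trip a String through a strict ByteString via Char8.pack,
-- which keeps only the low 8 bits of every Char.
roundTrip :: String -> String
roundTrip = BC.unpack . BC.pack

main :: IO ()
main = do
  print (roundTrip "log" == "log")    -- ASCII survives unchanged
  print (roundTrip "\955" == "\955")  -- U+03BB (lambda) does not
  print (roundTrip "\955" == "\187")  -- it comes back as U+00BB
```

This prints True, False, True, so a String build and a Char8 build of the
same program can disagree as soon as non-Latin-1 text shows up.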

Of course we must take care that we can keep laziness if required.

However, using different packages that expose the same API (same module
names) will cause trouble if you really have to use both implementations
at some point. (I only know that there has been some discussion about how
to tell GHC to use a module from a particular package.)

So I'd like to propose another way:

{-# LANGUAGE CPP #-}

import Data.STRING as S

and have the .cabal file define STRING (e.g. via cpp-options) so that it
names one of the different string implementations. I think this would be
the most portable, and you can additionally import other String modules
as well.
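
A variation that I'd expect to work: let the macro hold the whole module
name. STRING_MODULE is a made-up macro name, and OverloadedStrings is only
there so that the same literal type-checks for both Strings and ByteStrings.

```haskell
{-# LANGUAGE CPP, OverloadedStrings #-}
module Main (main) where

-- STRING_MODULE is a hypothetical macro that the .cabal file would set, e.g.
--   cpp-options: -DSTRING_MODULE=Data.ByteString.Char8
-- Without it we fall back to plain Strings.
#ifndef STRING_MODULE
#define STRING_MODULE Data.List
#endif

import qualified STRING_MODULE as S

main :: IO ()
main = print (S.lines "first\nsecond")
```

Both Data.List and Data.ByteString.Char8 export lines, so this prints
["first","second"] with either setting.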

So for now I think it would be best if you could teach cabal to change
names depending on flags:

Name: hslogger-${STRING_TYPE}

flag: use_strings
  set STRING_TYPE = String
flag: use_bytestrings
  set STRING_TYPE = Bytestring
.....

Don't think about how things are now or how much effort it would be to
rewrite everything. Think about how you'd like to be working with
Haskell in about a year.

Marc

