Haskell Platform Proposal: add the 'text' library

Ian Lynagh igloo at earth.li
Wed Sep 8 09:43:30 EDT 2010


On Tue, Sep 07, 2010 at 07:10:27PM -0700, Bryan O'Sullivan wrote:
> Thanks for your comments, Ian. I appreciate your time and care in looking
> this over!

Actually, it's interesting you say that, because I don't think I looked
at the package carefully. I didn't look at the source at all, I briefly
skimmed the haddocks (mostly just to check that it looks like they all
existed, as that's one of the criteria), and I didn't check that the
package API looks sensible and consistent. In fact, the only reason I
looked at the API at all was that I had something to diff it against.

As a comment on the process, perhaps we should require that there are 2
or 3 people who can say that they have used the API (perhaps with hpc
results to see /how much/ they use it), and that it seems sensible (i.e.
they weren't having to work around missing or broken functionality).



Actually, I've just taken a quick look at a random bit of code, and
with Data.Text.Foreign.fromPtr and

init :: Text -> Text
init (Text arr off len) | len <= 0                   = emptyError "init"
                        | n >= 0xDC00 && n <= 0xDFFF = textP arr off (len-2)
                        | otherwise                  = textP arr off (len-1)
    where
      n = A.unsafeIndex arr (off+len-1)

it looks like I can create a Text with length -1 by doing
(init (fromPtr [0xDC00] 1)), which makes me nervous. I wonder if fromPtr
should be renamed unsafeFromPtr. init would still make me nervous,
though.


By the way, fromPtr asserts (len > 0), but from the haddock docs I'd
assume that (fromPtr p 0) is fine.

> > Incidentally, I've just noticed some broken haddock markup for:
> >    I/O libraries /do not support locale-sensitive I\O
> > in
> >
> > http://hackage.haskell.org/packages/archive/text/0.8.0.0/doc/html/Data-Text-IO.html
> 
> 
> Thanks for spotting that. It appears to be due to a Haddock bug,
> unfortunately.

Looking at the source, I'd guess you can work around it by moving the
linebreaks. And it actually looks like 2 haddock bugs: /.../ can't span
lines, and \/ isn't recognised inside /.../. Would be good to get
haddock tickets filed.

> >    Subject to fusion.
> > but I can't see an explanation for the new user of what this means or
> > why they should care.
> 
> 
> That's not quite true: it's actually the very first thing documented:

Sorry, my fault! I read the "Description" at the top, and erroneously
assumed that only function-specific docs would follow the "Synopsis".

> > And replace's docs just say
> >    O(m+n) Replace every occurrence of one substring with another.
> > but should presumably be O(n*m). It's also not necessarily clear what m
> > and n refer to.
> 
> The two parameters to the function?

But replace takes 3 arguments!

The complexity must be at least the second and third multiplied
together, as
    replace "x" (replicate y 'y') (replicate z 'x')
makes y*z words in the heap.

> > > unicode-unaware case conversion (map toUpper is an unsafe case
> 
> > conversion)
> >
> > Surely this is something that should be added to Data.Char, irrespective
> > of whether text is added to the HP?
> 
> Yes, but that's a not-this-problem problem.

Oh, I didn't mean to suggest that you should fix it. I just don't think
it motivates adding the text package to the HP, and thus doesn't belong
in the proposal.

> > > RecordWildCards
> >
> > I'm not a fan, but I fear I may be in the minority.
> 
> It's just used internally, so why do you mind?

I'm sure I'll need to look at the code at some point.

> There are a number of other differences which probably want to be tidied
> > up (mostly functions which are in one package but not the other, and
> > ByteString has IO functions mixed in with the non-IO functions), but
> > those seemed to be the most significant ones. Also,
> >    prefixed :: Text -> Text -> Maybe Text
> > is analogous to
> >    stripPrefix :: Eq a => [a] -> [a] -> Maybe [a]
> > in Data.List
> 
> I hadn't seen that. Hmm. For use with view patterns, I prefer the name I'm
> using right now.

I'd like us to proceed in a way that means we haven't still got
Data.List.stripPrefix and Data.Text.prefixed in the HP in 3 years time.


Thanks
Ian



More information about the Libraries mailing list