Proposal for Data.List.splitBy

Brent Yorgey byorgey at seas.upenn.edu
Sun Jan 11 18:56:12 EST 2009


> P2. There should be no information loss, that is, keep the delimiters,
> keep the separators, keep the parts of the original list xs that satisfy
> a predicate p, do not lose information about the beginning and the end
> of the list relative to the first and last elements of the list
> respectively.  The user of the function decides what to discard.
> 
> P3. A split list should be unsplittable so as to recover the original
> list xs.  (I made up the word unsplittable.)  (P2 implies P3, but let us
> state this anyway.)

I'm not sure I agree with this.  The problem is that much (most?) of
the time, people looking for a split function want to discard
delimiters: for example, splitting a string like "foo;bar;baz" into
["foo","bar","baz"].  In this case it's really annoying to have to
throw away the delimiters yourself, especially if you just get back a
list like ["foo",";","bar",";","baz"] and have to decide which
elements are delimiters and which aren't, with no help from the type
system.  But, as you noted, throwing away information like this is
bad from an elegance/formal properties point of view.  This is
exactly why I designed the Data.List.Split library as I did: the core
internal splitting function is information-preserving, and by using
various combinators the user can choose to throw away whatever
information they are not interested in.
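
To make the idea concrete, here is a rough sketch of that kind of
design using only the Prelude.  All of the names (Chunk, splitChunks,
unsplit, dropDelimiters) are made up for illustration and are not the
library's actual internals; the sketch also assumes single-element
delimiters chosen by a predicate.

```haskell
-- Hypothetical names throughout; an illustration, not Data.List.Split itself.

-- | A piece of the input, tagged so that the type records which pieces
-- are delimiters and which are ordinary text.
data Chunk a = Delim [a] | Text [a]
  deriving (Show, Eq)

-- | Information-preserving core: split at every element satisfying p,
-- keeping each delimiter and each (possibly empty) text piece in order.
splitChunks :: (a -> Bool) -> [a] -> [Chunk a]
splitChunks p xs = case break p xs of
  (txt, [])       -> [Text txt]
  (txt, d : rest) -> Text txt : Delim [d] : splitChunks p rest

-- | Recover the original list from its chunks (the "unsplittable" property).
unsplit :: [Chunk a] -> [a]
unsplit = concatMap fromChunk
  where
    fromChunk (Delim d) = d
    fromChunk (Text t)  = t

-- | A combinator that discards information: keep only the text pieces.
dropDelimiters :: [Chunk a] -> [[a]]
dropDelimiters cs = [t | Text t <- cs]
```

With these definitions, splitChunks (== ';') "foo;bar;baz" keeps the
delimiters as Delim values, unsplit gives back the original string,
and dropDelimiters yields ["foo","bar","baz"].  The caller decides
what to discard, and the tags say which pieces are delimiters.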

-Brent

