Name for split

Bart Massey bart at cs.pdx.edu
Thu Jul 17 01:52:43 EDT 2008


The big problem I see with split is the naming, which is already confused to a
certain extent.

It seems to me sensible that just as inits is the list of successive
applications of init, and tails of tails, splits should be the list of
successive applications of split.  The analogy with inits and tails goes
further, though: splits should be the list of all possible ways to generate
contiguous bipartitions of a given list.  Ironically, I was looking for just
this function earlier today, while thinking about generating permutations.

These ideas together suggest that

  split :: ([a], [a]) -> ([a], [a])
  split (_, []) = error "split of empty list"
  split (l, r : rs) = (l ++ [r], rs)

  splits :: [a] -> [([a], [a])]
  splits l = splits' ([], l) where
    splits' l@(_, []) = [l]
    splits' l = l : splits' (split l)

(I can rewrite splits using unfoldr, but to my mind it looks even worse.  Is
there some way to get the reflexive transitive closure of a function defined
over a Maybe range, using fix or somesuch, that I'm missing?)

We thus have, for example

  split [("1", "23")] == [("12", "3")]
  splits "123" == [("","123"),("1","23"),("12","3"),("123","")]

This also dovetails nicely with what splitAt does, via the identity

  splitAt i l == splits l !! i

(for i in the domain of !!).

The function everyone has been asking for, it seems to me, is actually the right
inverse of intercalate.  This suggests that it might be named "deintercalate",
which would be awesome assuming we hate each other and ourselves.
"unintercalate" is even worse, since "intercalate" is the "unwords" analog and
"unintercalate" would be the "words" analog.  (How did that happen, anyway?)

Microsoft Encarta Thesaurus suggests
(http://au.encarta.msn.com/thesaurus_1861859438/intercalate.html) "extrapolate"
as an antonym for "intercalate", but that doesn't even seem correct, much less
sensible in this context.  Perhaps "striate" would be acceptable?

Note that the right thing sort of happens here: we have (for the right
definition of striate) that

  intercalate sep . striate sep 

is the identity on lists, which seems like a good thing. I'd suggest that a
reasonable definition of striate might be

  striate :: (Eq a) => [a] -> [a] -> [[a]]
  striate _ [] = [[]]
  striate sep l | isPrefixOf sep l = [] : striate sep (drop (length sep) l)
  striate sep (e  : es) = (e : l') : ls' where
      (l' : ls') = striate sep es

which seems to me pretty natural except for that odd definition on the empty
list, but I think that's arguably a feature.  (Probably there's some clever fold
or something I'm missing here.  Oh well.)

Once one has all the split machinery for contiguous bipartions represented as
two-tuples of lists, of course, one might naturally start to think about
generalizing the whole mess to contiguous k-partitions or just contiguous
partitions, represented as lists of lists.  But I think I'm done for now.

I'm sure everyone is relieved.

    Bart Massey
    bart at cs.pdx.edu




More information about the Libraries mailing list