Difference between revisions of "Data.List.Split"

From HaskellWiki
Line 8: Line 8:
  ** use a predicate on elements or sublists instead of giving explicit separators
  ** use approximate matching?
+ ** chunks of fixed length
  * how to split?
  ** discard the separators
Line 64: Line 65:
  | e == x = splitOn e xs
  | otherwise = let (h,t) = break f l in h:(splitEq e t)
+
+
+ -- | split at regular intervals
+ splitEquidistant :: Int -> [a] -> [[a]]
+ splitEquidistant _ [] = []
+ splitEquidistant n xs = y1 : split n y2
+   where
+     (y1, y2) = splitAt n xs
  </haskell>

Revision as of 19:41, 13 December 2008

A theoretical module which contains implementations/combinators for implementing every possible method of list-splitting known to man. This way no one has to argue about what the correct interface for split is; we can just have them all.

Some possible ways to split a list, to get your creative juices flowing:

  • what to split on?
    • single-element separator
    • sublist separator
    • use a list of possible separators instead of just one
    • use a predicate on elements or sublists instead of giving explicit separators
    • use approximate matching?
    • chunks of fixed length
  • how to split?
    • discard the separators
    • keep the separators with the preceding or following splits
    • keep the separators as their own separate pieces of the result list
    • what to do with separators at the beginning/end? create a blank split before/after, or not?
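
To make the "how to split?" choices concrete, here is a minimal sketch of three separator policies for a predicate-based split. The names `splitDrop`, `splitKeepEnd` and `splitKeepSep` are hypothetical, chosen only for this illustration, and the treatment of leading and trailing separators is one possible answer among several.

```haskell
-- Hypothetical names, for illustration only: three policies for what
-- happens to elements that satisfy the separator predicate.

-- | Discard the separators. Empty pieces are kept, so a leading or
--   trailing separator yields a blank piece before or after it.
splitDrop :: (a -> Bool) -> [a] -> [[a]]
splitDrop p xs = case break p xs of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitDrop p rest

-- | Keep each separator at the end of the piece that precedes it.
splitKeepEnd :: (a -> Bool) -> [a] -> [[a]]
splitKeepEnd _ [] = []
splitKeepEnd p xs = case break p xs of
  (chunk, [])         -> [chunk]
  (chunk, sep : rest) -> (chunk ++ [sep]) : splitKeepEnd p rest

-- | Keep each separator as its own one-element piece of the result.
splitKeepSep :: (a -> Bool) -> [a] -> [[a]]
splitKeepSep p xs = case break p xs of
  (chunk, [])         -> [chunk]
  (chunk, sep : rest) -> chunk : [sep] : splitKeepSep p rest
```

For example, `splitDrop (== ',') "a,b,,c"` gives `["a","b","","c"]`, while `splitKeepEnd (== ',') "a,b,,c"` gives `["a,","b,",",","c"]`.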

Add your implementations below! Once we converge on something good we can upload it to Hackage.

{-# LANGUAGE ViewPatterns #-}

import Data.List (unfoldr)


-- intercalate :: [a] -> [[a]] -> [a]
-- intercalate x [a,b,c,[],y,z] = [a,x,b,x,c,x,x,y,x,z]

-- unintercalate :: Eq a => [a] -> [a] -> [[a]]
-- unintercalate x [a,x,b,x,c,x,x,y,x,z,x] = [a,b,c,[],y,z]

-- unintercalate is the "inverse" of intercalate, except that it expects
-- a trailing delimiter: a final piece with no delimiter after it is dropped

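-- | strip the given prefix from a list, if the list starts with it
--   (same behaviour as Data.List.stripPrefix)
match :: Eq a => [a] -> [a] -> Maybe [a]
match [] string = Just string
match (_:_) [] = Nothing
match (p:ps) (q:qs) | p == q = match ps qs
match (_:_)  (_:_)  | otherwise = Nothing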

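-- | split off everything before the first occurrence of the delimiter;
--   Nothing if the delimiter does not occur in the rest of the list
chopWith :: Eq a => [a] -> [a] -> Maybe ([a], [a])
chopWith delimiter (match delimiter -> Just tail) = return ([], tail)
chopWith delimiter (c:cs) = chopWith delimiter cs >>= \(head, tail) ->
                              return (c:head, tail)
chopWith delimiter [] = Nothing
-- note: chopWith could be made 'more efficient', i.e. remove the >>=\-> bit,
--       by adding an accumulator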


unintercalate delimiter = unfoldr (chopWith delimiter)

-- > unintercalate "x" "axbxcxxyxzx"
-- ["a","b","c","","y","z"]

splitOn :: (a -> Bool) -> [a] -> [[a]]
splitOn _ [] = []
splitOn f l@(x:xs)
  | f x = splitOn f xs
  | otherwise = let (h,t) = break f l in h:(splitOn f t)

-- elements that satisfy the predicate act as delimiters
-- > splitOn even [1,3,5,6,7,3,3,2,1,1,1]
-- [[1,3,5],[7,3,3],[1,1,1]]

-- | like splitting a String on a character, but for any element type with an Eq instance
splitEq :: Eq a => a -> [a] -> [[a]]
splitEq _ [] = []
splitEq e l@(x:xs)
  | e == x = splitEq e xs
  | otherwise = let (h,t) = break (== e) l in h:(splitEq e t)


-- | split at regular intervals (chunks of length n; n must be positive)
splitEquidistant :: Int -> [a] -> [[a]]
splitEquidistant _ [] = []
splitEquidistant n xs = y1 : splitEquidistant n y2
  where
    (y1, y2) = splitAt n xs
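
A fixed-length splitter built on `splitAt` never consumes any input when the chunk length is zero or negative, so the recursion would produce an endless stream of empty pieces. The following is a minimal sketch of one way to guard against that, written with `unfoldr`. The name `chunksOfSafe` is hypothetical, and returning an empty result for a non-positive length is an assumption; raising an error would be an equally reasonable choice.

```haskell
import Data.List (unfoldr)

-- | Hypothetical helper: cut a list into chunks of length n.
--   The last chunk may be shorter. A non-positive n yields []
--   (an assumed policy) rather than an infinite list of empty chunks.
chunksOfSafe :: Int -> [a] -> [[a]]
chunksOfSafe n
  | n <= 0    = const []
  | otherwise = unfoldr step
  where
    -- stop on empty input, otherwise peel off the next chunk
    step [] = Nothing
    step xs = Just (splitAt n xs)
```

For example, `chunksOfSafe 3 [1..7]` gives `[[1,2,3],[4,5,6],[7]]`.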