[Haskell-cafe] Splitting a list

Wed Apr 21 10:42:32 EDT 2004

I have a list of integers, e.g.:

 [1,5,3,17,8,9]

I want to split it into a pair of lists, with the criterion being that
the sum of the elements in the first list is as large as possible, but
not exceeding a threshold value. For example, if the threshold is 10,
the result should be:

 ([1,5,3],[17,8,9])

and then I want to recursively apply this process to the remainder of
the list, with the end result being a list of lists of integers. Using
the same list along with a threshold of 18, I would get:

 [[1,5,3],[17],[8,9]]

I have devised a means of doing this:

1) Create an auxiliary list of integers, where the n'th element is equal
to the sum of the first n elements of the original list.

2) Zip the auxiliary list with the original list.

3) Use span to break the list in two according to the threshold.

4) Unzip the two resulting lists and discard the auxiliary portions.

5) Repeat from step 1, operating on the tail of the list, until there's
nothing left.

Here's the code that implements this:

runningSum           ::  (Ord a, Num a) => [a] -> [a]
runningSum []        =   []
runningSum (i:[])    =   i : []
runningSum (i:j:js)  =   i : runningSum (i+j : js)

zipWithSum           ::  (Ord a, Num a) => [a] -> [(a,a)]
zipWithSum xs        =   zip (runningSum xs) xs

threshold            ::  (Ord a, Num a) => [a] -> a -> ([(a,a)],[(a,a)])
threshold xs t       =   let test x = (t >= (fst x))
                         in span test (zipWithSum xs)

splitFirst           ::  (Ord a, Num a) => [a] -> a -> ([a],[a])
splitFirst xs t      =   let (ys,zs) = threshold xs t
                         in (snd (unzip ys), snd (unzip zs))

splitAll             ::  (Ord a, Num a) => [a] -> a -> [[a]]
splitAll [] _        =   []
splitAll xs t        =   let (ys, zs) = splitFirst xs t
                         in ys : (splitAll zs t)

(One thing that's missing from this code is a check to verify that no
single element in the list is greater than the threshold, which should
raise an error, rather than get stuck in an infinite loop.)

The algorithm as implemented works fine, but it seems overly complicated
and not very elegant. I get the feeling that I'm missing some obvious
simplification, but I can't find it. Any ideas?

Thanks,

-Steve Schafer