# Proposal: Make intersperse lazier

Daniel Fischer daniel.is.fischer at web.de
Fri Sep 17 14:17:45 EDT 2010

```On Friday 17 September 2010 14:21:36, Bas van Dijk wrote:
> On Fri, Sep 17, 2010 at 10:48 AM, Christian Maeder
> > ...
> > Therefore I proposed the following implementation:
> >
> > intersperse :: a -> [a] -> [a]
> > intersperse s l = case l of
> >  [] -> l
> >  x : r -> x : if null r then r else
> >    s : intersperse s r
>
> that it applies the static argument transformation: the 'sep' argument
> is brought into scope of the worker function 'go' which then doesn't
> need to pass it to each recursion as in the original implementation.
>
> It would be interesting to see if this provides a noticeable
> performance benefit. Criterion anyone?
>
> Bas

Okay, if you don't then I do :)
I've benchmarked a couple of variants:

module Interspersing where

isgo :: a -> [a] -> [a]
isgo _ [] = []
isgo s (x:xs) = x : go xs
where
go [] = []
go (y:ys) = s : y: go ys

isrec :: a -> [a] -> [a]
isrec s l = case l of
[] -> l
(x:r) -> x : if null r then r else (s : isrec s r)

ispreplc :: a -> [a] -> [a]
ispreplc s (x:xs) =  x : concat [[s,y] | y <- xs]
ispreplc _ _ = []

isprepM :: a -> [a] -> [a]
isprepM s (x:xs) = x : (xs >>= \y -> [s,y])
isprepM _ _ = []

isprepT :: a -> [a] -> [a]
isprepT _ [] = []
isprepT s xs = tail \$ concatMap (\x -> [s,x]) xs

isprepD :: a -> [a] -> [a]
isprepD s xs = case concatMap (\x -> [s,x]) xs of
(_:ys) -> ys
_ -> []

I've also tried with ((s:) . (:[])) instead of the lambdas, but that gives
poor code (calls to (++), much slower).

The explicit worker/wrapper isgo and the list comprehension version
ispreplc give identical core with optimisations (-O and -O2, quite
different with -O0), isprepD gets almost the same core with the obvious
case after the worker, the other isprepX versions get the same worker loop
inside, but with more wrapping. The direct recursion isrec produces
different core, but also with a local function for the recursion.

Benchmark:
intersperse a '.' in a text, then compute the length.
Two texts were tried, one ~1.4 million characters, the other ~6.5 million,
the text was kept in memory for the entire benchmark .

Results:
With -O2:
unsurprisingly, isgo and ispreplc have nearly identical means in each run,
about 33.6 ms for the small benchmark and 153 ms for the large.
isprepD is slightly slower, 33.9 ms resp 155 ms.
isprepM and isprepT are a little slower again, 34.4 ms resp 157 ms.
isrec lags behind, 43.4 ms resp. 193 ms.

With -O1, the outcome is almost identical, isprepD is slightly slower than
with -O2, now identical to isprepM/T.

With -O0, the outcome is drastically different.
isgo produces core almost identical to the optimised and the performance is
indistinguishable.
The top-level recursive isrec suffers camparatively little, the times are
51.3 ms resp. 238.8 ms.
isprepD is again slightly faster than isprepM/T, but now much slower than
isrec, 72.5 ms / 73.5 ms resp. 340 ms / 344 ms.
The list comprehension suffers most and is now the slowest, 85.9 ms resp.
404 ms.

Conclusion:
isprepD/M/T were only included out of curiosity, they didn't give a reason
to choose any of them in the benchmarks.

The top-level recursion is clearly slower than the manual worker/wrapper,
hence it's not the way to go.

Unless a further good implementation is offered, the choice is between the
manual worker/wrapper, isgo, and the list comprehension, ispreplc.

If there's a realistic chance that base is ever compiled without
optimisations, it has to be the explicit worker.

If we can safely assume that base is always compiled with optimisations,
it's a matter of taste.
```