[Haskell-cafe] Unnecessarily strict implementations

Daniel Fischer daniel.is.fischer at web.de
Thu Sep 2 19:32:50 EDT 2010


On Friday 03 September 2010 00:22:14, Jan Christiansen wrote:
> Hi,
>
> On 02.09.2010, at 13:41, Daniel Fischer wrote:
> > takes a little to run and keeps the entire file until the first
> > occurrence
> > of pat in memory.
>
> first of all thanks very much for the detailed instructions.
>
> I have rewritten the example slightly using Strings instead of
> Bytestrings. Replacing all occurrences of 'ä' by "ä" in the
> collected works of Shakespeare ; ) has a maximum memory usage of
> around 65MB with the current implementation of intersperse while it
> has a maximum memory usage of only around 5KB with the less strict
> implementation.

No surprise, there aren't many 'ä's in Shakespeare's works, are there?

>
> I think this is due to the Wadler tuple space leak.

Yup.

> The same would apply to the current implementation of lines.
> I wonder whether an implementation of lines analogous to splitBy
> has any disadvantages.
>

Hardly, but yes. 'break' constructs a pair pretty immediately, so

case break p (x:xs) of
    (pre, post) -> pre : case post of
                           [] -> []
                           (y:ys) -> stuff

can only do harm if (p x) diverges, but then it does.
Currently,
lines (_|_ : rest) = _|_ : _|_
while withe the break, we'd have
lines' (_|_ : rest) = _|_

On the other hand, the current implementation of lines does not seem to 
suffer from Wadler's tuple space leak (according to one test I made), so 
I'd stick with the current implementation for the time being.

> >> That is, we could have
> >>
> >>   intersect [] _|_ = []   and   intersect (_|_:_|_) [] = []
> >>
> >> or
> >>
> >>   intersect [] (_|_:_|_) = []   and   intersect _|_ [] = []
> >>
> >> and the current implementation satisfies neither.
> >
> > Right. So the question is, has the current implementation advantages
> > over
> > either of these? (I don't see any.) If not, which of these two
> > behaviours
> > is preferable?
>
> I'd prefer the first one as it is in line with the left to right
> pattern matching of Haskell.
>

Moi aussi.



More information about the Haskell-Cafe mailing list