[Haskell-cafe] Re: FASTER primes

Daniel Fischer daniel.is.fischer at web.de
Mon Jan 4 12:02:35 EST 2010


Am Montag 04 Januar 2010 13:25:47 schrieb Will Ness:
> Daniel Fischer <daniel.is.fischer <at> web.de> writes:
> > Am Sonntag 03 Januar 2010 09:54:37 schrieb Will Ness:
> > > Daniel Fischer <daniel.is.fischer <at> web.de> writes:
> > > > But there's a lot of list constructuion and deconstruction necessary
> > > > for the Euler sieve.
> > >
> > > yes. Unless, of course, s smart compiler recognizes there's only one
> > > consumer for the values each multiples-list is producing, and keeps it
> > > stripped down to a generator function, and its current value. I keep
> > > thinkig a smart compiler could eliminate all these "span" calls and
> > > replace them with just some pointers manipulating...
> >
> > Of course I'm no smart compiler, but I don't see how it could be even
> > possible to replace the span calls with pointer manipulation when dealing
> > with lazily generated (infinite, if we're really mean) lists. Even when
> > you're dealing only with strict finite lists, it's not trivial to do
> > efficiently.
>
> I keep thinking that storage duplication with span, filter etc. is not
> really necessary, and can be replaced with some pointer chasing -
> especially when there's only one consumer present for the generated values.
>
> What I mean is thinking of lists in terms of produce/consumer paradigm, as
> objects supporting the { pull, peek } interface, keeping the generator
> inside that would produce the next value on 'pull' request and keep it
> ready for any 'peek's.
>
> Euler's sieve is
>
>  sieve (p:ps) xs = h ++ sieve ps (t `minus` map (p*) [p,p+2..])
>                       where (h,t) = span (< p*p) xs

Not quite. That's a variant of the postponed filters, it crosses off e.g. 45 twice, once 
as 3*15 and once as 5*9 (okay, it has already been removed by the first, so let's say 45 
appears in two lists of numbers to be removed if present).
Euler's sieve is never attempting to remove a number more than once, that's the 
outstanding feature of it. Unfortunately, that also makes it hard to implement 
efficiently. The problem is that you can't forget the primes between p and p^2, you must 
remember them for the removal of multiples of p later on.

 An (inefficient but faithful) implementation would be

euler ks@(p:rs) = p : euler (rs `minus` map (*p) ks)

Its performance is really horrible though. A much better implementation is

primes = 2:3:euler hprs [] (scanl (+) 5 (cycle [2,4]))
      where
        hprs = 5:drop 3 primes
        euler (p:ps) acc cs = h ++ euler ps (tail (acc ++ h)) (t `minus` comps)
          where
            (h,t) = span (< p*p) cs
            comps = map (*p) (acc ++ cs)

still really bad. I don't yet see how to eat the cake and have it here.

> > > > I think that remark was meant to apply to composite removal, not
> > > > Turner's sieve.
> > >
> > > It is right there on page 2, right when the Turner's sieve is presented
> > > and discussed. The only explanation that I see is that she thought of
> > > it in regards to the imperative code, just as her analysis concentrates
> > > only on calculation aspects of the imperative code itself.
> >
> > To be fair, she writes:
> >
> > "Let us first describe the original “by hand” sieve algorithm as practiced
> > by Eratosthenes.
> > ...
> > The starting point of p^2 is a pleasing but minor optimization, which can
> > be made
> > because lower multiples will have already been crossed off when we found
> > the primes prior to p.
> > ... (This optimization does not affect the time complexity of the sieve,
> > however, so its absence from the code in Section 1
> > *is not our cause for worry*.)"
>
> A-HA!
>
> But its absense from _*that*_ _*code*_ WAS the *major* cause for worry, as
> dealing with it worked wonders on its complexity and constant factors.

I think you're overinterpreting it. Sure, it's not perfectly worded, but her concern was 
primarily that it's not an Eratosthenes sieve but trial division (and woefully inefficient 
at that), I think. Now, because it's trial division and not Eratosthenes, it has a major 
impact, thus the absence of the optimisation is a reinforcing consequence of the original 
cause for worry.

I may be overdefending her, though, we can't really know what she meant.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20100104/005c3ae7/attachment.html


More information about the Haskell-Cafe mailing list