Prime numbers
From HaskellWiki
(→Unbounded Euler's sieve) 
(→From Squares: simplify code) 

(51 intermediate revisions by one user not shown)  
Line 20:  Line 20:  
In mathematics, ''amongst the natural numbers greater than 1'', a ''prime number'' (or a ''prime'') is such that has no divisors other than itself (and 1). The smallest prime is thus 2. Nonprime numbers are known as ''composite'', i.e. those representable as product of two natural numbers greater than 1. The set of prime numbers is thus 
In mathematics, ''amongst the natural numbers greater than 1'', a ''prime number'' (or a ''prime'') is such that has no divisors other than itself (and 1). The smallest prime is thus 2. Nonprime numbers are known as ''composite'', i.e. those representable as product of two natural numbers greater than 1. The set of prime numbers is thus 

−  : '''P''' = {''n'' ∈ '''N'''<sub>2</sub> ''':''' (∀ ''m'' ∈ '''N'''<sub>2</sub>) ((''m''  ''n'') ⇒ m = n)} 
+  : '''P''' = {''n'' ∈ '''N'''<sub>2</sub> ''':''' (∀ ''m'' ∈ '''N'''<sub>2</sub>) ((''m''  ''n'') ⇒ m = n)} 
:: = {''n'' ∈ '''N'''<sub>2</sub> ''':''' (∀ ''m'' ∈ '''N'''<sub>2</sub>) (''m''×''m'' ≤ ''n'' ⇒ ¬(''m''  ''n''))} 
:: = {''n'' ∈ '''N'''<sub>2</sub> ''':''' (∀ ''m'' ∈ '''N'''<sub>2</sub>) (''m''×''m'' ≤ ''n'' ⇒ ¬(''m''  ''n''))} 

−  :: = '''N'''<sub>2</sub> \ {''n''×''m'' ''':''' ''n'',''m'' ∈ '''N'''<sub>2</sub>} = 
+  :: = '''N'''<sub>2</sub> \ {''n''×''m'' ''':''' ''n'',''m'' ∈ '''N'''<sub>2</sub>} 
−  :: = '''N'''<sub>2</sub> \ '''∪''' { {''n''×''m'' ''':''' ''m'' ∈ '''N'''<sub>n</sub>} ''':''' ''n'' ∈ '''N'''<sub>2</sub> } 
+  :: = '''N'''<sub>2</sub> \ '''⋃''' { {''n''×''m'' ''':''' ''m'' ∈ '''N'''<sub>n</sub>} ''':''' ''n'' ∈ '''N'''<sub>2</sub> } 
−  :: = '''N'''<sub>2</sub> \ '''∪''' { {''n''×''n'', ''n''×''n''+''n'', ''n''×''n''+2×''n'', ...} ''':''' ''n'' ∈ '''N'''<sub>2</sub> } 
+  :: = '''N'''<sub>2</sub> \ '''⋃''' { {''n''×''n'', ''n''×''n''+''n'', ''n''×''n''+2×''n'', ...} ''':''' ''n'' ∈ '''N'''<sub>2</sub> } 
−  :: = '''N'''<sub>2</sub> \ '''∪''' { {''n''×''n'', ''n''×''n''+''n'', ''n''×''n''+2×''n'', ...} ''':''' ''n'' ∈ '''P''' } where '''N'''<sub>k</sub> = {''n'' ∈ '''N''' ''':''' ''n'' ≥ k}. 
+  :: = '''N'''<sub>2</sub> \ '''⋃''' { {''n''×''n'', ''n''×''n''+''n'', ''n''×''n''+2×''n'', ...} ''':''' ''n'' ∈ '''P''' } 
+  :::: where '''N'''<sub>k</sub> = {''n'' ∈ '''N''' ''':''' ''n'' ≥ k} 

Thus starting with 2, for each newly found prime we can ''eliminate'' from the rest of the numbers ''all the multiples'' of this prime, giving us the next available number as next prime. This is known as ''sieving'' the natural numbers, so that in the end all the composites are eliminated and what we are left with are just primes. <small>(This is what the last formula is describing, though seemingly [http://en.wikipedia.org/wiki/Impredicativity impredicative], because it is selfreferential. But because '''N'''<sub>2</sub> is wellordered (with the order being preserved under addition), the formula is welldefined.)</small> 
Thus starting with 2, for each newly found prime we can ''eliminate'' from the rest of the numbers ''all the multiples'' of this prime, giving us the next available number as next prime. This is known as ''sieving'' the natural numbers, so that in the end all the composites are eliminated and what we are left with are just primes. <small>(This is what the last formula is describing, though seemingly [http://en.wikipedia.org/wiki/Impredicativity impredicative], because it is selfreferential. But because '''N'''<sub>2</sub> is wellordered (with the order being preserved under addition), the formula is welldefined.)</small> 

Line 40:  Line 40:  
== Sieve of Eratosthenes == 
== Sieve of Eratosthenes == 

−  The sieve of Eratosthenes calculates primes as natural numbers above 1 without the composites, where composites are found as sequences of multiples of each prime, generated by counting up from the prime's square in constant increments equal to that prime (or twice that much, for odd primes): 
+  The sieve of Eratosthenes calculates primes as ''integers above 1 that are not multiples of primes'', i.e. ''not composite'' — whereas composites are found as a union of sequences of multiples of each prime, generated by counting up from the prime's square in constant increments equal to that prime (or twice that much, for odd primes): 
<haskell> 
<haskell> 

−  primes = 2 : 3 : ([5,7..] `minus` unionAll [[p*p, p*p+2*p..]  p < tail primes]) 
+  primes = 2 : 3 : ([5,7..] `minus` _U [[p*p, p*p+2*p..]  p < tail primes]) 
+  
+   Using `under n = takeWhile (< n)`, `minus` and `_U` satisfy 

+   under n (minus a b) == under n a \\ under n b 

+   under n (union a b) == nub . sort $ under n a ++ under n b 

+   under n . _U == nub . sort . concat . map (under n) 

</haskell> 
</haskell> 

−  <code>unionAll</code> can be defined as an evergrowing, rightdeepening unbounded [[#Tree_mergingtree of comparisons]] like the <code>joinT</code> function below, or it can use a (fixedsized) array as a segmentwise sorting and duplicatesremoving device, to join together the streams of multiples of primes. The processing naturally divides up into segments between the successive squares of primes. 
+  The definition is primed with 2 and 3 as initial primes, to avoid the vicious circle. 
+  
+  <code>_U</code> can be defined as a folding of <code>(\(x:xs) > (x:) . union xs)</code>, or it can use an <code>Bool</code> array as a sorting and duplicatesremoving device. The processing naturally divides up into segments between the successive squares of primes. 

Stepwise development follows (the fully developed version is [[#Tree merging with Wheelhere]]). 
Stepwise development follows (the fully developed version is [[#Tree merging with Wheelhere]]). 

Line 55:  Line 55:  
<haskell> 
<haskell> 

 genuine yet wasteful sieve of Eratosthenes 
 genuine yet wasteful sieve of Eratosthenes 

−  primesTo m = 2 : eratos [3,5..m] where 
+  primesTo m = eratos [2..m] where 
eratos [] = [] 
eratos [] = [] 

−  eratos (p:xs) = p : eratos (xs `minus` [p, p+2*p..m]) 
+  eratos (p:xs) = p : eratos (xs `minus` [p, p+p..m]) 
−   eulers (p:xs) = p : eulers (xs `minus` map (p*)(p:xs)) 
+   eratos (p:xs) = p : eratos (xs `minus` map (p*) [1..m]) 
+   eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs)) 

 turner (p:xs) = p : turner [x  x < xs, rem x p /= 0] 
 turner (p:xs) = p : turner [x  x < xs, rem x p /= 0] 

</haskell> 
</haskell> 

−  This should be regarded more like a ''specification'', not a code. It is extremely slow, running at empirical time complexities worse than quadratic in number of primes produced. But it has the core defining features of the classical formulation of S. of E. as '''''a.''''' being bounded, i.e. having a top limit value, and '''''b.''''' finding out the multiples of a prime directly, by counting up from it in increments equal to that prime (or twice that, for odd primes). 
+  This should be regarded more like a ''specification'', not a code. It runs at [https://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth empirical orders of growth] worse than quadratic in number of primes produced. But it has the core defining features of the classical formulation of S. of E. as '''''a.''''' being bounded, i.e. having a top limit value, and '''''b.''''' finding out the multiples of a prime directly, by counting up from it in constant increments, equal to that prime. 
−  The canonical listdifference <code>minus</code> and duplicatesremoving listunion <code>union</code> functions dealing with ordered increasing lists  infinite as well as finite  are simple enough to define (cf. Leon P.Smith's [http://hackage.haskell.org/packages/archive/dataordlist/latest/doc/html/DataListOrdered.html Data.List.Ordered] package): 
+  The canonical listdifference <code>minus</code> and duplicatesremoving <code>union</code> functions dealing with ordered increasing lists (cf. [http://hackage.haskell.org/packages/archive/dataordlist/latest/doc/html/DataListOrdered.html Data.List.Ordered package]) are: 
<haskell> 
<haskell> 

 ordered lists, difference and union 
 ordered lists, difference and union 

Line 84:  Line 84:  
=== Analysis === 
=== Analysis === 

−  So for each newly found prime the sieve removes its ''odd'' multiples from further consideration. It finds them by counting up in steps of ''2p''. There are thus <math>O(m/p)</math> multiples generated and eliminated for each prime, and <math>O(m \log \log(m))</math> multiples in total, with duplicates, by virtues of [http://en.wikipedia.org/wiki/Prime_harmonic_series prime harmonic series]. 
+  So for each newly found prime the sieve discards its multiples, enumerating them by counting up in steps of ''p''. There are thus <math>O(m/p)</math> multiples generated and eliminated for each prime, and <math>O(m \log \log(m))</math> multiples in total, with duplicates, by virtues of [http://en.wikipedia.org/wiki/Prime_harmonic_series prime harmonic series]. 
If each multiple is dealt with in <math>O(1)</math> time, this will translate into <math>O(m \log \log(m))</math> RAM machine operations (since we consider addition as an atomic operation). Indeed mutable randomaccess arrays allow for that. But lists in Haskell are sequentialaccess, and complexity of <code>minus(a,b)</code> for lists is <math>\textstyle O(a \cup b)</math> instead of <math>\textstyle O(b)</math> of the direct access destructive array update. The lower the complexity of each ''minus'' step, the better the overall complexity. 
If each multiple is dealt with in <math>O(1)</math> time, this will translate into <math>O(m \log \log(m))</math> RAM machine operations (since we consider addition as an atomic operation). Indeed mutable randomaccess arrays allow for that. But lists in Haskell are sequentialaccess, and complexity of <code>minus(a,b)</code> for lists is <math>\textstyle O(a \cup b)</math> instead of <math>\textstyle O(b)</math> of the direct access destructive array update. The lower the complexity of each ''minus'' step, the better the overall complexity. 

Line 94:  Line 94:  
=== From Squares === 
=== From Squares === 

−  But we can start each step at a prime's square, as its smaller multiples will have been already produced on previous steps. This means we can stop early, when the prime's square reaches the top value ''m'', and thus ''cut the total number of steps'' to around <math>\textstyle n = \pi(m^{0.5}) = \Theta(2m^{0.5}/\log m)</math>. This does not in fact change the complexity of randomaccess code, but for lists it makes it <math>O(m^{1.5}/(\log m)^2)</math>, or <math>O(n^{1.5}/(\log n)^{0.5})</math> in ''n'' primes produced, a dramatic speedup: 
+  But we can start each elimination step at a prime's square, as its smaller multiples will have been already produced and discarded on previous steps, as multiples of smaller primes. This means we can stop early now, when the prime's square reaches the top value ''m'', and thus cut the total number of steps to around <math>\textstyle n = \pi(m^{0.5}) = \Theta(2m^{0.5}/\log m)</math>. This does not in fact change the complexity of randomaccess code, but for lists it makes it <math>O(m^{1.5}/(\log m)^2)</math>, or <math>O(n^{1.5}/(\log n)^{0.5})</math> in ''n'' primes produced, a dramatic speedup: 
<haskell> 
<haskell> 

−  primesToQ m = 2 : sieve [3,5..m] where 
+  primesToQ m = eratos [2..m] where 
−  sieve [] = [] 
+  eratos [] = [] 
−  sieve (p:xs) = p : sieve (xs `minus` [p*p, p*p+2*p..m]) 
+  eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..m]) 
+   eratos (p:xs) = p : eratos (xs `minus` map (p*) [p..div m p]) 

+   eulers (p:xs) = p : eulers (xs `minus` map (p*) (takeWhile (<= div m p) (p:xs))) 

+   turner (p:xs) = p : turner [x  x<xs, x<p*p  rem x p /= 0] 

</haskell> 
</haskell> 

−  Its empirical complexity is about <math>O(n^{1.45})</math>. This simple optimization works here because our formulation is bounded, as is the original algorithm. To start late on a bounded sequence is to stop early (if we start past its end we don't need to start at all – ''see warning below''), thus preventing the creation of all the superfluous multiples streams which started above the upper bound anyway. This is acceptably slow now, striking a good balance between clarity, succinctness and efficiency. 
+  Its empirical complexity is around <math>O(n^{1.45})</math>. This simple optimization works here because this formulation is bounded (by an upper limit). To start late on a bounded sequence is to stop early (starting past end makes an empty sequence – ''see warning below''<sup>1</sup>), thus preventing the creation of all the superfluous multiples streams which start above the upper bound anyway <small>(note that Turner's sieve is unaffected by this)</small>. This is acceptably slow now, striking a good balance between clarity, succinctness and efficiency. 
−  <small>''Warning'': this is predicated on a subtle point of <code>minus xs [] = xs</code> definition being used, as it indeed should be. If the definition <code>minus (x:xs) [] = x:minus xs []</code> is used, the problem is back and the complexity is bad again.</small> 
+  <sup>1</sup><small>''Warning'': this is predicated on a subtle point of <code>minus xs [] = xs</code> definition being used, as it indeed should be. If the definition <code>minus (x:xs) [] = x:minus xs []</code> is used, the problem is back and the complexity is bad again.</small> 
=== Guarded === 
=== Guarded === 

Line 112:  Line 112:  
 p*p > m = p : xs 
 p*p > m = p : xs 

 otherwise = p : sieve (xs `minus` [p*p, p*p+2*p..]) 
 otherwise = p : sieve (xs `minus` [p*p, p*p+2*p..]) 

+   p : sieve (xs `minus` map (p*) [p,p+2..]) 

+   p : eulers (xs `minus` map (p*) (p:xs)) 

</haskell> 
</haskell> 

−  It is now clear that it can't be made unbounded just by abolishing the upper bound ''m'', because the guard can not be simply omitted without changing the complexity back for the worst. 
+  (here we also dismiss all evens above 2 a priori.) It is now clear that it ''can't'' be made unbounded just by abolishing the upper bound ''m'', because the guard can not be simply omitted without changing the complexity back for the worst. 
=== Accumulating Array === 
=== Accumulating Array === 

−  So while <code>minus(a,b)</code> takes <math>O(b)</math> operations for randomaccess imperative arrays and about <math>O(a)</math> operations for lists here, using Haskell's immutable array for ''a'' one ''could'' expect the array update time to be nevertheless closer to <math>O(b)</math> if destructive update were used implicitly by compiler for an array being passed along as an accumulating parameter: 
+  So while <code>minus(a,b)</code> takes <math>O(b)</math> operations for randomaccess imperative arrays and about <math>O(a)</math> operations here for ordered increasing lists of numbers, using Haskell's immutable array for ''a'' one ''could'' expect the array update time to be nevertheless closer to <math>O(b)</math> if destructive update were used implicitly by compiler for an array being passed along as an accumulating parameter: 
<haskell> 
<haskell> 

{# OPTIONS_GHC O2 #} 
{# OPTIONS_GHC O2 #} 

Line 131:  Line 133:  
</haskell> 
</haskell> 

−  This indeed seems to be working for unboxed arrays, with the type signature added explicitly <small>(suggested by Daniel Fischer)</small>, the above code running relatively '''very fast''', with empirical complexity of about '''<math>O(n^{1.15..1.45})</math>''' in ''n'' primes produced (for producing from few hundred thousands to few millions primes, memory usage also slowly growing). But it relies on specific compiler optimizations, and indeed if we remove the explicit type signature, the code above turns ''very'' slow. 
+  Indeed for unboxed arrays, with the type signature added explicitly <small>(suggested by Daniel Fischer)</small>, the above code runs pretty fast, with empirical complexity of about ''<math>O(n^{1.15..1.45})</math>'' in ''n'' primes produced (for producing from few hundred thousands to few millions primes, memory usage also slowly growing). But it relies on specific compiler optimizations, and indeed if we remove the explicit type signature, the code above turns ''very'' slow. 
How can we write code that we'd be certain about? One way is to use explicitly mutable monadic arrays ([[#Using Mutable Arrays''see below'']]), but we can also think about it a little bit more on the functional side of things still. 
How can we write code that we'd be certain about? One way is to use explicitly mutable monadic arrays ([[#Using Mutable Arrays''see below'']]), but we can also think about it a little bit more on the functional side of things still. 

=== Postponed === 
=== Postponed === 

−  Going back to ''guarded'' Eratosthenes, first we notice that though it works with minimal number of prime multiples streams, it still starts working with each a bit prematurely. Fixing this with explicit synchronization won't change complexity but will speed it up some more: 
+  Going back to ''guarded'' Eratosthenes, first we notice that though it works with minimal number of prime multiples streams, it still starts working with each prematurely. Fixing this with explicit synchronization won't change complexity but will speed it up some more: 
+  <haskell> 

+  primesPE1 = 2 : sieve [3..] primesPE1 

+  where 

+  sieve xs (p:ps)  q < p*p , (h,t) < span (< q) xs = 

+  h ++ sieve (t `minus` [q, q+p..]) ps 

+   h ++ turner [xx<t, rem x p /= 0] ps 

+  </haskell> 

+  <! primesPE1 = 2 : sieve [3,5..] 9 (tail primesPE1) 

+  where 

+  sieve xs q ~(p:t)  (h,ys) < span (< q) xs = 

+  h ++ sieve (ys `minus` [q, q+2*p..]) (head t^2) t 

+   h ++ turner [x  x < ys, rem x p /= 0] ... 

+  > 

+  Inlining and fusing <code>span</code> and <code>(++)</code> we get: 

<haskell> 
<haskell> 

primesPE = 2 : primes' 
primesPE = 2 : primes' 

Line 145:  Line 147:  
 otherwise = sieve (xs `minus` [q, q+2*p..]) (head t^2) t 
 otherwise = sieve (xs `minus` [q, q+2*p..]) (head t^2) t 

</haskell> 
</haskell> 

−  Since the removal of a prime's multiples here starts at the right moment, and not just from the right place, the code could '''''now''''' finally be made unbounded. Because no multiplesremoval process is started ''prematurely'', there are no ''extraneous'' multiples streams, which were the reason for the extreme wastefulness and thus inefficiency of the original formulation. 
+  Since the removal of a prime's multiples here starts at the right moment, and not just from the right place, the code can now finally be made unbounded. Because no multiplesremoval process is started ''prematurely'', there are no ''extraneous'' multiples streams, which were the reason for the original formulation's extreme inefficiency. 
=== Segmented === 
=== Segmented === 

Line 177:  Line 179:  
((++ [(2*p,q)]) [(s,qrem (qy) s)  (s,y) < fs]) 
((++ [(2*p,q)]) [(s,qrem (qy) s)  (s,y) < fs]) 

−  joinST (x:xs) = union x (joinST (pairs xs)) where 
+  joinST (xs:t) = (union xs . joinST . pairs) t where 
−  pairs (x:y:xs) = union x y : pairs xs 
+  pairs (xs:ys:t) = union xs ys : pairs t 
−  pairs xs = xs 
+  pairs t = t 
joinST [] = [] 
joinST [] = [] 

</haskell> 
</haskell> 

=== Linear merging === 
=== Linear merging === 

−  But segmentation doesn't add anything substantially, and each multiples stream starts at its prime's square anyway. What does the [[#PostponedPostponed]] code do, operationally? For each prime's square passed over, it builds up a nested linear ''leftdeepening'' structure, '''''(...((xsa)b)...)''''', where '''''xs''''' is the original oddsproducing ''[3,5..]'' list, so that each odd it produces must go through more and more <code>`minus`</code> nodes on its way up  and those odd numbers that eventually emerge on top are prime. Thinking a bit about it, wouldn't another, ''rightdeepening'' structure, '''''(xs(a+(b+...)))''''', be better? This idea is due to Richard Bird (in the code presented in Melissa O'Neill's article). 
+  But segmentation doesn't add anything substantially, and each multiples stream starts at its prime's square anyway. What does the [[#PostponedPostponed]] code do, operationally? For each prime's square passed over, it builds up a nested linear ''leftdeepening'' structure, '''''(...((xsa)b)...)''''', where '''''xs''''' is the original oddsproducing ''[3,5..]'' list, so that each odd it produces must go through more and more <code>`minus`</code> nodes on its way up  and those odd numbers that eventually emerge on top are prime. Thinking a bit about it, wouldn't another, ''rightdeepening'' structure, '''''(xs(a+(b+...)))''''', be better? This idea is due to Richard Bird, seen in his code presented in M. O'Neill's article, equivalent to: 
−  
<haskell> 
<haskell> 

−  primesLME1 = 2 : primes' 
+  primesB = (2:) . minus [3..] . foldr (\p r> p*p : union [p*p+p, p*p+2*p..] r) [] 
−  where 
+  $ primesB 
−  primes' = 3 : ([5,7..] `minus` joinL [[p*p, p*p+2*p..]  p < primes']) 

−  joinL ((x:xs):t) = x : union xs (joinL t) 

</haskell> 
</haskell> 

−  +  or, 

−  Here, ''xs'' stays near the top, and ''more frequently'' oddsproducing streams of multiples of ''smaller'' primes are above those of the ''bigger'' primes, that produce ''less frequently'' their candidates which have to pass through ''more'' <code>`union`</code> nodes on their way up. Plus, no explicit synchronization is necessary anymore because the produced multiples of a prime start at its square anyway  just some care has to be taken to avoid a runaway access to the indefinitelydefined structure (specifically, if each <code>(+)/union</code> operation passes along unconditionally its left child's head value ''before'' polling the right child's head value, then we are safe). 

−  
−  To prevent unneeded memoization and thus prevent a memory leak, double primes feed can be introduced as per Melissa O'Neill's code. Here's the code, faster yet but still with the same empirical orders of growth of about ~ <math> n^{1.4}</math>: 

<haskell> 
<haskell> 

−  primesLME = 2 : ([3,5..] `minus` joinL [[p*p, p*p+2*p..]  p < primes']) 
+  primesLME1 = 2 : primes' where 
−  where 

primes' = 3 : ([5,7..] `minus` joinL [[p*p, p*p+2*p..]  p < primes']) 
primes' = 3 : ([5,7..] `minus` joinL [[p*p, p*p+2*p..]  p < primes']) 

Line 198:  Line 200:  
</haskell> 
</haskell> 

−  This allows the <code>primesLME</code> stream to be discarded immediately as it is being consumed by its consumer. With <code>primes'</code> it is impossible, as each produced member of <code>primes'</code> is needed later as input to the same <code>primes'</code> stream definition. So the <code>primes'</code> stream feeds in a loop to itself, and thus is retained in memory; and it is also used as input stream for the main stream production, but it reaches only up to a square root of where the main stream, <code>primesLME</code>, reaches, as its production is demanded by its consumer(s). 
+  Here, ''xs'' stays near the top, and ''more frequently'' oddsproducing streams of multiples of ''smaller'' primes are above those of the ''bigger'' primes, that produce ''less frequently'' their candidates which have to pass through ''more'' <code>`union`</code> nodes on their way up. Plus, no explicit synchronization is necessary anymore because the produced multiples of a prime start at its square anyway  just some care has to be taken to avoid a runaway access to the indefinitelydefined structure (specifically, if each <code>(+)/union</code> operation passes along unconditionally its left child's head value ''before'' polling the right child's head value, then we are safe). 
−  === Tree merging === 
+  To prevent unneeded memoization (a memory leak), Melissa O'Neill introduces double primes feed in her ZIP file's code. We can even do multistage. Here's the code, faster still and with radically reduced memory consumption, with empirical orders of growth of about ~ <math> n^{1.25..1.40}</math>: 
−  Moreover, it can be changed into a '''''tree''''' structure. This idea is due to Dave Bayer on haskellcafe mailing list (though in much more complex formulation): 

<haskell> 
<haskell> 

−  primesTME = 2 : (gaps 3 $ joinT [[p*p, p*p+2*p..]  p < primes']) 
+  primesLME = 2 : _Y ((3:) . minus [5,7..] . joinL . map (\p> [p*p, p*p+2*p..])) 
−  where 
+  
−  primes' = 3 : (gaps 5 $ joinT [[p*p, p*p+2*p..]  p < primes']) 
+  _Y :: (t > t) > t 
−  +  _Y g = g (_Y g)  multistage, recursive 

−  joinT ((x:xs):t) = x : union xs (joinT (pairs t)) 
+   g x where x = g x  two stages 
−  pairs ((x:xs):ys:t) = (x : union xs ys) : pairs t 

−  gaps k s@(x:xs)  k<x = k:gaps (k+2) s  equivalent to 

−   True = gaps (k+2) xs  [k,k+2..]`minus`s, k<=head s 

</haskell> 
</haskell> 

−  It is [http://ideone.com/p0e81 very fast], running at speeds and empirical complexities comparable with the code from Melissa O'Neill's article (about <math>O(n^{1.2})</math> in number of primes ''n'' produced). 
+  <code>_Y</code> is a nonsharing fixpoint combinator for a recursive ''"telescoping"'' multistage primes production. 
−  As an aside, <code>joinT</code> is equivalent to calling [[Fold#Treelike_foldsinfinite treelike folding]] <code>foldi (\(x:xs) ys>x:union xs ys) []</code>. 
+  This allows the <code>primesLME</code> stream to be discarded immediately as it is being consumed by its consumer. With <code>primes'</code> from <code>primesLME1</code> definition above it is impossible, as each produced element of <code>primes'</code> is needed later as input to the same <code>primes'</code> corecursive stream definition. So the <code>primes'</code> stream feeds in a loop into itself, and thus is retained in memory. With multistage production, each stage feeds into its consumer up at the square of its current element which can be immediately discarded after it's been consumed. <code>(3:)</code> jumpstarts the whole thing. 
−  The above can be [http://ideone.com/qpnqe rewritten] as follows, using a standard (corecursive) definition for <code>fix</code> combinator from <code>Data.Function</code> to arrange for double primes feed automatically: 

−  <haskell>import Data.Function (fix) 
+  === Tree merging === 
+  Moreover, it can be changed into a '''''tree''''' structure. This idea [http://www.haskell.org/pipermail/haskellcafe/2007July/029391.html is due to Dave Bayer] and [[Prime_numbers_miscellaneous#Implicit_HeapHeinrich Apfelmus]]. Twostages production of primes [http://hackage.haskell.org/packages/archive/NumberSieves/0.0/doc/html/src/NumberTheorySieveONeill.html due to M. O'Neill]: 

−  primesTMEf = 2 : g (fix g) 
+  <haskell> 
−  where 
+  primesTME = 2 : _Y ((3:) . gaps 5 . joinT . map (\p> [p*p, p*p+2*p..])) 
−  g xs = 3 : gaps 5 (joinT [[x*x, x*x+2*x..]  x < xs]) 

−   fix g = xs where xs = g xs 
+  joinT ((x:xs):t) = x : (union xs . joinT . pairs) t  ~= nub.sort.concat 
+  where pairs (xs:ys:t) = union xs ys : pairs t 

+  
+  gaps k s@(x:xs)  k<x = k:gaps (k+2) s  ~= [k,k+2..]\\s, when 

+   True = gaps (k+2) xs  k<=x && null(s\\[k,k+2..]) 

</haskell> 
</haskell> 

+  
+  This code is [http://ideone.com/p0e81 pretty fast], running at speeds and empirical complexities comparable with the code from Melissa O'Neill's article (about <math>O(n^{1.2})</math> in number of primes ''n'' produced). 

+  
+  As an aside, <code>joinT</code> is equivalent to [[Fold#Treelike_foldsinfinite treelike folding]] <code>foldi (\(x:xs) ys> x:union xs ys) []</code>: 

+  
+  [[Image:Treelike_folding.gifframelesscenter458pxtreelike folding]] 

=== Tree merging with Wheel === 
=== Tree merging with Wheel === 

−  Wheel factorization optimization can be further applied, and another tree structure can be used which is better adjusted for the primes multiples production (effecting about 5%10% at the top of a total ''2.5x speedup'' w.r.t. the above tree merging on odds only  though complexity stays roughly the same): 
+  Wheel factorization optimization can be further applied, and another tree structure can be used which is better adjusted for the primes multiples production (effecting about 5%10% at the top of a total ''2.5x speedup'' w.r.t. the above tree merging on odds only, for first few million primes): 
<haskell> 
<haskell> 

−  {# OPTIONS_GHC O2 fnocse #} 
+  primesTMWE = [2,3,5,7] ++ _Y ((11:) . tail . gapsW 11 wheel 
−  primesTMWE = 2:3:5:7: gapsW 11 wheel (joinT3 $ rollW 11 wheel primes') 
+  . joinT . hitsW 11 wheel) 
−  where 

−  primes' = 11: gapsW 13 (tail wheel) (joinT3 $ rollW 11 wheel primes') 

−  gapsW k ws@(w:t) cs@(c:u)  k==c = gapsW (k+w) t u 
+  gapsW k (d:w) s@(c:cs)  k < c = k : gapsW (k+d) w s 
−   True = k : gapsW (k+w) t cs 
+   otherwise = gapsW (k+d) w cs  k==c 
−  rollW k ws@(w:t) ps@(p:u)  k==p = scanl (\c d>c+p*d) (p*p) ws 
+  hitsW k (d:w) s@(p:ps)  k < p = hitsW (k+d) w s 
−  : rollW (k+w) t u 
+   otherwise = scanl (\c d>c+p*d) (p*p) (d:w) 
−   True = rollW (k+w) t ps 
+  : hitsW (k+d) w ps  k==p 
−  joinT3 ((x:xs): ~(ys:zs:t)) = x : union xs (union ys zs) 

−  `union` joinT3 (pairs t) 

wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2: 
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2: 

4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel 
4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel 

</haskell> 
</haskell> 

−  The compiler switch <code>fnocse</code> is used here to prevent space leak. 

====Above Limit  Offset Sieve==== 
====Above Limit  Offset Sieve==== 

−  Another task is to produce primes above a given value (i.e. not by their ordinal numbers): 
+  Another task is to produce primes above a given value: 
<haskell> 
<haskell> 

{# OPTIONS_GHC O2 fnocse #} 
{# OPTIONS_GHC O2 fnocse #} 

Line 252:  Line 258:  
(h,p':t) = span (< z) $ drop 4 primes  p < z => p*p<=a 
(h,p':t) = span (< z) $ drop 4 primes  p < z => p*p<=a 

z = ceiling $ sqrt $ fromIntegral a + 1  p'>=z => p'*p'>a 
z = ceiling $ sqrt $ fromIntegral a + 1  p'>=z => p'*p'>a 

−  compositesFrom a = joinT (joinST [multsOf p a  p < h++[p']] 
+  compositesFrom a = joinT (joinST [multsOf p a  p < h ++ [p']] 
−  : [multsOf p (p*p)  p < t]) 
+  : [multsOf p (p*p)  p < t] ) 
−  snapUp v o step = v + ((ov) `mod` step)  full steps from o 
+  snapUp v o step = v + (mod (ov) step)  full steps from o 
multsOf p from = scanl (\c d>c+p*d) (p*x) wh  map (p*) $ 
multsOf p from = scanl (\c d>c+p*d) (p*x) wh  map (p*) $ 

where  scanl (+) x wh 
where  scanl (+) x wh 

Line 276:  Line 282:  
<haskell> 
<haskell> 

−  {# OPTIONS_GHC O2 fnocse #} 

−  
import Data.List  based on http://stackoverflow.com/a/1140100 
import Data.List  based on http://stackoverflow.com/a/1140100 

import qualified Data.Map as M 
import qualified Data.Map as M 

Line 284:  Line 288:  
primesMPE = 2:mkPrimes 3 M.empty prs 9  postponed addition of primes into map; 
primesMPE = 2:mkPrimes 3 M.empty prs 9  postponed addition of primes into map; 

where  decoupled primes loop feed 
where  decoupled primes loop feed 

−  prs = mkPrimes 3 M.empty prs 9 
+  prs = 3:mkPrimes 5 M.empty prs 9 
mkPrimes n m ps@ ~(p:t) q = case (M.null m, M.findMin m) of 
mkPrimes n m ps@ ~(p:t) q = case (M.null m, M.findMin m) of 

(False, (n', skips))  n == n' > 
(False, (n', skips))  n == n' > 

mkPrimes (n+2) (addSkips n (M.deleteMin m) skips) ps q 
mkPrimes (n+2) (addSkips n (M.deleteMin m) skips) ps q 

_ > if n<q 
_ > if n<q 

−  then n : mkPrimes (n+2) m ps q 
+  then n : mkPrimes (n+2) m ps q 
else mkPrimes (n+2) (addSkip n m (2*p)) t (head t^2) 
else mkPrimes (n+2) (addSkip n m (2*p)) t (head t^2) 

Line 302:  Line 306:  
<haskell> 
<haskell> 

 unbounded sieve, premature filters 
 unbounded sieve, premature filters 

−  primesT = 2 : sieve [3,5..] where 
+  primesT = sieve [2..] 
+  where 

sieve (p:xs) = p : sieve [x  x < xs, rem x p /= 0] 
sieve (p:xs) = p : sieve [x  x < xs, rem x p /= 0] 

−   filter ((/=0).(`rem`p)) xs 
+   filter ((/=0).(`rem`p)) xs 
</haskell> 
</haskell> 

Line 310:  Line 314:  
=== Guarded Filters === 
=== Guarded Filters === 

−  But this really ought to be changed into bounded and guarded variant, again achieving the ''"miraculous"'' complexity improvement from above quadratic to about <math>O(n^{1.45})</math> empirically (in ''n'' primes produced): 
+  But this really ought to be changed into bounded and guarded variant, [[#From Squaresagain achieving]] the ''"miraculous"'' complexity improvement from above quadratic to about <math>O(n^{1.45})</math> empirically (in ''n'' primes produced): 
<haskell> 
<haskell> 

−  primesToGT m = 2 : sieve [3,5..m] where 
+  primesToGT m = 2 : sieve [3,5..m] 
+  where 

sieve (p:xs) 
sieve (p:xs) 

−   p*p > m = p : xs 
+   p*p > m = p : xs 
−   True = p : sieve [x  x < xs, rem x p /= 0] 
+   True = p : sieve [x  x < xs, rem x p /= 0] 
</haskell> 
</haskell> 

=== Postponed Filters === 
=== Postponed Filters === 

−  or as unbounded, postponed filterscreation variant: 
+  Or it can remain unbounded, just filters creation must be ''postponed'': 
+  <haskell> 

+  primesPT1 = 2 : sieve primesPT1 [3..] 

+  where 

+  sieve (p:ps) xs = let (h,t) = span (< p*p) xs 

+  in h ++ sieve ps [x  x<t, rem x p /= 0] 

+  </haskell> 

+  This is better rewritten with <code>span</code> and <code>(++)</code> inlined and fused into the <code>sieve</code>: 

<haskell> 
<haskell> 

primesPT = 2 : primes' 
primesPT = 2 : primes' 

Line 326:  Line 330:  
primes' = sieve [3,5..] 9 primes' 
primes' = sieve [3,5..] 9 primes' 

sieve (x:xs) q ps@ ~(p:t) 
sieve (x:xs) q ps@ ~(p:t) 

−   x < q = x : sieve xs q ps 
+   x < q = x : sieve xs q ps 
−   True = sieve [x  x < xs, rem x p /= 0] (head t^2) t 
+   True = sieve [x  x < xs, rem x p /= 0] (head t^2) t 
</haskell> 
</haskell> 

−  creating here [[#Linear merging as well]] the linear nested structure at runtime, <code>(...(([3,5..] >>= filterBy [3]) >>= filterBy [5])...)</code>, where <code>filterBy ds n = [n  noDivs ds n]</code> (see <code>noDivs</code> definition below)  –  but unlike the original code, each filter being created at its proper moment, not sooner than the prime's square is seen. 
+  creating here [[#Linear merging as well]] the linear nested structure at runtime, <code>(...(([3,5..] >>= filterBy [3]) >>= filterBy [5])...)</code>, where <code>filterBy ds n = [n  noDivs n ds]</code> (see <code>noDivs</code> definition below)  –  but unlike the original code, each filter being created at its proper moment, not sooner than the prime's square is seen. 
=== Optimal trial division === 
=== Optimal trial division === 

Line 335:  Line 339:  
The above is equivalent to the traditional formulation of trial division, 
The above is equivalent to the traditional formulation of trial division, 

<haskell> 
<haskell> 

−  noDivs factors n = foldr (\f r > f*f > n  (rem n f /= 0 && r)) 
+  ps = 2 : [i  i < [3..], 
−  True factors 
+  and [rem i p > 0  p < takeWhile ((<=i).(^2)) ps]] 
−   primes = filter (noDivs [2..]) [2..] 
+  </haskell> 
−  primesTD = 2 : 3 : filter (noDivs $ tail primesTD) [5,7..] 
+  or, 
−  isPrime n = n > 1 && noDivs primesTD n 
+  <haskell> 
+  noDivs n = foldr (\f r > f*f > n  (rem n f /= 0 && r)) True 

+   primes = filter (`noDivs`[2..]) [2..] 

+  primesTD = 2 : 3 : filter (`noDivs` tail primesTD) [5,7..] 

+  isPrime n = n > 1 && noDivs n primesTD 

</haskell> 
</haskell> 

except that this code is rechecking for each candidate number which primes to use, whereas for every candidate number in each segment between the successive squares of primes these will just be the same prefix of the primes list being built. 
except that this code is rechecking for each candidate number which primes to use, whereas for every candidate number in each segment between the successive squares of primes these will just be the same prefix of the primes list being built. 

=== Segmented Generate and Test === 
=== Segmented Generate and Test === 

−  This ''primes prefix's length'' can be explicitly maintained, achieving a certain further speedup (though not in complexity which stays the same) by turning a list of filters into one filter by an explicit list of primes: 
+  Next we turn a list of filters into one filter by an explicit list, each one in a progression of prefixes of the primes list. This seems to eliminate most recalculations, explicitly filtering composites out from batches of odds between the consecutive squares of primes. 
<haskell> 
<haskell> 

−  primesST = 2 : primes' 
+  import Data.List 
+  
+  primesST = 2 : oddprimes 

where 
where 

−  primes' = sieve 3 9 primes' 0 
+  oddprimes = sieve 3 9 oddprimes (inits oddprimes)  [],[3],[3,5],... 
−  sieve x q ~(_:t) k = let fs = take k primes' in 
+  sieve x q ~(_:t) (fs:ft) = 
filter ((`all` fs) . ((/=0).) . rem) [x,x+2..q2] 
filter ((`all` fs) . ((/=0).) . rem) [x,x+2..q2] 

−  ++ sieve (q+2) (head t^2) t (k+1) 
+  ++ sieve (q+2) (head t^2) t ft 
</haskell> 
</haskell> 

−  This seems to eliminate most recalculations, explicitly filtering composites out from batches of odds between the consecutive squares of primes. 
+  
==== Generate and Test Above Limit ==== 
==== Generate and Test Above Limit ==== 

−  The following will start the segmented Turner sieve at the right place, using any primes list it's supplied with (e.g. [[Prime_numbers#Tree_merging_with_Wheel  TMWE]] etc.), demand computing it up to the square root of any prime it'll produce: 
+  The following will start the segmented Turner sieve at the right place, using any primes list it's supplied with (e.g. [[Prime_numbers#Tree_merging_with_Wheel  TMWE]] etc.) or itself, as shown, demand computing it just up to the square root of any prime it'll produce: 
<haskell> 
<haskell> 

−  primesFromST primes m 
+  primesFromST m  m > 2 = 
−   m>2 = sieve (m`div`2*2+1) (head ps^2) (tail ps) (length h1) 
+  sieve (m`div`2*2+1) (head ps^2) (tail ps) (inits ps) 
where 
where 

−  (h,ps) = span (<= (floor.sqrt $ fromIntegral m+1)) primes 
+  (h,ps) = span (<= (floor.sqrt $ fromIntegral m+1)) oddprimes 
−  sieve x q ps k = let fs = take k $ tail primes in 
+  sieve x q ps (fs:ft) = 
−  filter ((`all` fs) . ((/=0).) . rem) [x,x+2..q2] 
+  filter ((`all` (h ++ fs)) . ((/=0).) . rem) [x,x+2..q2] 
−  ++ sieve (q+2) (head ps^2) (tail ps) (k+1) 
+  ++ sieve (q+2) (head ps^2) (tail ps) ft 
+  oddprimes = 3 : primesFromST 5  odd primes 

</haskell> 
</haskell> 

Line 372:  Line 376:  
=== Conclusions === 
=== Conclusions === 

−  All these variants of course being variations of trial division – finding out primes by direct divisibility testing of every odd number by sequential primes below its square root (instead of just by ''its factors'', which is what ''direct generation of multiples'' is doing, essentially) – are thus of principally worse complexities than that of Sieve of Eratosthenes. 
+  All these variants being variations of trial division, finding out primes by direct divisibility testing of every candidate number by sequential primes below its square root (instead of just by ''its factors'', which is what ''direct generation of multiples'' is doing, essentially), are thus principally of worse complexity than that of Sieve of Eratosthenes. 
−  
−  The prototype code is very inefficient, yet improving significantly with just a simple use of bounded, guarded formulation. So, there's no harm in applying simple improvements to a code instead of just labeling it "naive". BTW were divisibility testing somehow turned into an <math>O(1)</math> operation, e.g. by some kind of massive parallelization, the overall complexity of trial division sieve would drop to <math>O(n)</math>. It's the sequentiality of testing that's the culprit. 

−  
−  Did Eratosthenes himself achieve the optimal complexity? It rather seems doubtful, as he probably counted boxes in the table by ''1'' to go from one number to the next, as in '''3''',''5'',''7'','''9''',''11'',''13'','''15''', ... for he had no access even to Hindu numerals, using Greek alphabet for writing down numbers instead. Was he performing a ''genuine'' sieve of Eratosthenes then? Should faithfulness of an algorithm's implementation be judged by its performance? 

−  So the initial Turner code is just a oneliner that ought to have been regarded as ''executable specification'' in the first place. The reason it was taught that way is probably so that it could provoke discussion among students, at the very least discovering the optimization technique of the postponement of filtercreation. 
+  The initial code is just a oneliner that ought to have been regarded as ''executable specification'' in the first place. It can easily be improved quite significantly with a simple use of bounded, guarded formulation to limit the number of filters it creates, or by postponement of filter creation. BTW were divisibility testing somehow turned into an <math>O(1)</math> operation, e.g. by some kind of massive parallelization, the overall complexity of trial division sieve would drop to just <math>O(n)</math>. 
== Euler's Sieve == 
== Euler's Sieve == 

Line 390:  Line 394:  
In the streambased sieve of Eratosthenes we are able to ''skip'' along the input stream <code>xs</code> directly to the prime's square, consuming the whole prefix at once, thus achieving the results equivalent to the postponement technique, because the generation of the prime's multiples is independent of the rest of the stream. 
In the streambased sieve of Eratosthenes we are able to ''skip'' along the input stream <code>xs</code> directly to the prime's square, consuming the whole prefix at once, thus achieving the results equivalent to the postponement technique, because the generation of the prime's multiples is independent of the rest of the stream. 

−  But here in the Euler's sieve it ''is'' dependent on all <code>xs</code> and we're unable ''in principle'' to skip along it to the prime's square  because all <code>xs</code> are needed for each prime's multiples generation. Thus efficient unbounded streambased implementation seems to be impossible, in principle. 
+  But here in the Euler's sieve it ''is'' dependent on all <code>xs</code> and we're unable ''in principle'' to skip along it to the prime's square  because all <code>xs</code> are needed for each prime's multiples generation. Thus efficient unbounded streambased implementation seems to be impossible in principle, under the simple scheme of producing the multiples by multiplication. 
=== Wheeled list representation === 
=== Wheeled list representation === 

Line 442:  Line 446:  
=== Generating Segments of Primes === 
=== Generating Segments of Primes === 

−  The sieve of Eratosthenes'es [[#Segmentedremoval of multiples on each segment of odds]] can be done by actually marking them in an array instead of manipulating the ordered lists, and can be further sped up more than twice by working with odds only, represented as their offsets in segment arrays: 
+  The sieve of Eratosthenes' [[#Segmentedremoval of multiples on each segment of odds]] can be done by actually marking them in an array, instead of manipulating ordered lists, and can be further sped up more than twice by working with odds only: 
<haskell> 
<haskell> 

−  import Data.Array 
+  import Data.Array.Unboxed 
−   import Data.Array.Unboxed 

+  primesSA :: [Int] 

primesSA = 2 : prs 
primesSA = 2 : prs 

where 
where 

prs = 3 : sieve prs 3 [] 
prs = 3 : sieve prs 3 [] 

−  sieve (p:ps) x fs = [i*2 + x  (i,e) < assocs a, e] 
+  sieve (p:ps) x fs = [i*2 + x  (i,True) < assocs a] 
−  ++ sieve ps (p*p) fs' 
+  ++ sieve ps (p*p) ((p,0) : 
+  [(s, rem (yq) s)  (s,y) < fs]) 

where 
where 

−  q = (p*px)`div`2 
+  q = (p*px)`div`2 
−  fs' = (p,0) : [(s, rem (yq) s)  (s,y) < fs] 
+  a :: UArray Int Bool 
−   a :: UArray Int Bool 
+  a = accumArray (\ b c > False) True (1,q1) 
−  a = accumArray (\ b c > False) True (1,q1) 
+  [(i,())  (s,y) < fs, i < [y+s, y+s+s..q]] 
−  [(i,())  (s,y) < fs, i < [y+s, y+s+s..q]] 

</haskell> 
</haskell> 

−  Run on ''[http://ideone.com/eRP1x Ideone.com]'' it is somewhat faster than [[#Tree Merging With WheelTree Merging With Wheel]] in producing first million primes or two, but has worse time complexity and large memory footprint which quickly gets into hundreds of MBs. 
+  Runs significantly faster than [[#Tree merging with WheelTMWE]] and with better empirical complexity, of about <math>O(n^{1.10..1.05})</math> in producing first few millions of primes, with constant memory footprint. 
−  
−  When the (commented out above) explicit type signature is added, making the same code work on ''unboxed arrays'', the resulting GHCcompiled code runs more than twice faster and with better empirical complexity, of about <math>O(n^{1.10..1.05})</math> in producing first few millions of primes, with smaller though still growing memory footprint. Fixed sized segments are usually used in segmented sieves to limit the memory usage. 

=== Calculating Primes Upto a Given Value === 
=== Calculating Primes Upto a Given Value === 
Latest revision as of 09:10, 29 March 2014
In mathematics, amongst the natural numbers greater than 1, a prime number (or a prime) is such that has no divisors other than itself (and 1).
[edit] 1 Prime Number Resources
 At Wikipedia:
 HackageDB packages:
 arithmoi: Various basic number theoretic functions; efficient arraybased sieves, Montgomery curve factorization ...
 Numbers: An assortment of number theoretic functions.
 NumberSieves: Number Theoretic Sieves: primes, factorization, and Euler's Totient.
 primes: Efficient, purely functional generation of prime numbers.
 Papers:
 O'Neill, Melissa E., "The Genuine Sieve of Eratosthenes", Journal of Functional Programming, Published online by Cambridge University Press 9 October 2008 doi:10.1017/S0956796808007004.
[edit] 2 Definition
In mathematics, amongst the natural numbers greater than 1, a prime number (or a prime) is such that has no divisors other than itself (and 1). The smallest prime is thus 2. Nonprime numbers are known as composite, i.e. those representable as product of two natural numbers greater than 1. The set of prime numbers is thus
 P = {n ∈ N_{2} : (∀ m ∈ N_{2}) ((m  n) ⇒ m = n)}
 = {n ∈ N_{2} : (∀ m ∈ N_{2}) (m×m ≤ n ⇒ ¬(m  n))}
 = N_{2} \ {n×m : n,m ∈ N_{2}}
 = N_{2} \ ⋃ { {n×m : m ∈ N_{n}} : n ∈ N_{2} }
 = N_{2} \ ⋃ { {n×n, n×n+n, n×n+2×n, ...} : n ∈ N_{2} }
 = N_{2} \ ⋃ { {n×n, n×n+n, n×n+2×n, ...} : n ∈ P }
 where N_{k} = {n ∈ N : n ≥ k}
 = N_{2} \ ⋃ { {n×n, n×n+n, n×n+2×n, ...} : n ∈ P }
Thus starting with 2, for each newly found prime we can eliminate from the rest of the numbers all the multiples of this prime, giving us the next available number as next prime. This is known as sieving the natural numbers, so that in the end all the composites are eliminated and what we are left with are just primes. (This is what the last formula is describing, though seemingly impredicative, because it is selfreferential. But because N_{2} is wellordered (with the order being preserved under addition), the formula is welldefined.)
To eliminate a prime's multiples we can either a. test each new candidate number for divisibility by that prime, giving rise to a kind of trial division algorithm; or b. we can directly generate the multiples of a prime p
by counting up from it in increments of p
, resulting in a variant of the sieve of Eratosthenes.
Having a directaccess mutable arrays indeed enables easy marking off of these multiples on preallocated array as it is usually done in imperative languages; but to get an efficient listbased code we have to be smart about combining those streams of multiples of each prime  which gives us also the memory efficiency in generating the results incrementally, one by one.
[edit] 3 Sieve of Eratosthenes
The sieve of Eratosthenes calculates primes as integers above 1 that are not multiples of primes, i.e. not composite — whereas composites are found as a union of sequences of multiples of each prime, generated by counting up from the prime's square in constant increments equal to that prime (or twice that much, for odd primes):
primes = 2 : 3 : ([5,7..] `minus` _U [[p*p, p*p+2*p..]  p < tail primes])  Using `under n = takeWhile (< n)`, `minus` and `_U` satisfy  under n (minus a b) == under n a \\ under n b  under n (union a b) == nub . sort $ under n a ++ under n b  under n . _U == nub . sort . concat . map (under n)
The definition is primed with 2 and 3 as initial primes, to avoid the vicious circle.
_U
can be defined as a folding of (\(x:xs) > (x:) . union xs)
, or it can use an Bool
array as a sorting and duplicatesremoving device. The processing naturally divides up into segments between the successive squares of primes.
Stepwise development follows (the fully developed version is here).
[edit] 3.1 Initial definition
To start with, the sieve of Eratosthenes can be genuinely represented by
 genuine yet wasteful sieve of Eratosthenes primesTo m = eratos [2..m] where eratos [] = [] eratos (p:xs) = p : eratos (xs `minus` [p, p+p..m])  eratos (p:xs) = p : eratos (xs `minus` map (p*) [1..m])  eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs))  turner (p:xs) = p : turner [x  x < xs, rem x p /= 0]
This should be regarded more like a specification, not a code. It runs at empirical orders of growth worse than quadratic in number of primes produced. But it has the core defining features of the classical formulation of S. of E. as a. being bounded, i.e. having a top limit value, and b. finding out the multiples of a prime directly, by counting up from it in constant increments, equal to that prime.
The canonical listdifference minus
and duplicatesremoving union
functions dealing with ordered increasing lists (cf. Data.List.Ordered package) are:
 ordered lists, difference and union minus (x:xs) (y:ys) = case (compare x y) of LT > x : minus xs (y:ys) EQ > minus xs ys GT > minus (x:xs) ys minus xs _ = xs union (x:xs) (y:ys) = case (compare x y) of LT > x : union xs (y:ys) EQ > x : union xs ys GT > y : union (x:xs) ys union xs [] = xs union [] ys = ys
The name merge ought to be reserved for duplicatespreserving merging operation of mergesort.
[edit] 3.2 Analysis
So for each newly found prime the sieve discards its multiples, enumerating them by counting up in steps of p. There are thus O(m / p) multiples generated and eliminated for each prime, and O(mloglog(m)) multiples in total, with duplicates, by virtues of prime harmonic series.
If each multiple is dealt with in O(1) time, this will translate into O(mloglog(m)) RAM machine operations (since we consider addition as an atomic operation). Indeed mutable randomaccess arrays allow for that. But lists in Haskell are sequentialaccess, and complexity of minus(a,b)
for lists is instead of of the direct access destructive array update. The lower the complexity of each minus step, the better the overall complexity.
So on kth step the argument list (p:xs)
that the eratos
function gets starts with the (k+1)th prime, and consists of all the numbers ≤ m coprime with all the primes ≤ p. According to the M. O'Neill's article (p.10) there are such numbers.
It looks like for our intents and purposes. Since the number of primes below m is n = π(m) = O(m / log(m)) by the prime number theorem (where π(m) is a prime counting function), there will be n multiplesremoving steps in the algorithm; it means total complexity of at least O(mn / log(n)) = O(m^{2} / (log(m))^{2}), or O(n^{2}) in n primes produced  much much worse than the optimal O(nlog(n)loglog(n)).
[edit] 3.3 From Squares
But we can start each elimination step at a prime's square, as its smaller multiples will have been already produced and discarded on previous steps, as multiples of smaller primes. This means we can stop early now, when the prime's square reaches the top value m, and thus cut the total number of steps to around . This does not in fact change the complexity of randomaccess code, but for lists it makes it O(m^{1.5} / (logm)^{2}), or O(n^{1.5} / (logn)^{0.5}) in n primes produced, a dramatic speedup:
primesToQ m = eratos [2..m] where eratos [] = [] eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..m])  eratos (p:xs) = p : eratos (xs `minus` map (p*) [p..div m p])  eulers (p:xs) = p : eulers (xs `minus` map (p*) (takeWhile (<= div m p) (p:xs)))  turner (p:xs) = p : turner [x  x<xs, x<p*p  rem x p /= 0]
Its empirical complexity is around O(n^{1.45}). This simple optimization works here because this formulation is bounded (by an upper limit). To start late on a bounded sequence is to stop early (starting past end makes an empty sequence – see warning below^{1}), thus preventing the creation of all the superfluous multiples streams which start above the upper bound anyway (note that Turner's sieve is unaffected by this). This is acceptably slow now, striking a good balance between clarity, succinctness and efficiency.
^{1}Warning: this is predicated on a subtle point of minus xs [] = xs
definition being used, as it indeed should be. If the definition minus (x:xs) [] = x:minus xs []
is used, the problem is back and the complexity is bad again.
[edit] 3.4 Guarded
This ought to be explicated (improving on clarity, though not on time complexity) as in the following, for which it is indeed a minor optimization whether to start from p or p*p  because it explicitly stops as soon as possible:
primesToG m = 2 : sieve [3,5..m] where sieve (p:xs)  p*p > m = p : xs  otherwise = p : sieve (xs `minus` [p*p, p*p+2*p..])  p : sieve (xs `minus` map (p*) [p,p+2..])  p : eulers (xs `minus` map (p*) (p:xs))
(here we also dismiss all evens above 2 a priori.) It is now clear that it can't be made unbounded just by abolishing the upper bound m, because the guard can not be simply omitted without changing the complexity back for the worst.
[edit] 3.5 Accumulating Array
So while minus(a,b)
takes O(  b  ) operations for randomaccess imperative arrays and about O(  a  ) operations here for ordered increasing lists of numbers, using Haskell's immutable array for a one could expect the array update time to be nevertheless closer to O(  b  ) if destructive update were used implicitly by compiler for an array being passed along as an accumulating parameter:
{# OPTIONS_GHC O2 #} import Data.Array.Unboxed primesToA m = sieve 3 (array (3,m) [(i,odd i)  i<[3..m]] :: UArray Int Bool) where sieve p a  p*p > m = 2 : [i  (i,True) < assocs a]  a!p = sieve (p+2) $ a//[(i,False)  i < [p*p, p*p+2*p..m]]  otherwise = sieve (p+2) a
Indeed for unboxed arrays, with the type signature added explicitly (suggested by Daniel Fischer), the above code runs pretty fast, with empirical complexity of about O(n^{1.15..1.45}) in n primes produced (for producing from few hundred thousands to few millions primes, memory usage also slowly growing). But it relies on specific compiler optimizations, and indeed if we remove the explicit type signature, the code above turns very slow.
How can we write code that we'd be certain about? One way is to use explicitly mutable monadic arrays (see below), but we can also think about it a little bit more on the functional side of things still.
[edit] 3.6 Postponed
Going back to guarded Eratosthenes, first we notice that though it works with minimal number of prime multiples streams, it still starts working with each prematurely. Fixing this with explicit synchronization won't change complexity but will speed it up some more:
primesPE1 = 2 : sieve [3..] primesPE1 where sieve xs (p:ps)  q < p*p , (h,t) < span (< q) xs = h ++ sieve (t `minus` [q, q+p..]) ps  h ++ turner [xx<t, rem x p /= 0] ps
Inlining and fusing span
and (++)
we get:
primesPE = 2 : primes' where primes' = sieve [3,5..] 9 primes' sieve (x:xs) q ps@ ~(p:t)  x < q = x : sieve xs q ps  otherwise = sieve (xs `minus` [q, q+2*p..]) (head t^2) t
Since the removal of a prime's multiples here starts at the right moment, and not just from the right place, the code can now finally be made unbounded. Because no multiplesremoval process is started prematurely, there are no extraneous multiples streams, which were the reason for the original formulation's extreme inefficiency.
[edit] 3.7 Segmented
With work done segmentwise between the successive squares of primes it becomes
primesSE = 2 : primes' where primes' = sieve 3 9 primes' [] sieve x q ~(p:t) fs = foldr (flip minus) [x,x+2..q2] [[y+s, y+2*s..q]  (s,y) < fs] ++ sieve (q+2) (head t^2) t ((2*p,q):[(s,qrem (qy) s)  (s,y) < fs])
This "marks" the odd composites in a given range by generating them  just as a person performing the original sieve of Eratosthenes would do, counting one by one the multiples of the relevant primes. These composites are independently generated so some will be generated multiple times.
The advantage to working in spans explicitly is that this code is easily amendable to using arrays for the composites marking and removal on each finite span; and memory usage can be kept in check by using fixed sized segments.
[edit] 3.7.1 Segmented Treemerging
Rearranging the chain of subtractions into a subtraction of merged streams (see below) and using treelike folding structure, further speeds up the code and significantly improves its asymptotic time behavior (down to about O(n^{1.28}empirically), space is leaking though):
primesSTE = 2 : primes' where primes' = sieve 3 9 primes' [] sieve x q ~(p:t) fs = ([x,x+2..q2] `minus` joinST [[y+s, y+2*s..q]  (s,y) < fs]) ++ sieve (q+2) (head t^2) t ((++ [(2*p,q)]) [(s,qrem (qy) s)  (s,y) < fs]) joinST (xs:t) = (union xs . joinST . pairs) t where pairs (xs:ys:t) = union xs ys : pairs t pairs t = t joinST [] = []
[edit] 3.8 Linear merging
But segmentation doesn't add anything substantially, and each multiples stream starts at its prime's square anyway. What does the Postponed code do, operationally? For each prime's square passed over, it builds up a nested linear leftdeepening structure, (...((xsa)b)...), where xs is the original oddsproducing [3,5..] list, so that each odd it produces must go through more and more `minus`
nodes on its way up  and those odd numbers that eventually emerge on top are prime. Thinking a bit about it, wouldn't another, rightdeepening structure, (xs(a+(b+...))), be better? This idea is due to Richard Bird, seen in his code presented in M. O'Neill's article, equivalent to:
primesB = (2:) . minus [3..] . foldr (\p r> p*p : union [p*p+p, p*p+2*p..] r) [] $ primesB
or,
primesLME1 = 2 : primes' where primes' = 3 : ([5,7..] `minus` joinL [[p*p, p*p+2*p..]  p < primes']) joinL ((x:xs):t) = x : union xs (joinL t)
Here, xs stays near the top, and more frequently oddsproducing streams of multiples of smaller primes are above those of the bigger primes, that produce less frequently their candidates which have to pass through more `union`
nodes on their way up. Plus, no explicit synchronization is necessary anymore because the produced multiples of a prime start at its square anyway  just some care has to be taken to avoid a runaway access to the indefinitelydefined structure (specifically, if each (+)/union
operation passes along unconditionally its left child's head value before polling the right child's head value, then we are safe).
To prevent unneeded memoization (a memory leak), Melissa O'Neill introduces double primes feed in her ZIP file's code. We can even do multistage. Here's the code, faster still and with radically reduced memory consumption, with empirical orders of growth of about ~ n^{1.25..1.40}:
primesLME = 2 : _Y ((3:) . minus [5,7..] . joinL . map (\p> [p*p, p*p+2*p..])) _Y :: (t > t) > t _Y g = g (_Y g)  multistage, recursive  g x where x = g x  two stages
_Y
is a nonsharing fixpoint combinator for a recursive "telescoping" multistage primes production.
This allows the primesLME
stream to be discarded immediately as it is being consumed by its consumer. With primes'
from primesLME1
definition above it is impossible, as each produced element of primes'
is needed later as input to the same primes'
corecursive stream definition. So the primes'
stream feeds in a loop into itself, and thus is retained in memory. With multistage production, each stage feeds into its consumer up at the square of its current element which can be immediately discarded after it's been consumed. (3:)
jumpstarts the whole thing.
[edit] 3.9 Tree merging
Moreover, it can be changed into a tree structure. This idea is due to Dave Bayer and Heinrich Apfelmus. Twostages production of primes due to M. O'Neill:
primesTME = 2 : _Y ((3:) . gaps 5 . joinT . map (\p> [p*p, p*p+2*p..])) joinT ((x:xs):t) = x : (union xs . joinT . pairs) t  ~= nub.sort.concat where pairs (xs:ys:t) = union xs ys : pairs t gaps k s@(x:xs)  k<x = k:gaps (k+2) s  ~= [k,k+2..]\\s, when  True = gaps (k+2) xs  k<=x && null(s\\[k,k+2..])
This code is pretty fast, running at speeds and empirical complexities comparable with the code from Melissa O'Neill's article (about O(n^{1.2}) in number of primes n produced).
As an aside, joinT
is equivalent to infinite treelike folding foldi (\(x:xs) ys> x:union xs ys) []
:
[edit] 3.10 Tree merging with Wheel
Wheel factorization optimization can be further applied, and another tree structure can be used which is better adjusted for the primes multiples production (effecting about 5%10% at the top of a total 2.5x speedup w.r.t. the above tree merging on odds only, for first few million primes):
primesTMWE = [2,3,5,7] ++ _Y ((11:) . tail . gapsW 11 wheel . joinT . hitsW 11 wheel) gapsW k (d:w) s@(c:cs)  k < c = k : gapsW (k+d) w s  otherwise = gapsW (k+d) w cs  k==c hitsW k (d:w) s@(p:ps)  k < p = hitsW (k+d) w s  otherwise = scanl (\c d>c+p*d) (p*p) (d:w) : hitsW (k+d) w ps  k==p wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2: 4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel
[edit] 3.10.1 Above Limit  Offset Sieve
Another task is to produce primes above a given value:
{# OPTIONS_GHC O2 fnocse #} primesFromTMWE primes m = dropWhile (< m) [2,3,5,7,11] ++ gapsW a wh' (compositesFrom a) where (a,wh') = rollFrom (snapUp (max 3 m) 3 2) (h,p':t) = span (< z) $ drop 4 primes  p < z => p*p<=a z = ceiling $ sqrt $ fromIntegral a + 1  p'>=z => p'*p'>a compositesFrom a = joinT (joinST [multsOf p a  p < h ++ [p']] : [multsOf p (p*p)  p < t] ) snapUp v o step = v + (mod (ov) step)  full steps from o multsOf p from = scanl (\c d>c+p*d) (p*x) wh  map (p*) $ where  scanl (+) x wh (x,wh) = rollFrom (snapUp from p (2*p) `div` p)  , if p < from wheelNums = scanl (+) 0 wheel rollFrom n = go wheelNums wheel where m = (n11) `mod` 210 go (x:xs) ws@(w:ws')  x < m = go xs ws'  True = (n+xm, ws)  (x >= m)
A certain preprocessing delay makes it worthwhile when producing more than just a few primes, otherwise it degenerates into simple trial division, which is then ought to be used directly:
primesFrom m = filter isPrime [m..]
[edit] 3.11 Mapbased
Runs ~1.7x slower than TME version, but with the same empirical time complexity, ~n^{1.2} (in n primes produced) and same very low (near constant) memory consumption:
import Data.List  based on http://stackoverflow.com/a/1140100 import qualified Data.Map as M primesMPE :: [Integer] primesMPE = 2:mkPrimes 3 M.empty prs 9  postponed addition of primes into map; where  decoupled primes loop feed prs = 3:mkPrimes 5 M.empty prs 9 mkPrimes n m ps@ ~(p:t) q = case (M.null m, M.findMin m) of (False, (n', skips))  n == n' > mkPrimes (n+2) (addSkips n (M.deleteMin m) skips) ps q _ > if n<q then n : mkPrimes (n+2) m ps q else mkPrimes (n+2) (addSkip n m (2*p)) t (head t^2) addSkip n m s = M.alter (Just . maybe [s] (s:)) (n+s) m addSkips = foldl' . addSkip
[edit] 4 Turner's sieve  Trial division
David Turner's original 1975 formulation (SASL Language Manual, 1975) replaces nonstandard `minus` in the sieve of Eratosthenes by stock list comprehension with rem filtering, turning it into a kind of trial division algorithm:
 unbounded sieve, premature filters primesT = sieve [2..] where sieve (p:xs) = p : sieve [x  x < xs, rem x p /= 0]  filter ((/=0).(`rem`p)) xs
This creates an immense number of superfluous implicit filters in extremely premature fashion. To be admitted as prime, each number will be tested for divisibility here by all its preceding primes potentially, while just those not greater than its square root would suffice. To find e.g. the 1001st prime (7927
), 1000 filters are used, when in fact just the first 24 are needed (up to 89
's filter only). Operational overhead is enormous here.
[edit] 4.1 Guarded Filters
But this really ought to be changed into bounded and guarded variant, again achieving the "miraculous" complexity improvement from above quadratic to about O(n^{1.45}) empirically (in n primes produced):
primesToGT m = 2 : sieve [3,5..m] where sieve (p:xs)  p*p > m = p : xs  True = p : sieve [x  x < xs, rem x p /= 0]
[edit] 4.2 Postponed Filters
Or it can remain unbounded, just filters creation must be postponed:
primesPT1 = 2 : sieve primesPT1 [3..] where sieve (p:ps) xs = let (h,t) = span (< p*p) xs in h ++ sieve ps [x  x<t, rem x p /= 0]
This is better rewritten with span
and (++)
inlined and fused into the sieve
:
primesPT = 2 : primes' where primes' = sieve [3,5..] 9 primes' sieve (x:xs) q ps@ ~(p:t)  x < q = x : sieve xs q ps  True = sieve [x  x < xs, rem x p /= 0] (head t^2) t
creating here as well the linear nested structure at runtime, (...(([3,5..] >>= filterBy [3]) >>= filterBy [5])...)
, where filterBy ds n = [n  noDivs n ds]
(see noDivs
definition below) – but unlike the original code, each filter being created at its proper moment, not sooner than the prime's square is seen.
[edit] 4.3 Optimal trial division
The above is equivalent to the traditional formulation of trial division,
ps = 2 : [i  i < [3..], and [rem i p > 0  p < takeWhile ((<=i).(^2)) ps]]
or,
noDivs n = foldr (\f r > f*f > n  (rem n f /= 0 && r)) True  primes = filter (`noDivs`[2..]) [2..] primesTD = 2 : 3 : filter (`noDivs` tail primesTD) [5,7..] isPrime n = n > 1 && noDivs n primesTD
except that this code is rechecking for each candidate number which primes to use, whereas for every candidate number in each segment between the successive squares of primes these will just be the same prefix of the primes list being built.
[edit] 4.4 Segmented Generate and Test
Next we turn a list of filters into one filter by an explicit list, each one in a progression of prefixes of the primes list. This seems to eliminate most recalculations, explicitly filtering composites out from batches of odds between the consecutive squares of primes.
import Data.List primesST = 2 : oddprimes where oddprimes = sieve 3 9 oddprimes (inits oddprimes)  [],[3],[3,5],... sieve x q ~(_:t) (fs:ft) = filter ((`all` fs) . ((/=0).) . rem) [x,x+2..q2] ++ sieve (q+2) (head t^2) t ft
[edit] 4.4.1 Generate and Test Above Limit
The following will start the segmented Turner sieve at the right place, using any primes list it's supplied with (e.g. TMWE etc.) or itself, as shown, demand computing it just up to the square root of any prime it'll produce:
primesFromST m  m > 2 = sieve (m`div`2*2+1) (head ps^2) (tail ps) (inits ps) where (h,ps) = span (<= (floor.sqrt $ fromIntegral m+1)) oddprimes sieve x q ps (fs:ft) = filter ((`all` (h ++ fs)) . ((/=0).) . rem) [x,x+2..q2] ++ sieve (q+2) (head ps^2) (tail ps) ft oddprimes = 3 : primesFromST 5  odd primes
This is usually faster than testing candidate numbers for divisibility one by one which has to refetch anew the needed prime factors to test by, for each candidate. Faster is the offset sieve of Eratosthenes on odds, and yet faster the above one, w/ wheel optimization.
[edit] 4.5 Conclusions
All these variants being variations of trial division, finding out primes by direct divisibility testing of every candidate number by sequential primes below its square root (instead of just by its factors, which is what direct generation of multiples is doing, essentially), are thus principally of worse complexity than that of Sieve of Eratosthenes.
The initial code is just a oneliner that ought to have been regarded as executable specification in the first place. It can easily be improved quite significantly with a simple use of bounded, guarded formulation to limit the number of filters it creates, or by postponement of filter creation. BTW were divisibility testing somehow turned into an O(1) operation, e.g. by some kind of massive parallelization, the overall complexity of trial division sieve would drop to just O(n).
[edit] 5 Euler's Sieve
[edit] 5.1 Unbounded Euler's sieve
With each found prime Euler's sieve removes all its multiples in advance so that at each step the list to process is guaranteed to have no multiples of any of the preceding primes in it (consists only of numbers coprime with all the preceding primes) and thus starts with the next prime:
primesEU = 2 : eulers [3,5..] where eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs))  eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+2*p..])
This code is extremely inefficient, running above O(n^{2}) empirical complexity (and worsening rapidly), and should be regarded a specification only. Its memory usage is very high, with empirical space complexity just below O(n^{2}), in n primes produced.
In the streambased sieve of Eratosthenes we are able to skip along the input stream xs
directly to the prime's square, consuming the whole prefix at once, thus achieving the results equivalent to the postponement technique, because the generation of the prime's multiples is independent of the rest of the stream.
But here in the Euler's sieve it is dependent on all xs
and we're unable in principle to skip along it to the prime's square  because all xs
are needed for each prime's multiples generation. Thus efficient unbounded streambased implementation seems to be impossible in principle, under the simple scheme of producing the multiples by multiplication.
[edit] 5.2 Wheeled list representation
The situation can be somewhat improved using a different list representation, for generating lists not from a last element and an increment, but rather a last span and an increment, which entails a set of helpful equivalences:
{ fromElt (x,i) = x : fromElt (x + i,i) === iterate (+ i) x [n..] === fromElt (n,1) === fromSpan ([n],1) [n,n+2..] === fromElt (n,2) === fromSpan ([n,n+2],4) } fromSpan (xs,i) = xs ++ fromSpan (map (+ i) xs,i) { === concat $ iterate (map (+ i)) xs fromSpan (p:xt,i) === p : fromSpan (xt ++ [p + i], i) fromSpan (xs,i) `minus` fromSpan (ys,i) === fromSpan (xs `minus` ys, i) map (p*) (fromSpan (xs,i)) === fromSpan (map (p*) xs, p*i) fromSpan (xs,i) === forall (p > 0). fromSpan (concat $ take p $ iterate (map (+ i)) xs, p*i) } spanSpecs = iterate eulerStep ([2],1) eulerStep (xs@(p:_), i) = ( (tail . concat . take p . iterate (map (+ i))) xs `minus` map (p*) xs, p*i ) { > mapM_ print $ take 4 spanSpecs ([2],1) ([3],2) ([5,7],6) ([7,11,13,17,19,23,29,31],30) }
Generating a list from a span specification is like rolling a wheel as its pattern gets repeated over and over again. For each span specification w@((p:_),_)
produced by eulerStep
, the numbers in (fromSpan w)
up to p^{2} are all primes too, so that
eulerPrimesTo m = if m > 1 then go ([2],1) else [] where go w@((p:_), _)  m < p*p = takeWhile (<= m) (fromSpan w)  True = p : go (eulerStep w)
This runs at about O(n^{1.5..1.8}) complexity, for n
primes produced, and also suffers from a severe space leak problem (IOW its memory usage is also very high).
[edit] 6 Using Immutable Arrays
[edit] 6.1 Generating Segments of Primes
The sieve of Eratosthenes' removal of multiples on each segment of odds can be done by actually marking them in an array, instead of manipulating ordered lists, and can be further sped up more than twice by working with odds only:
import Data.Array.Unboxed primesSA :: [Int] primesSA = 2 : prs where prs = 3 : sieve prs 3 [] sieve (p:ps) x fs = [i*2 + x  (i,True) < assocs a] ++ sieve ps (p*p) ((p,0) : [(s, rem (yq) s)  (s,y) < fs]) where q = (p*px)`div`2 a :: UArray Int Bool a = accumArray (\ b c > False) True (1,q1) [(i,())  (s,y) < fs, i < [y+s, y+s+s..q]]
Runs significantly faster than TMWE and with better empirical complexity, of about O(n^{1.10..1.05}) in producing first few millions of primes, with constant memory footprint.
[edit] 6.2 Calculating Primes Upto a Given Value
Equivalent to Accumulating Array above, running somewhat faster (compiled by GHC with optimizations turned on):
{# OPTIONS_GHC O2 #} import Data.Array.Unboxed primesToNA n = 2: [i  i < [3,5..n], ar ! i] where ar = f 5 $ accumArray (\ a b > False) True (3,n) [(i,())  i < [9,15..n]] f p a  q > n = a  True = if null x then a' else f (head x) a' where q = p*p a' :: UArray Int Bool a'= a // [(i,False)  i < [q, q+2*p..n]] x = [i  i < [p+2,p+4..n], a' ! i]
[edit] 6.3 Calculating Primes in a Given Range
primesFromToA a b = (if a<3 then [2] else []) ++ [i  i < [o,o+2..b], ar ! i] where o = max (if even a then a+1 else a) 3 r = floor . sqrt $ fromIntegral b + 1 ar = accumArray (\a b> False) True (o,b) [(i,())  p < [3,5..r] , let q = p*p s = 2*p (n,x) = quotRem (o  q) s q' = if o <= q then q else q + (n + signum x)*s , i < [q',q'+s..b] ]
Although using odds instead of primes, the array generation is so fast that it is very much feasible and even preferable for quick generation of some short spans of relatively big primes.
[edit] 7 Using Mutable Arrays
Using mutable arrays is the fastest but not the most memory efficient way to calculate prime numbers in Haskell.
[edit] 7.1 Using ST Array
This method implements the Sieve of Eratosthenes, similar to how you might do it in C, modified to work on odds only. It is fast, but about linear in memory consumption, allocating one (though apparently packed) sieve array for the whole sequence to process.
import Control.Monad import Control.Monad.ST import Data.Array.ST import Data.Array.Unboxed sieveUA :: Int > UArray Int Bool sieveUA top = runSTUArray $ do let m = (top1) `div` 2 r = floor . sqrt $ fromIntegral top + 1 sieve < newArray (1,m) True  :: ST s (STUArray s Int Bool) forM_ [1..r `div` 2] $ \i > do isPrime < readArray sieve i when isPrime $ do  ((2*i+1)^21)`div`2 == 2*i*(i+1) forM_ [2*i*(i+1), 2*i*(i+2)+1..m] $ \j > do writeArray sieve j False return sieve primesToUA :: Int > [Int] primesToUA top = 2 : [i*2+1  (i,True) < assocs $ sieveUA top]
Its empirical time complexity is improving with n (number of primes produced) from above O(n^{1.20}) towards O(n^{1.16}). The reference C++ vectorbased implementation exhibits this improvement in empirical time complexity too, from O(n^{1.5}) gradually towards O(n^{1.12}), where tested (which might be interpreted as evidence towards the expected quasilinearithmic O(nlog(n)log(logn)) time complexity).
[edit] 7.2 Bitwise prime sieve with Template Haskell
Count the number of prime below a given 'n'. Shows fast bitwise arrays, and an example of Template Haskell to defeat your enemies.
{# OPTIONS O2 optcO XBangPatterns #} module Primes (nthPrime) where import Control.Monad.ST import Data.Array.ST import Data.Array.Base import System import Control.Monad import Data.Bits nthPrime :: Int > Int nthPrime n = runST (sieve n) sieve n = do a < newArray (3,n) True :: ST s (STUArray s Int Bool) let cutoff = truncate (sqrt $ fromIntegral n) + 1 go a n cutoff 3 1 go !a !m cutoff !n !c  n >= m = return c  otherwise = do e < unsafeRead a n if e then if n < cutoff then let loop !j  j < m = do x < unsafeRead a j when x $ unsafeWrite a j False loop (j+n)  otherwise = go a m cutoff (n+2) (c+1) in loop ( if n < 46340 then n * n else n `shiftL` 1) else go a m cutoff (n+2) (c+1) else go a m cutoff (n+2) c
And place in a module:
{# OPTIONS fth #} import Primes main = print $( let x = nthPrime 10000000 in [ x ] )
Run as:
$ ghc make o primes Main.hs $ time ./primes 664579 ./primes 0.00s user 0.01s system 228% cpu 0.003 total
[edit] 8 Implicit Heap
See Implicit Heap.
[edit] 9 Prime Wheels
See Prime Wheels.
[edit] 10 Using IntSet for a traditional sieve
See Using IntSet for a traditional sieve.
[edit] 11 Testing Primality, and Integer Factorization
See Testing primality:
[edit] 12 Oneliners
See primes oneliners.
[edit] 13 External links
 A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pureHaskell prime generators; code by Melissa O'Neill.
 WARNING: Don't use the priority queue from older versions of that file for your projects: it's broken and works for primes only by a lucky chance. The revised version of the file fixes the bug, as pointed out by Eugene Kirpichov on February 24, 2009 on the haskellcafe mailing list, and fixed by Bertram Felgenhauer.
 test entries for (some of) the above code variants.
 Speed/memory comparison table for sieve of Eratosthenes variants.