Prime Number Searches — Functional and Imperative

tl;dr Implementations of the Sieve of Eratosthenes prime number search algorithm are often given as functional programming language examples to show off their brevity. However, closer inspection reveals that some implementations are actually not the Sieve but use trial by division. I compare prime number searches using imperative Sieve, functional trial by division, and functional Sieve implementations in Common Lisp. The imperative implementation wins in terms of raw performance, while with a suitable algorithm the functional version is brief and its performance is acceptable. However, the Eratosthenes Sieve is an explicitly imperative algorithm and is a poor fit for FP; my functional solution is neither small nor elegant, and its performance is terrible. Careful choice of data structures may alleviate this at the expense of complexity.

Edit: (2013/2/26) See my later blog post . for an update, where I revisit the problem with different functional approaches that offer better performance and scalability while still following the Eratosthenes Sieve approach.

I’m currently exploring functional programming using Common Lisp as a vehicle. An example that is sometimes held up to show off power and brevity of functional programming languages is a prime number search by the Sieve of Erathosthenes. An example on the Miranda home page is given as:

        
    primes = sieve [2..]
             where
             sieve (p:x) = p : sieve [n | n <- x; n mod p > 0]

Miranda is a lazy evaluation, pure functional language. However, I’m currently trying to make Common Lisp my language of choice, and I attempted to create imperative and functional versions of the Sieve of Eratosthenes for comparison.

However, while I was digging through Rosetta Code (http://rosettacode.org/wiki/Sieve_of_Eratosthenes) to see how others tackled the problem in eager evaluation functional languages, I came across a few implementations that had been highlighted because they did not actually implement the Sieve of Erathosthenes algorithm but rather used trial division. My suspicions were aroused and looking more closely at the example on the Miranda homepage, yes is uses trial division, so is not the bona fide Sieve. The Sieve of Eratosthenes is an algorithm—a description of a method for solving a problem—so implementing another method and then claiming it to be the Sieve of Eratosthenes is false even if it achieves the same result.

I therefore set about to create imperative and two functional versions, one using trial division and the other the Sieve of Eratosthenes algorithm, for comparison.

Edit: Melissa E. O’Neill has also noticed this, and pipped me to the post with a sieve in Haskell. Check out her paper at http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf.

Sieve of Eratothenes

As I understand it, the Sieve of Eratothenes algorithm to find primes up to some number n is basically as follows:

  • Create a table of prime number candidates from 2 to n.
  • Starting with 2, proceed to “cross out” multiples of 2 from the table. (Since they’re multiples of a prime they can’t ipso facto be prime themselves so can be eliminated from consideration.)
  • The next number in the table that hasn’t been crossed out is the next prime. Record it, then cross out its multiples from the table.
  • Rinse and repeat till you run out of candidate numbers.

A permissible optimisation (for Rosetta Code entries) is, for prime k, to start the crossing out with the square of k, since smaller multiples will have been taken care of by the earlier crossings out.

Imperative version

The Sieve of Eratosthenes is straightforward to implement in an imperative language, including BASIC and assembly language. In Common Lisp:

 1) (defun imperative-prime (n)
 2)    "Sieve of Eratosthenes to find primes up to n.
 3)     Imperative implementation."
 4)
 5)    (let ((sieve (make-array (1+ n) :element-type '(unsigned-byte 8)
 6)                                    :initial-element 0)))
 7)
 8)      (loop for i from 2 upto n when (zerop (aref sieve i)) do
 9)        (loop for j from (* i i) upto n by i do (setf (aref sieve j) 1)))
10)
11)      (loop for i from 2 upto n when (zerop (aref sieve i)) collect i)))

Lines 5–6 create an array to hold our “crossing out” table, initialised to all zeros (no values crossed out). The only slight complication is that in CL the first value in the array has the index 0, so I create an extra element so that the table entry for number k has the index k rather than k-1. Lines 8–9 walk through the table crossing out multiples of each prime found by setting them to 1, and line 11 collects all the numbers for which the table entry is 0 (the primes) into a list. Eleven lines of code, including comments and blanks for readability.

Functional Version 1 (trial division)

My first stab at a functional version is also short and simple.

 1) (defun functional-prime-1 (n)
 2)  "Find primes up to n.
 3)   Functional implementation using trial division."
 4)
 5)   (labels ((x-primes (primes candidates)
 6)     (if (null candidates)
 7)        primes
 8)        (let ((pr (car candidates)))
 9)          (x-primes (cons pr primes)
10)                     (remove-if #'(lambda (x) (zerop (rem x pr))) 
11)                                  (cdr candidates)))))))
12)
13)      (nreverse 
14)        (x-primes nil (loop for i from 2 upto n collect i)))))

The syntax of Lisp can make this hard to follow but the approach is classic recursion. Lines 5–11 define a local auxiliary function that takes as its arguments two lists: a list of primes we’ve found so far and a list of candidate numbers. We first test whether the list of candidates is empty (line 6) and if it is, we simply return the list of primes (line 7). Otherwise, we take the next prime number from the list of candidates, pr (line 8), then call the auxiliary function again with pr tacked onto the primes list (line 9) and the list of candidates with multiples of pr removed (line 10). Finally, lines 12–13 kick off the recursive function with initial values of the primes list (empty) and the candidates (natural numbers from 2 to n). A small detail is that because it’s much quicker to add an item to the front of a list than the back, the list of primes is built backwards so line 12 reverses it before returning it to the caller.

The same algorithm can be expressed in Miranda (which I personally find easier to grok):

x_primes :: [num] -> [num]
x_primes primes, []      = primes
x_primes primes, [x:xs]  = x_primes( x ++ primes, [k | k <- xs; k mod x > 0])

functional_prime_1 :: num -> [num]
functional_prime_1 n = reverse( x-primes([], [1..n]) )

Elegant and beautiful, no? Maybe, but is is not the Sieve; it’s using trial division to remove candidates, rather than crossing off multiples in a list, and testing every candidate.

Functional Version 2 (Sieve?)

This is my attempt to implement the Sieve in a functional style using Common Lisp. I aim for simplicity and correctness, at the expense of some optimisations; for example, I choose to represent the sieve as a list rather than an array.

For each prime k, we need to “set” the list elements corresponding to multiples of k to 1 (starting at k2). The data are immutable, so I chose a recursive approach, at each step creating a new sieve for each prime with just its multiples set, then combining it with the sieve from the previous step before passing it to the next iteration. Slightly differing from the iterative version, I also cross out the prime itself, and accumulate primes in a separate list. That way, finding the next prime simply consists of searching for the first element of the sieve that is 0.

For example, the initial sieve is all zeros except for the first element that corresponds to the natural number 1.

primes -> ()
sieve  -> (1 0 0 0 0 0 0 0 0 0 ...)

The first prime is the first element set to zero, which is the second in the list and so the number two. We add this to the list of primes, then generate a sieve for 2 and its multiples starting at 4 (22).

primes  -> ( 2 )
2-sieve -> (0 1 0 1 0 1 0 1 0 1 ...)
sieve   -> (1 0 0 0 0 0 0 0 0 0 ...)

We then recursive with the primes list and a new sieve generated by the logical or of the original sieve and the 2-sieve

primes -> ( 2 )
sieve  -> (1 1 0 1 0 1 0 1 0 1 ...)

The condition to terminate recursion is when there are no more 0s in the sieve, in which case we return the list of primes.

Now for the implementation. Say we want to find the primes less than some number n. We first create a function that, given a prime k, generates a list of sieve elements to “set”.

 1) (defun build-multiples-list (k n) 
 2)   "Generate a list consisting of the number k, its square, then
 3)    each multiple of k, up to n. The generated list is backwards."
 4)
 5)  (labels ((x-multiple (acc nxt)
 6)    (if (> nxt n) 
 7)        acc
 8)        (x-multiple (cons nxt acc) (+ k nxt)))))
 9)
10)    (x-multiple (list k) (* k k))))

I could use the mighty loop macro here but I want to use recursion. Lines 5–8 define an auxiliary function that takes as its arguments an accumulator holding the list so far, and the next value to add to the list. We first check if the next value is greater than n, the limit of our primes search (line 6) and if it is, we return the list of accumulated values (line 7). Otherwise, we recurse with the value prepended to the accumulator and the next multiple of k (line 8). The auxiliary function is invoked with k itself in the accumulator and the square of k as the first value (line 10). Because we prepend to the list, the list will be backwards.

We test it as follows:

CL-USER> (nreverse (build-factors-list 3 100))
(3 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72 75 78 81 84 87 90 93 96 99)

We now need to build a sieve corresponding to the list of multiples. Consider the list of multiples for 3, which is (3 9 12 15 …). The 3-sieve is

(0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 ...)

We observe that this is the concatenation of a number of smaller lists consisting of a sequence of 0s followed by a 1, i.e.

(0 0 1) ++ (0 0 0 0 0 1) ++ (0 0 1) ++ (0 0 1) ++  ...

where ++ denotes list concatenation. The length of each sublist in the sieve is simply the difference between successive multiples. Let’s create a function that, given a list of multiples, returns the length of each sublist in the k-sieve.

 1) (defun get-subsieve-lengths (vals)
 2)    (labels ((x-subsieve (outlst inlst)
 3)                (if (null (cdr inlst))
 4)                    outlst
 5)                    (x-subsieve (cons (- (cadr inlst) (car inlst)) 
 6)                                      outlst) 
 7)                                (cdr inlst)))))
 8)
 9)        (nreverse (x-subsieve nil (cons 0 vals)))))

Let’s test this. Recalling that the list of multiples to set in the 3-sieve is (3 9 12 15…)

CL-USER> (get-subsieve-lengths (nreverse (build-multiples-list 3 100)))
(3 6 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)

which gives is the value of the first element of our list of multiples, followed by the differences between successive elements in the list.

Constructing a k-sieve is accomplished by the following functions:

 1) (defun gimme-n-of (what n)
 2)  "Return a list of n whats"
 3)  (loop repeat n collect what))
 4)
 5) (defun build-sieve (k n)
 6)  "Construct the 'cross-off' list (Eratosthenes sieve) for prime k."
 7)
 8)  (flet ((build-sub (len)
 9)    "Build a list of len-1 zeros and a 1 on the end"
10)     (nreverse (cons 1 (gimme-n-of 0 (1- len))))))
11)
12)       (let* ((multiples     (build-multiples-list k n))
13)              (last-multiple (car multiples))
14)              (sieve         (mapcan #'build-sub (get-subsieve-lengths (nreverse multiples)))))
15)
16)       (if (= n last-multiple)
17)           sieve
18)           (nconc sieve (gimme-n-of 0 (- n last-multiple)))))))

gimme-n-of is simply a convenience function that returns a list consisting of a specified number of constant elements. The sieve itself is constructed at line 14 through a pipeline of functions, using a local function build-sub (lines 8–10) to construct each subsieve. A minor complication is the we need to add zeros to the end of the sieve after the last subsieve to bring us up to the total sieve length (lines 16–18). The number of zeros we need to add is the search limit n minus the last multiple of k, extracted at line 13. (This is why I don’t reverse the multiples list before returning it — it makes it more efficient to get the “last” value.)

We now have the bits in place to build a functional Eratosthenes sieve. This function itself is short and straightforward.

 1)  (defun functional-prime-2 (n)
 2)   "Find primes up to n.
 3)    Functional implementation using Sieve of Eratosthenes."
 4)
 5)   (labels ((find-next-prime (primes sieve)
 6)              (let* ((pos (position 0 sieve)))
 7)                 (if (null pos)
 8)                     primes
 9)                     (let ((next-prime (1+ pos)))
10)                       (find-next-prime (cons next-prime primes)
11)                                        (mapcar #'logior (build-sieve next-prime n) 
12)                                                         sieve)))))))
13) 
14)      (nreverse 
15)        (find-next-prime nil (cons 1 (gimme-n-of 0 (1- n)))))))

And let’s check its output:

CL-USER> (functional-prime-2 100)
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97)

However, we expect performance to be terrible — we’re creating two sieves at each iteration, each a list of boxed values, involving a lot of consing and creating a lot of garbage in the process. Further, on each iteration we’re searching the sieve for the first zero, which is getting further and further away, each time walking down the list from the start. How bad will it be? Let’s find out.

Performance

We now have three programs to find prime numbers up to a given number. Let’s try some benchmarking. For reference, the machine is an Intel Core i5 780 2.80 GHz with 8GB RAM, running SBCL on Microsoft Windows 7 64-bit. No special optimisation is used.

We’ll measure the time it takes for 100,000 runs of finding the primes to 1,000. First, the imperative version:

CL-USER> (time (loop repeat 100000 do (imperative-prime 1000)))
Evaluation took:
  2.195 seconds of real time
  2.199614 seconds of total run time (2.152814 user, 0.046800 system)
  [ Run times consist of 0.079 seconds GC time, and 2.121 seconds non-GC time. ]
  100.23% CPU
  6,162,712,538 processor cycles
  236,807,008 bytes consed

Finding primes to 1,000 by our imperative method takes about 22 microseconds. The garbage collection and consing are probably to do with building the output primes list.

Now the trial division recursive version:

CL-USER> (time (loop repeat 100000 do (functional-prime-1 1000)))
Evaluation took:
  37.963 seconds of real time
  37.783443 seconds of total run time (37.580641 user, 0.202802 system)
  [ Run times consist of 2.188 seconds GC time, and 35.596 seconds non-GC time. ]
  99.53% CPU
  106,557,807,692 processor cycles
  13,168,932,696 bytes consed

About 380 microseconds to find the primes to 1,0000: about 17 times slower than the imperative version but still not bad in absolute terms. Garbage collection is less than 1% of runtime.

What about the functional Eratosthenes Sieve?

CL-USER> (time (loop repeat 100000 do (functional-prime-2 1000)))
Evaluation took:
  575.893 seconds of real time
  574.380082 seconds of total run time (570.636058 user, 3.744024 system)
  [ Run times consist of 36.774 seconds GC time, and 537.607 seconds non-GC time. ]
  99.74% CPU
  1,616,532,834,439 processor cycles
  274,064,489,504 bytes consed

About 5.8 milliseconds: 15 times slower than the trial by division functional version, 260 times slower than the imperative version. Also, the garbage collector kicked in for about 6% of the runtime and there was a lot of consing, as expected.

For our functional Sieve, using alternative data structures (arrays of unboxed integers instead of lists for the sieves) and reducing the time to search for the next prime by for example truncating the sieve at the last prime found may speed things up, at the expense of implementation complexity, but there is still a large amount of data copying going on.

(In contrast, on the 6MHz 68000-based system I built in the mid-1980s, the Sieve of Eratosthenes written in assembler took about 100ms to calculate the primes to 1,000.)

Conclusions

Functional programming can be regarded as a series of operations on data, with the flow of data through a processing pipeline of functions being emphasised. Where it tends to fall down is where we are required to apply repeated change to large collections of data, when the “no mutable data” means that in-place updates of items is eschewed and leads to huge amounts of copying instead. Where an algorithm (or data structure) that is a good fit for the functional paradigm exists (trial by division here) then the performance of the functional version is acceptable. However, the Sieve of Erathosthenes is an explicitly imperative algorithm, and attempting a naive functional implementation can be time consuming and the result is slow; it’s probably not worth the effort.

About these ads

8 thoughts on “Prime Number Searches — Functional and Imperative

  1. russe

    You may want to look at FSet, a Common Lisp library which contains a functional map with O(lg(N)) time complexity. The fundamental data structure of the Sieve is an array, which can be modeled with it. I think that the penalty for a functional approach will be algorithmically lower.

    Reply
  2. Pingback: Faster Primes with Heaps :: nklein software

  3. markbrown778 Post author

    Working on performance measurements, not for a given number of primes but primes up to a certain number. The imperative sieve scales approximately linearly with a slight upcurve but runs out of memory quickly. The trial by division version’s runtime grows exponentially – there may be an optimisation I can make to reduce consing but how much it’ll help I don’t know. I’ve worked out a modified algorithm for the functional sieve that should substantially reduce copying and be scalable in terms of memory too. Will report back when it’s done!

    Reply
  4. notme

    we can always recalculate from one to another. If you calculate primes up to m1,m2 in time t1,t2, and also find out the count of primes n1,n2 in each run, then it is up to you whether to calculate log(t2/t1)/log(m2/m1) or log(t2/t1)/log(n2/n1). To limit the memory requirements you could change all your tests to just count the primes instead of collecting all their actual values. This is enough for testing the validity, because these counts are known and can be compared with e.g. via WolframAlpha site.

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s