Lazy prime counting function [closed]

Question

I have written a simple snippet of code that implements a function that returns a list of primes up to a certain integer n. Assuming that this function could be called several times in the script for different values of n, I wanted to make it as "lazy" as possible—that is, able to reuse previously-calculated variables.

Here is the code (assume isPrime(n) returns True if n is prime and False otherwise):

primes = [] def pi(n): global primes if len(primes) == 0: # Full for i in range(2, n + 1): if isPrime(i): primes.append(i) return primes elif n <= primes[-1]: # Lazy return list(filter(lambda m: m <= n, primes)) else: # Semi-lazy for i in range(primes[-1] + 1, n + 1): if isPrime(i): primes.append(i) return primes

The problem is, as you may have spotted, that I'm using a global variable (primes) to store the list of primes, and later modifying it inside the pi function. I've heard that relying on mutable global variables is a bad design choice. What are better ways to implement this?

This question should actually be generalized to the following: what's the best design pattern to implement a 'lazy' function (i.e. a function that reuses previously calculated values)?

Unrelated to your original question, but even numbers can never be prime (2 notwithstanding) so the For i in range... isprime(i) can be put into a statement to check if i is even, saving half your calculations. I also seem to remember a way to make range() skip numbers too. — DJHenjin, CommentedAug 28, 2018 at 3:34
list(filter(lambda...)) can be inefficient too. Consider generating a few hundred million primes, and then asking for the list of primes up to 100. When the filter() reaches 101, it could stop, but won’t since it doesn’t know that. itertools.takewhile() would be an improvement. Better still would be bisect to find the endpoint, and then simply returning a slice. — AJNeufeld, CommentedAug 28, 2018 at 17:42
@AJNeufeld Thank you for your suggestions; I realized the problem with filter but didn't know/hadn't made research about better-fitting alternatives when quickly putting this together. I think in this case I'd use takewhile; it seems more 'human-readable' and straightforward to me. — Anakhand, CommentedAug 29, 2018 at 18:10
return primes[:bisect.bisect(primes, n)]. Just so you know. — AJNeufeld, CommentedAug 29, 2018 at 18:43

Hans Musgrave · Accepted Answer · 2018-08-28 02:42:52Z

The technique you're looking for is called memoization (or in other contexts it might have names like caching). Lazy evaluation is something else entirely.

Memoization is the canonical example for how decorators work in Python, but you don't need to learn the additional language construct if you don't like. Consider the following snippet. It could be more general, but that's beside the point.

def memoize(f): cache = {} def _f(n): if n not in cache: cache[n] = f(n) return cache[n] return _f

What we've created is a function that takes another function as an argument. How do we use it? Well, we pass a function into it.

def pi(n): # do stuff pi = memoize(pi)

Take a close look at this.

First we make our normal function that deals with primes.
Next, we pass it into this memoize() monster that we just made.
When memoize() sees this function (now just called f), it makes a cache and defines a new function _f(). It's this new function _f() that's finally returned (i.e., at the end of all this, pi is the same as whatever _f happened to be), so let's look closely at what _f() does.
1. First, _f() takes a look in the cache to see if we've already seen the value n. If not, it stores the output f(n), i.e. the original pi(n) we wanted, in the cache.
2. Next, it returns the pre-computed value straight from the cache. That's it.

Now we've eliminated the use of globals (behind the scenes the cache thing we described is stored in what's called the closure of _f), we've taken the problem of caching away from pi() to give pi() a small, single purpose, and we've created a reusable memoization mechanism that we can apply to any other function that takes exactly one hashable input.

Back to the decorator thing I mentioned earlier, you could just as well write

@memoize def pi(n): # do stuff

to memoize the pi function instead of applying the memoizer after the fact. It doesn't make much of a difference either way, but the latter is considered more pythonic.

To be able to take multiple arguments, you can start to do things with *args and **kwargs in your memoizer, but that's additional complexity without a lot of added benefits for the sake of this problem.

Thank you, this is exactly what I was striving for. There is one thing I don't understand about how the decorator works, though: if the first statement of memoize is the assignment cache = {}, how does it 'keep track' of the actual cache—instead of resetting it every time you call the function? Does the modified pi function have access to the scope of memoize even after memoize has terminated? — Anakhand, CommentedAug 28, 2018 at 6:41
Yes it does. As a rule of thumb, anything that you define in an outer scope (like memoize()) and then use in the inner scope (like _f()) will be available to that inner scope. Exceptions to that rule of thumb are unnatural and unlikely to cross your path. That concept is called the closure of _f, and it is actually an attribute of the object _f and pi reference. In Python 2, it looks like pi.func_closure or pi.__closure__, and I think the latter works in Python 3 as well. — Hans Musgrave, CommentedAug 28, 2018 at 7:19
Most languages that I know about have some concept of closure. It's part of a mathematical concept called "lambda calculus" that modern functional programming is loosely based on. — Hans Musgrave, CommentedAug 28, 2018 at 7:20

Stack Exchange Network

Lazy prime counting function [closed]

1 Answer 1

Hot Network Questions

Lazy prime counting function [closed]

1 Answer 1

Related

Hot Network Questions