identifiers
The literature tends to speak of "values" generated by the \$3n+1\$ process. A "seed" is used to start the process, and we see successive values grow from there. It is perfectly sensible to accept a `seed` parameter. But during the growth loop, please use another name, perhaps `n`.
annotation
It wouldn't hurt to point out we accept `seed: int`, since a variant rational process might perform the hailstone operation on a `seed: Fraction`. (Though I confess the lack of `from fractions import Fraction` is kind of a tip-off, here.)
It would also set you up for a numba JIT decorator.
single responsibility
Thank you for this beautiful docstring.
    def collatz(seed):
        """Apply Collatz conjecture algorithm and stores all the values of the seed after every iteration to display them later"""
It is concise and accurate. It also helps us see that we are doing two things:
- compute and
- store
Consider focusing on compute, so storage is someone else's problem.
    from typing import Generator


    def collatz(seed: int) -> Generator[int, None, None]:
        """Generates hailstone values."""
        n = seed
        while n > 1:
            if n % 2:
                n = 3 * n + 1
            else:
                n //= 2
            yield n


    def main(num_seeds: int = 1500) -> None:
        ...
        for seed in range(1, num_seeds + 1):
            y = list(collatz(seed))
            ...
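With the generator doing only the computing, the caller decides what, if anything, to keep. As a minimal sketch, a caller can summarize a trajectory without storing it at all. The helper names `delay` and `peak` are illustrative and not part of the original code.

```python
from typing import Iterator


def collatz(seed: int) -> Iterator[int]:
    """Yield the hailstone values that follow seed, ending at 1."""
    n = seed
    while n > 1:
        n = 3 * n + 1 if n % 2 else n // 2
        yield n


def delay(seed: int) -> int:
    """Count the iterations needed to reach 1 (hypothetical helper)."""
    return sum(1 for _ in collatz(seed))


def peak(seed: int) -> int:
    """Largest generated value, or the seed itself if nothing is generated."""
    return max(collatz(seed), default=seed)


print(delay(6), peak(6))  # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
```

A caller that does want the whole list can still write `list(collatz(seed))`, as `main` does.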
color maps
The `cm.bwr` colormap is very nice. You might consider playing with cyclic colormaps, to highlight the chaotic non-linear nature of the data you're plotting. Or cycle through a categorical map.
You may find that plotting points rather than lines, and using a partly transparent `alpha` setting, allows you to achieve higher visual data density.
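Here is one way those two suggestions might look together. It is a sketch under assumptions: I'm guessing the plot shows iteration index against value, colored by seed, and the names `scatter_points` and `plot` are made up for illustration. `twilight` is one of matplotlib's cyclic colormaps, and `alpha` controls transparency.

```python
from typing import Iterator


def collatz(seed: int) -> Iterator[int]:
    """Yield the hailstone values that follow seed, ending at 1."""
    n = seed
    while n > 1:
        n = 3 * n + 1 if n % 2 else n // 2
        yield n


def scatter_points(num_seeds: int = 1500) -> tuple[list[int], list[int], list[int]]:
    """Return (iteration index, value, seed) columns, one entry per hailstone value."""
    xs, ys, colors = [], [], []
    for seed in range(1, num_seeds + 1):
        for i, value in enumerate(collatz(seed), start=1):
            xs.append(i)
            ys.append(value)
            colors.append(seed)
    return xs, ys, colors


def plot(num_seeds: int = 1500) -> None:
    """Draw semi-transparent points instead of lines (assumed layout)."""
    # Imported here so the data helper above works without matplotlib installed.
    import matplotlib.pyplot as plt

    xs, ys, colors = scatter_points(num_seeds)
    fig, ax = plt.subplots()
    # Small markers plus a low alpha let dense regions show up as darker areas.
    ax.scatter(xs, ys, c=colors, cmap="twilight", alpha=0.2, s=2)
    ax.set_yscale("log")
    ax.set_xlabel("iteration")
    ax.set_ylabel("value")
    plt.show()
```

Calling `plot()` displays the figure. The `alpha` and marker size values are starting points to tune by eye, not recommendations.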
caching
If you summarize a given seed as producing an `(iterations, max_value)` tuple, you might find the `@lru_cache` or `@cache` decorators from `functools` to be helpful; or introduce your own caching array. The literature sometimes uses `delay` to describe the number of iterations for a given seed. Leavens and Vermeulen did some interesting work in this space in 1992.
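A minimal sketch of the decorator approach, assuming Python 3.9+ for `functools.cache`. The function name `summarize` is hypothetical, and I've chosen to count the seed itself when computing the maximum. Each call does one hailstone step and delegates the rest of the trajectory to a (possibly cached) recursive call.

```python
from functools import cache


@cache
def summarize(n: int) -> tuple[int, int]:
    """Return (iterations, max_value) for the trajectory starting at n."""
    if n <= 1:
        return 0, n
    nxt = 3 * n + 1 if n % 2 else n // 2
    iterations, max_value = summarize(nxt)
    # One more step than the successor; the peak may be n itself.
    return iterations + 1, max(n, max_value)


print(summarize(27))
```

Recursion depth equals the delay of the seed, which is comfortably small for seeds up to 1500 but could approach Python's default recursion limit for much larger seeds; an iterative version with an explicit dictionary avoids that.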
Of course, the calculations we're doing are so lightweight that time spent on cache checks threatens to swamp the time spent calculating. So we'd want to unroll a bit, perhaps by doing half a dozen "ordinary" function calls followed by one where we check the cache. Or equivalently, to handle every seventh seed, do a modulo congruent to zero test. Or perhaps use some bigger prime.
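One reading of that idea, sketched with an explicit dictionary rather than a decorator: iterate normally and consult the cache only when the step counter is congruent to zero modulo `check_every`. The function name, the `cache` argument, and the default of 7 are assumptions for illustration; whether this beats plain iteration is something to measure rather than assume.

```python
def delay(seed: int, cache: dict[int, int], check_every: int = 7) -> int:
    """Count iterations to reach 1, consulting the cache only every few steps."""
    n = seed
    steps = 0
    while n > 1:
        # Most iterations skip the lookup; only every check_every-th one pays for it.
        if steps % check_every == 0 and n in cache:
            steps += cache[n]  # the remainder of the trajectory is already known
            break
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    cache[seed] = steps
    return steps


known: dict[int, int] = {}
delays = [delay(seed, known) for seed in range(1, 1501)]
```

The result is correct whether or not a lookup ever hits, since a miss just means the loop keeps stepping until it reaches 1.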