I'm trying to improve on examples from Clean Code* while re-implementing them in C++. This time it's the Sieve of Eratosthenes prime-computing example from pp. 71-74.
Below is the original adapted for C++, without improvements:
.h
class PrimeGenerator { public: PrimeGenerator() = default; ~PrimeGenerator() = default; static std::vector<unsigned> generatePrimes(unsigned maxValue); private: static void uncrossIntegersUpTo(unsigned maxValue); static void crossOutMultiples(); static unsigned determineIterationLimit(); static void crossOutMultiplesOf(unsigned i); static bool notCrossed(unsigned i); static void putUncrossedIntegersIntoResult(); static unsigned numberOfUncrossedIntegers(); static std::vector<bool> crossedOut; static std::vector<unsigned> result; };
.cpp
std::vector<bool> PrimeGenerator::crossedOut; std::vector<unsigned> PrimeGenerator::result; std::vector<unsigned> PrimeGenerator::generatePrimes(unsigned maxValue) { if (maxValue < 2) return {}; uncrossIntegersUpTo(maxValue); crossOutMultiples(); putUncrossedIntegersIntoResult(); return result; } void PrimeGenerator::uncrossIntegersUpTo(unsigned maxValue) { crossedOut = std::vector<bool>(maxValue + 1, false); crossedOut[0] = true; crossedOut[1] = true; } void PrimeGenerator::crossOutMultiples() { unsigned limit = determineIterationLimit(); for (size_t i = 2; i <= limit; ++i) { if (notCrossed(i)) crossOutMultiplesOf(i); } } unsigned PrimeGenerator::determineIterationLimit() { // Every multiple in the array has a prime factor that // is less than or equal to the root of the array size, // so we don't have to cross out multiples of numbers // larger than that root. double iterationLimit = std::sqrt(crossedOut.size()); return static_cast<unsigned>(iterationLimit); } void PrimeGenerator::crossOutMultiplesOf(unsigned i) { for (size_t multiple = 2 * i; multiple < crossedOut.size(); multiple += i) { crossedOut[multiple] = true; } } bool PrimeGenerator::notCrossed(unsigned i) { return !crossedOut[i]; } void PrimeGenerator::putUncrossedIntegersIntoResult() { result = std::vector<unsigned>(numberOfUncrossedIntegers()); size_t j = 0; for (size_t i = 2; i < crossedOut.size(); ++i) { if (notCrossed(i)) result[j++] = i; } } unsigned PrimeGenerator::numberOfUncrossedIntegers() { unsigned count = 0; for (size_t i = 2; i < crossedOut.size(); ++i) { if (notCrossed(i)) count++; } return count; }
What we see here is a static class with static functions and members. We don't like these in C++, so it seems this code could be better served with a namespace and some free functions. Let's try - my improvement attempt coming below.
.h
namespace PrimeGenerator { std::vector<unsigned> generatePrimes(unsigned maxValue); }
.cpp
namespace { std::vector<bool> uncrossIntegersUpTo(int maxValue) { std::vector<bool> crossedOut(maxValue + 1, false); crossedOut[0] = true; crossedOut[1] = true; return crossedOut; } unsigned determineIterationLimit(size_t size) { // Every multiple in the array has a prime factor that // is less than or equal to the root of the array size, // so we don't have to cross out multiples of numbers // larger than that root. double iterationLimit = std::sqrt(size); return static_cast<unsigned>(iterationLimit); } void crossOutMultiplesOf(unsigned i, std::vector<bool>& crossedOut) { for (size_t multiple = 2 * i; multiple < crossedOut.size(); multiple += i) { crossedOut[multiple] = true; } } void crossOutMultiples(std::vector<bool>& crossedOut) { unsigned limit = determineIterationLimit(crossedOut.size()); for (size_t i = 2; i <= limit; ++i) { if (!crossedOut[i]) crossOutMultiplesOf(i, crossedOut); } } std::vector<unsigned> putUncrossedIntegersIntoResult(const std::vector<bool>& crossedOut) { std::vector<unsigned> result; for (size_t i = 2; i < crossedOut.size(); ++i) { if (!crossedOut[i]) result.push_back(i); } return result; } } namespace PrimeGenerator { std::vector<unsigned> generatePrimes(unsigned maxValue) { if (maxValue < 2) return {}; auto crossedOut = uncrossIntegersUpTo(maxValue); crossOutMultiples(crossedOut); return putUncrossedIntegersIntoResult(crossedOut); } }
A quick summary of changes:
- I removed the class, leaving a single interface function in a PrimeGenerator
namespace.
- The numberOfUncrossedIntegers()
function didn't seem to make much sense, so I refactored putUncrossedIntegersIntoResult(...)
to get rid of the former.
- notCrossed(...)
would now need to have two parameters, so it stopped making sense either. It's gone now.
Now, I have two questions about my code. First of all, we now need to pass the crossedOut
vector around, which is a downside compared to the previous design. Would you propose an alternative solution to mitigate this? Secondly, are there any extra places where I should have used size_t
instead of unsigned
?
Cheers!
EDIT:
I'd like to stress I care more about good software engineering and coding style here rather than making this algorithm as fast as possible. Though of course it should be correct.
* Clean Code: A Handbook of Agile Software Craftsmanship, Robert C. Martin