My solution with a custom iterator runs almost 2x faster. Your main bottleneck is the `multi_cartesian_product` iterator, which produces each result on the heap (as a `Vec<usize>`). There is no inherent need to route the factors through the heap.
Speedup for `LIMIT = 500_000`:

```
test tests::bench_factors  ... bench: 250,582,325 ns/iter (+/- 1,694,499)
test tests::bench_factors2 ... bench: 445,134,232 ns/iter (+/- 2,520,121)
```
Solution with the custom `CombineProducts`:
```rust
#![feature(test)]
extern crate test;
#[macro_use]
extern crate lazy_static;

use itertools::Itertools;

const LIMIT: usize = 500_000;

lazy_static! {
    static ref SIEVE: primal::Sieve = primal::Sieve::new(LIMIT);
}

struct CombineProducts {
    factors: Vec<Factor>,
    current_product: usize,
    exhausted: bool,
}

struct Factor {
    base: usize,
    exp: usize,
    current_exp: usize,
}

impl CombineProducts {
    fn new(factors: Vec<(usize, usize)>) -> CombineProducts {
        CombineProducts {
            factors: factors
                .into_iter()
                .map(|(base, exp)| Factor {
                    base,
                    exp,
                    current_exp: 0,
                })
                .collect(),
            current_product: 1,
            exhausted: false,
        }
    }
}

impl Iterator for CombineProducts {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        if self.exhausted {
            return None;
        }
        let result = self.current_product;
        for factor in &mut self.factors {
            if factor.current_exp == factor.exp {
                factor.current_exp = 0;
                self.current_product /= factor.base.pow(factor.exp as u32);
            } else {
                factor.current_exp += 1;
                self.current_product *= factor.base;
                return Some(result);
            }
        }
        self.exhausted = true;
        Some(result)
    }
}

fn run() -> Vec<(usize, usize)> {
    let mut result = vec![];
    for n in 2..LIMIT {
        result.extend(CombineProducts::new(SIEVE.factor(n).unwrap()).map(|d| (d, n / d)));
    }
    result
}

fn run_print() -> Vec<(usize, usize)> {
    let result = run();
    println!("{:?}", result.len());
    result
}

fn run2() -> Vec<(usize, usize)> {
    let mut result = vec![];
    for n in 2..LIMIT {
        result.extend(
            SIEVE
                .factor(n)
                .unwrap()
                .into_iter()
                .map(|(base, exp)| (0..=(exp as u32)).map(move |e| base.pow(e)))
                .multi_cartesian_product()
                .map(|v| {
                    let d = v.into_iter().product::<usize>();
                    (d, n / d)
                }),
        );
    }
    result
}

fn run2_print() -> Vec<(usize, usize)> {
    let result = run2();
    println!("{:?}", result.len());
    result
}

#[cfg(test)]
mod tests {
    use super::*;
    use test::bench::Bencher;

    #[test]
    fn test_run() {
        run_print();
    }

    #[test]
    fn test_run2() {
        run2_print();
    }

    #[bench]
    fn bench_factors(b: &mut Bencher) {
        b.iter(|| run());
    }

    #[bench]
    fn bench_factors2(b: &mut Bencher) {
        b.iter(|| run2());
    }
}
```
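To see what `CombineProducts` actually yields, here is a minimal, dependency-free sketch of the same mixed-radix-counter idea, with the `primal` sieve replaced by a hard-coded factorization of 12 = 2² · 3 (an assumption for demonstration only):

```rust
// Standalone sketch of the CombineProducts idea. The exponent vector acts
// as a mixed-radix counter: each next() bumps the first factor whose
// exponent is not yet maxed out, keeping a running product, so no per-item
// heap allocation is needed.

struct Factor {
    base: usize,
    exp: usize,
    current_exp: usize,
}

struct CombineProducts {
    factors: Vec<Factor>,
    current_product: usize,
    exhausted: bool,
}

impl CombineProducts {
    fn new(factors: Vec<(usize, usize)>) -> CombineProducts {
        CombineProducts {
            factors: factors
                .into_iter()
                .map(|(base, exp)| Factor { base, exp, current_exp: 0 })
                .collect(),
            current_product: 1,
            exhausted: false,
        }
    }
}

impl Iterator for CombineProducts {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        if self.exhausted {
            return None;
        }
        let result = self.current_product;
        for factor in &mut self.factors {
            if factor.current_exp == factor.exp {
                // This "digit" overflows: reset it and carry to the next one.
                factor.current_exp = 0;
                self.current_product /= factor.base.pow(factor.exp as u32);
            } else {
                factor.current_exp += 1;
                self.current_product *= factor.base;
                return Some(result);
            }
        }
        // Every digit overflowed: the last divisor has just been read off.
        self.exhausted = true;
        Some(result)
    }
}

fn main() {
    // 12 = 2^2 * 3, so the divisors come out as [1, 2, 4, 3, 6, 12].
    let divisors: Vec<usize> = CombineProducts::new(vec![(2, 2), (3, 1)]).collect();
    println!("{:?}", divisors);
}
```

Note that the divisors are not emitted in sorted order; they follow the counter order, which is fine here because the benchmark only collects them.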
Comments:

> …`n` from 2 to `LIMIT` just for a benchmark? If you wanted to do that for real, then that structure can be exploited to save some redundant work.

> …`d = max(d1, d2)`. As soon as `d * d > n` you can stop. It's impossible for one of `n`'s factors to exceed `sqrt(n)`.

> …`√n`, including `n` itself. But they can all be found by considering the smaller factors, e.g. by producing `{d1, d2}` and `{d2, d1}` together, except when they are equal (`d1 == d2 == √n`).

> …`println`. This suggests that you should rethink your benchmarking strategy.
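The pairing idea from the comments can be sketched as plain trial division: stop as soon as `d * d > n`, and emit both orderings of each pair. This is my reading of the comments, not code from the answer, and the name `divisor_pairs` is hypothetical:

```rust
// Sketch of the comments' suggestion: every divisor pair {d, n/d} has one
// member <= sqrt(n), so trial division up to sqrt(n) finds them all.
fn divisor_pairs(n: usize) -> Vec<(usize, usize)> {
    let mut pairs = Vec::new();
    let mut d = 1;
    while d * d <= n {
        if n % d == 0 {
            let q = n / d;
            pairs.push((d, q));
            if q != d {
                // Emit the mirrored pair too, except when d == sqrt(n).
                pairs.push((q, d));
            }
        }
        d += 1;
    }
    pairs
}

fn main() {
    // 12 has the pairs (1,12), (2,6), (3,4) and their mirrors.
    println!("{:?}", divisor_pairs(12));
}
```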