5
\$\begingroup\$

I implemented a simple function that generated random bytes Vec<u8> with given n size. Then I wondered how can I optimize this.

Here is the first implementation:

use rand::{Rng, RngCore, SeedableRng}; use rand_chacha::ChaCha20Rng; fn random_bytes(n: usize) -> Vec<u8> { let mut bytes = Vec::with_capacity(n); let mut rng = ChaCha20Rng::from_entropy(); for _ in 0..n { let b = rng.gen::<u8>(); bytes.push(b); } bytes } 

I wanted to go through unsafe path to make things interesting. First thing that I came up with is to make allocated memory uninitialized. Using Box::<[T]>::new_uninit_slice which happens to be stabilized in 1.82.0.

Second thing that came to my mind is to fill the bytes in chunks. RngCore trait provides next_u64 method which returns a random u64. But I could not figure out how to handle remainders. (e.g. 13 bytes is not multiple of 8 bytes leaving 5 byte chunk at the end). So I relied on using RngCore::fill_bytes method.

Here is the unsafe implementation:

fn unsafe_random_bytes(n: usize) -> Vec<u8> { let mut uninit_boxed = Box::<[u8]>::new_uninit_slice(n); let bytes = { let ptr = uninit_boxed[0].as_mut_ptr(); unsafe { std::slice::from_raw_parts_mut(ptr, n) } }; let mut rng = ChaCha20Rng::from_entropy(); rng.fill_bytes(bytes); unsafe { uninit_boxed.assume_init().into_vec() } } 

There are two unsafe blocks, first is for creating mutable slice from pointer and second is the assuming the memory initialized. I'm not sure if there is any UBs. But it seems to be working. In my opinion, since box ensures that n amount of type T is allocated. We can construct a mutable slice from first element address, since MaybeUninit and ManuallyDrop is #[repr(transparent)].

fill_bytes from rand::RngCore guarantees that the destination is filled with new data. Thus, assuming initialized for the second unsafe block also ok.

Here is the comparison for both functions using criterion crate:benchmark

Let me know what your thoughts are. I may be missing something or don't have the knowledge.

\$\endgroup\$
1
  • \$\begingroup\$Your code has UB because from_raw_parts_mut requires that all elements are initialized.\$\endgroup\$CommentedFeb 24 at 13:28

1 Answer 1

2
\$\begingroup\$

Design

Your functions instantiate a new RNG on each call. It'd be cheaper to re-use an existing RNG.

While at it, you can use extension traits to provide the respective methods for RNGs.

Performance

Your safe approach is slow because you generate each byte one by one and push it to the Vec. Using fill_bytes improves performance a lot, even when not using unsafe.

Safety

You did not provide any // SAFETY: ... comment. Why do you assume the operations are safe? I'm not saying they are not, but you did not provide any reasoning as to why they are.

Food for thought: What happens with the uninit slice when your code panics in-between the two unsafe statements?

Alternative implementation

use rand::{Rng, RngCore}; pub trait RandomBytes { fn random_bytes(&mut self, n: usize) -> Vec<u8>; } impl<T> RandomBytes for T where T: Rng, { fn random_bytes(&mut self, n: usize) -> Vec<u8> { let mut bytes = Vec::with_capacity(n); for _ in 0..n { let b = self.gen::<u8>(); bytes.push(b); } bytes } } pub trait UnsafeRandomBytes { fn unsafe_random_bytes(&mut self, n: usize) -> Vec<u8>; } impl<T> UnsafeRandomBytes for T where T: RngCore, { fn unsafe_random_bytes(&mut self, n: usize) -> Vec<u8> { let mut uninit_boxed = Box::<[u8]>::new_uninit_slice(n); let bytes = { let ptr = uninit_boxed[0].as_mut_ptr(); unsafe { std::slice::from_raw_parts_mut(ptr, n) } }; self.fill_bytes(bytes); unsafe { uninit_boxed.assume_init().into_vec() } } } pub trait AltRandomBytes { fn alt_random_bytes(&mut self, n: usize) -> Vec<u8>; } impl<T> AltRandomBytes for T where T: RngCore, { fn alt_random_bytes(&mut self, n: usize) -> Vec<u8> { let mut bytes = vec![0; n]; self.fill_bytes(&mut bytes); bytes } } 

Benchmark

use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion}; use rand_chacha::rand_core::SeedableRng; use rand_chacha::ChaCha20Rng; use random_bytes::{AltRandomBytes, RandomBytes, UnsafeRandomBytes}; fn bench_random_bytes(c: &mut Criterion) { let mut group = c.benchmark_group("Random"); for i in [10usize, 100, 1000, 2000, 4000, 8000, 16_000] { group.bench_with_input(BenchmarkId::new("Safe", i), &i, |b, i| { let mut rng = ChaCha20Rng::from_entropy(); b.iter(|| RandomBytes::random_bytes(&mut rng, *i)) }); group.bench_with_input(BenchmarkId::new("Unsafe", i), &i, |b, i| { let mut rng = ChaCha20Rng::from_entropy(); b.iter(|| UnsafeRandomBytes::unsafe_random_bytes(&mut rng, *i)) }); group.bench_with_input(BenchmarkId::new("Safe alt", i), &i, |b, i| { let mut rng = ChaCha20Rng::from_entropy(); b.iter(|| AltRandomBytes::alt_random_bytes(&mut rng, *i)) }); } group.finish(); } criterion_group!(benches, bench_random_bytes); criterion_main!(benches); 

Benchmark plot

You be the judge, as to whether the performance difference justifies the use of unsafe here.

\$\endgroup\$

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.