7
\$\begingroup\$

Given

void foo(std::vector<std::string>& vec, size_t target, std::mt19937& rng) { std::uniform_int_distribution dist; while(vec.size() < target) { auto x = std::to_string(dist(rng)); std::cout << x << '\n'; vec.insert(vec.begin(), x); } } 

I want to rewrite this such that it

  1. Doesn't change the semantics, including the values of pseudo-random x
  2. Doesn't insert into the front of vec many times, because that's horrible
  3. Isn't overly complicated language-wise, because it will be read by the original author who isn't an expert in C++
  4. Preferably doesn't do index arithmetic, because that is very error prone

The best I can think of is

void bar(std::vector<std::string>& vec, size_t target, std::mt19937& rng) { if(vec.size() >= target) return; auto n = target - vec.size(); std::uniform_int_distribution dist; std::vector<std::string> xs; xs.reserve(n); std::generate_n(std::back_inserter(xs), n, [&]{ auto x = std::to_string(dist(rng)); std::cout << x << '\n'; return x; }); vec.insert(vec.begin(), std::make_move_iterator(xs.rbegin()), std::make_move_iterator(xs.rend())); } 

But I have to admit, even though I know C++ well, bar looks way worse than foo. It's absolutely horrendous for someone who doesn't. Is there a better way?

\$\endgroup\$
7
  • 2
    \$\begingroup\$This is a StackOverflow question, not a code review request. “What is the best way to insert to the front of a vector” is clearly nothing more than a best-practices question, not an existing project. It is literally hypothetical code, even straight-up using foo and bar.\$\endgroup\$
    – indi
    CommentedFeb 22 at 17:05
  • \$\begingroup\$@indi "clearly nothing more than a best-practices question, not an existing project. It is literally hypothetical code" obviously I didn't copy and paste from company code, if that's what you think constitutes "real code". I can assure you this is not hypothetical.\$\endgroup\$
    – 友人A
    CommentedFeb 24 at 2:36
  • \$\begingroup\$It really doesn’t assure me that this code is not hypothetical when you straight-up admit that it is. The project that contains the code that inspired this code may not be hypothetical… but this code is not from any real project. To understand why that matters, you need to understand what a code review is. A code review is an inspection of a project’s code to ensure it is safe and has no defects. For that to work, reviewers necessarily need to see the actual code from the actual project, including all relevant context… not just a snippet that captures the gist of it. 1/2\$\endgroup\$
    – indi
    CommentedFeb 27 at 2:07
  • \$\begingroup\$Surely you can understand why that is. Would you trust that a plane is safe to fly in if inspectors did not inspect the actual plane, and instead inspected a different plane that just more or less looks the same? Pulling out “the important bit” and/or stripping it down to a minimum working example is exactly the right thing to do when asking a StackOverflow question… and exactly the wrong thing to do when requesting a code review. That is why this is not a code review request; it is a StackOverflow question. 2/2\$\endgroup\$
    – indi
    CommentedFeb 27 at 2:08
  • \$\begingroup\$@indi In your interpretation of code review, the only possible question is a complete program. That is not what code review, as the term used colloquially, means. In fact, a complete program is still part of a whole, why do we exclude the platform, the hardware, the power plant that the program depends on to ensure the safety/correctness of the code? If a complete program is what this website asks for, can you point me to the place that documents this requirement?\$\endgroup\$
    – 友人A
    CommentedMar 2 at 11:25

5 Answers 5

8
\$\begingroup\$

If you need to insert at one end and read from the other, use a std::dequeue.

If you genuinely need a std::vector, store the array values in reverse order. Add new elements to the end, in constant time, and read them back with a reverse iterator. All the calls to the PRNG happen in the same sequence, so you get the side-effects you wanted.

If you genuinely need a std::vector stored in that exact order in memory, maybe because you’re going to iterate over it for SIMD: build the array in reverse order, as before, then reverse the array in O(N) time.

\$\endgroup\$
3
  • \$\begingroup\$You might find dequeue to be slower than vector even for tasks like this. Definitely time it in your use case, don’t assume it’s the better choice.\$\endgroup\$CommentedFeb 21 at 13:35
  • 1
    \$\begingroup\$@CrisLuengo If dequeue has a block implementation, that would depend on the number and size of blocks. Another data structure that would work well for a FIFO is: implement as a vector, add elements to the end, and consume them from the front by incrementing the begin() pointer. Both operations take constant time, so long as you reserve sufficient memory. When you run out of space and need to reallocate, you then move all elements back to the start of the block, reclaiming the storage of the consumed elements. In many cases, you will empty the FIFO queue, allowing you to start over.\$\endgroup\$
    – Davislor
    CommentedFeb 21 at 14:29
  • \$\begingroup\$While possible, using a different type is a much less local change (vec is passed around a lot) and I hesitate to do that.\$\endgroup\$
    – 友人A
    CommentedFeb 24 at 3:06
5
\$\begingroup\$

So there are three properties you'd ideally have:

  1. A simple syntax to insert things in reverse order into a container.
  2. Have side effects executed in the expected order.
  3. Not have unnecessary overhead, such as repeated single insertions or using intermediate storage.

Currently, you cannot do 2 and 3 at the same time with a std::vector. It would require something like insert_range_reversed() to do this, and I don't see this ever being added to the standard library. So, you have to settle for one of the possible compromises:

  • Use a temporary container. This increases memory usage and adds some CPU overhead, but at least it's all linear complexity. This is your bar().
  • Insert default-constructor values at the start of the vector, then overwrite them in reverse order. This might still be possible without index operations.
  • Use something other than std::vector to store your strings in. std::deque has already been suggested.

Note that depending on the type of value stored in the container, each of these solutions might cause different side effects.

As for the first property: you can always make a nice generic functions that abstracts away what you want.

For a container with std::string, I think inserting default-constructed elements at the start of vec would be the most efficient solution, as that won't have any side effects of itself. A generic function that prepends elements to another container using a generator function could look like this:

template<typename R, typename F> auto prepend_reversed(R& container, std::size_t count, F&& generator) { container.insert(std::begin(container), count, {}); auto first = std::rend(container); std::advance(first, -count); std::generate_n(first, count, std::forward<F>(generator)); } 

Then call it like so:

void bar(std::vector<std::string>& vec, size_t target, std::mt19937& rng) { … prepend_reversed(vec, n, [&]{ auto x = std::to_string(dist(rng)); std::cout << x << '\n'; return x; }); } 
\$\endgroup\$
5
  • \$\begingroup\$This doesn't insert the elements in reverse order, which is a requirement\$\endgroup\$
    – Caleth
    CommentedFeb 21 at 10:10
  • \$\begingroup\$Oh! I totally missed that. let me think how to handle that...\$\endgroup\$CommentedFeb 21 at 10:12
  • \$\begingroup\$You can do 2 and 3 at the same time: reverse the array in-place, append, reverse again. This has already been suggested by Davislor.\$\endgroup\$CommentedFeb 21 at 16:39
  • \$\begingroup\$@CrisLuengo That's likely way more expensive. I would only do that if the value type would not have a default constructor or if it had undesirable side effects.\$\endgroup\$CommentedFeb 21 at 17:07
  • 1
    \$\begingroup\$Oh, it seems that I hit upon the same method as you in my answer, albeit less neatly. One thing I did differently is to clean up the unoverwritten empty strings if the generator throws.\$\endgroup\$CommentedFeb 22 at 17:27
5
\$\begingroup\$

The initial function can be optimized simply:

  1. Immediately bail out if no work is to be done.
  2. .reserve() the target size.
  3. std::move() the strings into the container.
  4. Reverse container, insert at the end, reverse container.
void foo(std::vector<std::string>& vec, size_t target, std::mt19937& rng) { if (vec.size() >= target) return; std::uniform_int_distribution dist; vec.reserve(target); std::reverse(vec.begin(), vec.end()); while(vec.size() < target) { auto x = std::to_string(dist(rng)); std::cout << x << '\n'; vec.emplace_back(std::move(x)); } std::reverse(vec.begin(), vec.end()); } 

I leave fixing the exception behavior (the final reverse) to the interested reader.

\$\endgroup\$
5
  • \$\begingroup\$You fell into the same trap I did: OP's code actually reverses the order of the elements inserted at the start of the original vector, whereas your solution with std::rotate() does not.\$\endgroup\$CommentedFeb 21 at 22:38
  • \$\begingroup\$@G.Sliepen Better two reverses.\$\endgroup\$CommentedFeb 21 at 22:42
  • \$\begingroup\$Yes. But that causes each element to be moved one more time than necessary.\$\endgroup\$CommentedFeb 22 at 9:28
  • 1
    \$\begingroup\$Well, this solution swaps each two original elements twice, and each two new elements once, for old+new/2 swaps and new move-constructions. Add old move-constructions for reallocation if needed. Yours moves each original element and each new element once, for old+new moves and new copy constructions, regardless of reallocation. Without reallocation I don't think it is actually worse, with reallocation depends... A move is generally more expensive than a straight swap, or move-construction.\$\endgroup\$CommentedFeb 22 at 16:02
  • \$\begingroup\$Your comment on the cost is very helpful, because it isn't obvious that two reverses aren't expensive, especially that moves are more (not less!) expensive then swaps.\$\endgroup\$
    – 友人A
    CommentedFeb 24 at 3:25
5
\$\begingroup\$

Both functions have poor names. Both assume some prelude, which should have been included in the review:

#include <iostream> #include <random> #include <string> #include <vector> using std::size_t; 

Personally, I'd remove that using, and just write the type in full, like the other standard library types.


Both functions are pretty inflexible about the exact types of container and random number generator. That's probably okay for now, until there's a proven need in the application to use other types.


foo() is somewhat inefficient because each insert() moves every existing element (even though a good compiler can deduce that can be implemented with the equivalent of std::memmove(), we can do it a single time as bar() shows). What makes this worse is that although we know the eventual size we'll reach, we don't reserve() before we start.

bar() is inefficient because we need to allocate memory for xs and have both vectors occupying memory simultaneously.


bar()is not equivalent to foo() because if any of the output operations throws, then the vector does not contain the results of the successful invocations whose side-effects have occurred. To fix this, we need a catch that inserts the partial results:

 vec.reserve(target); // so that vec.insert() doesn't fail std::vector<std::string> xs; xs.reserve(n); try { std::generate_n(std::back_inserter(xs), n, [&dist, &rng]{ auto const x = std::to_string(dist(rng)); std::cout << x << '\n'; return x; }); } catch (...) { vec.insert(vec.begin(), std::make_move_iterator(xs.rbegin()), std::make_move_iterator(xs.rend())); throw; } vec.insert(vec.begin(), std::make_move_iterator(xs.rbegin()), std::make_move_iterator(xs.rend())); 

If performance is important, then a better implementation would insert a bunch of empty strings at the start, shifting up the elements just once (this is likely more efficient than reversing the vector, as the string objects can be memmoved as one), and overwrite the empty strings. In the event of an exception, we'll need to move the content back appropriately.

That would look like this:

#include <iterator> void prepend_randoms(std::vector<std::string>& vec, std::size_t target, std::mt19937& rng) { if (vec.size() >= target) { // nothing to do return; } auto dist = std::uniform_int_distribution{}; // Add empty elements to start vec.reserve(target); auto const n = target - vec.size(); vec.insert(vec.begin(), n, {}); for (auto it = vec.rend() - n; it != vec.rend(); ++it) { try { auto const x = std::to_string(dist(rng)); std::cout << x << '\n'; *it = x; } catch (...) { vec.erase(vec.begin(), it.base()); throw; } } } 

This is still not as simple and readable as the original, but it minimises computational complexity, requires no extra storage, and retains the same semantics even in the face of exceptions.

\$\endgroup\$
2
  • 1
    \$\begingroup\$The exception issue is very true and skipped my mind, I definitely have to revisit it. I question whether reserve is correct though. You are overriding the std::vector reallocating algorithm even though the library has all the information it needs upon reallocation, i.e. it knows that insert needs space for n more elements. Typically reserve makes sense only when the library doesn't have the information, e.g. repeated push_back.\$\endgroup\$
    – 友人A
    CommentedFeb 24 at 3:14
  • \$\begingroup\$@友人A, good point about reserve() - it's helpful in foo() but redundant in my final code. Thanks for pointing that out.\$\endgroup\$CommentedFeb 24 at 7:20
4
\$\begingroup\$

I don't think you can get much better than bar. What you can do is split it apart, so that at the top level you have meaningful names.

std::vector<std::string> generate_random_strings(size_t count, std::mt19937& rng) { std::vector<std::string> result; result.reserve(count); std::generate_n(std::back_inserter(result), count, [&]{ return std::to_string(dist(rng)); }); return result; } void print_strings(const std::vector<std::string>& strings) { for (auto& str : strings) { std::cout << str << '\n'; } } void prepend_reversed(std::vector<std::string>& to, std::vector<std::string> from) { to.insert(to.begin(), std::make_move_iterator(from.rbegin()), std::make_move_iterator(from.rend())); } void bar(std::vector<std::string>& vec, size_t target, std::mt19937& rng) { if(vec.size() >= target) return; auto xs = generate_random_strings(target - vec.size(), rng); print_strings(xs); prepend_reversed(vec, std::move(xs)); } 
\$\endgroup\$
2
  • \$\begingroup\$I think you can do better than bar() - see my answer for avoiding creation of the vector of results, and for inserting (only) the successful ones when printing (or string creation) throws.\$\endgroup\$CommentedFeb 22 at 17:20
  • \$\begingroup\$@TobySpeight better by the posed criteria, which includes being understood by the author of the initial version\$\endgroup\$
    – Caleth
    CommentedFeb 23 at 15:15

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.