8
\$\begingroup\$

Implementation of the vector class in C++ for learning purposes. Tried to implement interesting parts from the API shown on cppreference.com, but some elements like custom allocator support are missing.

template <typename T> class Vector { public: using value_type = T; using size_type = std::size_t; using difference_type = std::ptrdiff_t; using reference = value_type&; using const_reference = const value_type&; using pointer = value_type*; // Iterator support using iterator = pointer; using const_iterator = const pointer; Vector(): data{nullptr}, m_size{0}, m_capacity{0} {} explicit Vector(size_type capacity) : Vector() { resize(capacity); } Vector(size_type count, const_reference value): Vector() { resize(count, value); } template<typename InputIt> Vector(InputIt first, InputIt last) : Vector() { assign(first, last); } Vector(std::initializer_list<T> init) : Vector() { assign(init.begin(), init.end()); } Vector(const Vector& other): Vector() { assignRange(other.begin(), other.end()); } Vector& operator=(const Vector& other) { Vector temp = other; swap(temp); return *this; } Vector(Vector&& other) noexcept : Vector() { swap(other); } Vector& operator=(Vector&& other) noexcept { swap(other); return *this; } ~Vector() { clear(); deallocate(data); } size_type size() const noexcept { return m_size; } reference operator[](size_type index) { return data[index]; } void resize(size_type count, const_reference value = T()) { if (count < m_size) { for (size_type i = count; i < m_size; i++) { data[i].~T(); } } else if (count > m_size) { if (count > m_capacity) { reserve(count); } for (size_type i = m_size; i < count; i++) { new (data + i) T(value); } } m_size = count; } void reserve(size_type count) { if (m_capacity < count) { pointer newData = allocate(count); if (m_size > 0) { std::uninitialized_move(data, data + m_size, newData); } deallocate(data); data = newData; m_capacity = count; } } void reserve_if_needed() { if (m_size == m_capacity) { if (m_capacity == 0) { reserve(1); } else { reserve(m_capacity * 2); } } } void push_back(const_reference item) { reserve_if_needed(); data[m_size++] = item; } void pop_back() { // The standard says pop_back() on an empty vector is // undefined behavior, so this check is possibly unnecessary // since implementations can technically do whatever // in case of undefined behavior? if (m_size > 0) { data[m_size - 1].~T(); m_size -= 1; } } template<typename... Args> void emplace_back(Args&&... args) { reserve_if_needed(); new (data + m_size) T(std::forward<Args>(args)...); m_size++; } void shrink_to_fit() { if (m_capacity > m_size) { pointer new_data = allocate(m_size); if (m_size > 0) { std::uninitialized_move(data, data + m_size, new_data); } deallocate(data); data = new_data; m_capacity = m_size; } } void swap (Vector& other) noexcept { using std::swap; swap(data, other.data); swap(m_size, other.m_size); swap(m_capacity, other.m_capacity); } // Iterator support iterator begin() const { return data; } iterator end() const { return data + m_size; } iterator insert(iterator pos, const_reference item) { return insert(pos, 1, item); } iterator insert(iterator pos, size_type count, const_reference item) { size_type index = pos - data; size_type remaining = m_size - index; if (m_capacity < m_size + count) { reserve(m_size + count); } std::uninitialized_move(data + index, data + m_size, data + index + count); std::uninitialized_fill(data + index, data + index + count, item); m_size += count; return data + index; } iterator erase(iterator pos) { return erase(pos, pos + 1); } iterator erase(iterator first, iterator last) { size_type n_elements = last - first; size_type index = first - data; for (size_type i = index; i < index + n_elements; i++) { data[i].~T(); } std::move(data + index + n_elements, data + m_size, data + index); m_size -= n_elements; return first; } template<typename InputIt> void assign(InputIt first, InputIt last) { assignRange(first, last); } void clear() noexcept { for (std::size_t i = 0; i < m_size; i++) { data[i].~T(); } m_size = 0; } template<typename InputIt> void assignRange(InputIt first, InputIt last) { clear(); resize(std::distance(first, last)); std::uninitialized_copy(first, last, data); } void assign(size_type count, const_reference value) { Vector temp(count, value); swap(temp); } private: pointer data = nullptr; size_type m_size; size_type m_capacity; pointer allocate(size_type count) { return static_cast<pointer>(::operator new(count * sizeof(value_type))); } void deallocate(pointer p) { if (p != nullptr) { ::operator delete(p); } } }; 
\$\endgroup\$
5
  • \$\begingroup\$Something interesting - GCC uses 3 pointers for their vector implementation, so instead of (start, size, capacity), they have (start, end, buff_end)\$\endgroup\$CommentedJul 20, 2023 at 18:38
  • \$\begingroup\$I've removed the beginner tag, as implementing a vector is very much not beginner-level material. And I don't see any beginner mistakes - very much the opposite, with proper uninitialized-memory handling. The only thing lacking is a decent set of unit tests.\$\endgroup\$CommentedJul 20, 2023 at 20:11
  • \$\begingroup\$BTW, I strongly recommend running your unit tests under Valgrind+memcheck, or similar tool. That's an essential development step for any memory-management class.\$\endgroup\$CommentedJul 21, 2023 at 7:51
  • \$\begingroup\$Thanks @TobySpeight, will look into the best practices for writing C++ unit tests and give it a shot for this class!\$\endgroup\$
    – jdav22
    CommentedJul 21, 2023 at 18:35
  • \$\begingroup\$You have a few bugs relating to destroying an element .~T() and then acting as if it is a valid element on which you call assignment operations. That's a bug. You gotta to use placement new to construct the object instead after deletion. I'd rewrite places where it is uses unnecessarily. Also in assignment operation it is not a good practice to consistently allocate new memory, it is inefficient for such a basic-class. You should reuse current memory if possible.\$\endgroup\$
    – ALX23z
    CommentedJul 23, 2023 at 3:50

2 Answers 2

6
\$\begingroup\$

As vector is such a common thing we review here I have written up a series of articles about implementing a vector.

Overview

Very good.

I found one major bug in push_back(). You have potential memory leaks in reserve() and shrinktofit() that are easy to fix. You can simplify your assignment operator (currently you have copy and move versions) they can be combined into a single version that works for both.

Minor comments about some missing functionaity that is in std::vector.

Code review

Sure you can have empty vectors.

 Vector(): data{nullptr}, m_size{0}, m_capacity{0} {} 

But I am wondering if the best strategy is not to simply always allocate capacity for minimal size. How often is a vector allocated but not used?

But pointed out by @chrysante below, the std::vector default constructor is noexcept so can't have any memory allocation (as that can potentially throw). So if you want to go that route you can mark this default constructor noexcept.

 Vector() noexcept : data{nullptr} , m_size{0} , m_capacity{0} {} 

One comment on style. Like in variable declarations in the code blocks its nice to have member initialization in the constructor one per line (its easier to read). You are not trying to save vertical space.


Slight deviation from the interface of std::vector! Sure you can do it. But it will confuse people. Also I use it to simplify things below. I have the same constructor in my class but mine is private so it can only be used internally.

 explicit Vector(size_type capacity) : Vector() { resize(capacity); } 

You can simplify these two assignments into a single method:

 Vector& operator=(const Vector& other) { Vector temp = other; swap(temp); return *this; } Vector& operator=(Vector&& other) noexcept { swap(other); return *this; } 

Easy to write as:

 // Notice the parameter is value. // If passed a RValue it is moved into the parameter. // If passed an LValue it is copied. // So you get the same effect with less code. Vector& operator=(Vector other) noexcept { swap(other); return *this; } 

This is good:

 reference operator[](size_type index) { return data[index]; } 

But what about accesses to a const Vector? Just because you can't modify does not mean you can't use the operator[] on it.

 const_reference operator[](size_type index) const { return data[index]; } 

While we are here: Why is there no at() method?


 void resize(size_type count, const_reference value = T()) { // Not valid here ^^^^^^ // That should be in the header only 

This is fine. But std::vector destroys them from back to front. Just like how it deletes elements during destruction. This is to mimic the behavior of the C-style array (objects are destroyed in reverse order of creation).

 if (count < m_size) { for (size_type i = count; i < m_size; i++) { data[i].~T(); } } else if (count > m_size) { if (count > m_capacity) { reserve(count); } for (size_type i = m_size; i < count; i++) { new (data + i) T(value); } } m_size = count; } 

 void reserve(size_type count) { if (m_capacity < count) { pointer newData = allocate(count); // Note sure why you need to check against 0 here. // Copy zero data is a NOOP so would not be more expensive. // If I had to guess this is actually a pesimization as you // code has an extra branch. if (m_size > 0) { // A new function to me. // Thats really cool. std::uninitialized_move(data, data + m_size, newData); } deallocate(data); data = newData; m_capacity = count; } } 

The one thing here I would watch is that you have a potential leak.

It's hard to spot. But if the type T does not support move construction then the compiler will use the copy constructor during the std::uninitialized_move. If one of the copy constructors fails (i.e. throws) then you will leave this function needing to clean up newData. Though your function does provide the strong exception guarantee.

You can make this simpler by re-using the Vector :-)

 void reserve(size_type count) { if (m_capacity < count) { Vector temp(count) // Use your vector with pre-reserved size. // Remember that Vector is a friend of Vector // So you can reach into the other class and mess with // its members (just remember to unit test). std::uninitialized_move(data, data + m_size, temp.data); temp.m_size = m_size; swap(temp); } } 

This is broken. You are pushing into uninitialized memory so you need to construct the object in place.

 void push_back(const_reference item) { reserve_if_needed(); data[m_size++] = item; // Should be this. new (std::addressof(data[m_size++])) T(item); } 

What about pushing an RVALUE?

Replace the above with:

 // Note: Pass by value. // RVALUE are moved into item then moved to container. // LVALUE are copied into item then moved to the container. void push_back(T item) { reserve_if_needed(); new (std::addressof(data[m_size++])) T(std::move(item)); } 

 if (m_size > 0) { // Why not swap the next two lines? // This would make the line to call the destructor // simpler and easier to read as you don't need the -1 data[m_size - 1].~T(); m_size -= 1; } 

Same issue as reserve()

 void shrink_to_fit() { if (m_capacity > m_size) { pointer new_data = allocate(m_size); if (m_size > 0) { std::uninitialized_move(data, data + m_size, new_data); } deallocate(data); data = new_data; m_capacity = m_size; } } 

Same solution:

 void shrink_to_fit() { if (m_capacity > m_size) { Vector temp(m_size); // Vector with reserved capacity std::uninitialized_move(data, data + m_size, temp.data); temp.m_size = m_size; swap(temp); } } 

Sure standard iterators:

 // Iterator support iterator begin() const { return data; } iterator end() const { return data + m_size; } 

But what about const iterators or reverse iterators or const reverse iterators? Also when calling begin() on a const object you will get a const_iterator.

 iterator begin() { // normal begin const_iterator begin() const { // begin on const object const_iterator cbegin() const { // explicitly asking for const reverse_iterator rbegin() { // normal rbegin const_reverse_iterator rbegin() const { // rbegin on const object const_reverse_iterator crbegin() const { // explicitly asking for re 

etc


To be similar to C-style arrays (and the C++ standard idiom that objects are destroyed in reverse order of creation), you should destroy the members in reverse order.

 void clear() noexcept { for (std::size_t i = 0; i < m_size; i++) { data[i].~T(); } m_size = 0; } 

No need to check for null pointers here!

 void deallocate(pointer p) { if (p != nullptr) { ::operator delete(p); } } 

Questions:

  1. What is the slight deviation from the standard interface in the explicit Vector(size_type capacity)? Is it that the param is named 'capacity' and not 'count?

If you look at the standard (I link to a non standard but reputable source) std::vector::vector you will see there is no constructor that takes a "capacity" (ie. you have allocated space but zero size). There is one that takes a size and fills it with values (but you have that one).

  1. When you say "T()" should be in the header only, do you mean the template declaration?

No. Default Parameters should be defined in the class header file only. They are not part of the declaration:

class Vector { void resize(size_type count, const_reference value = T()); // This is good. // Reading the interface you see you can have a default value. }; // Don't define the default parameters here. // It should generate an error if you do this. void Vector::resize(size_type count, const_reference value) { // STUFF } 
  1. How does constructing the item in place in push_back() resolve the uninitialized memory issue?

This is a problem becasue:

 // This code uses the assignment operator. data[m_size++] = item; // The assignment operator assumes that the object // referenced on the left hand side has already been constructed // but in your case that is not true, this is unintialized memory. // So you are using assignment to uninitialized memory // which could be anything and thus with non trivial T // will cause an issue. 

This is solved by constructing in place:

 // Should be this. new (std::addressof(data[m_size++])) T(item); 

The constructor assumes the memory has not been constructed before. You pass the address to operator new it will not allocate space but simply use the pointer provided as the location and then call the constructor for the type T to correctly initialize the memory.

  1. do you have any references or documentation on the "pass by value" idiom for avoiding the two assignment operators?

Nope. There are lots of questions on Stackoverflow that go over this though. Should be simple to find.

  1. And does the SFINAE approach involve checking std::is_nothrow_move_constructible or something in overload resolution? I should look into how this is done in some compiler implementation

Yep.

\$\endgroup\$
14
  • 2
    \$\begingroup\$If you conform to the spec, you can't allocate in the default constructor. It's noexcept iff the allocator is noexcept default constructible and so is std::allocator. Allocating any buffer could potentially throw. explicit Vector(size_type capacity); as implemented is not a deviation from the standard, only the parameter is named confusingly (should be count or similar). void push_back(T item); you should not declare push_back this way. If T is not moveable, the copy constructor may be called twice when push_back is invoked and if T is expensive to copy this is a problem.\$\endgroup\$
    – chrysante
    CommentedJul 21, 2023 at 8:15
  • 1
    \$\begingroup\$begin() should return an iterator and begin() const should return a const_iterator. And the destruction order of std::vector is unspecified. Good review otherwise :-)\$\endgroup\$
    – chrysante
    CommentedJul 21, 2023 at 8:18
  • 1
    \$\begingroup\$About the "take parameters by value instead of const& and && overloads" idiom: In general I strongly agree with this, but in generic code you can't really do it because you never know how expensive a move or a copy of T really is and in certain situations the object will be moved/copied twice.\$\endgroup\$
    – chrysante
    CommentedJul 21, 2023 at 8:24
  • \$\begingroup\$@chrysante Interesting, this is the first I hear about potentially copying objects twice. In what situations does this happen? Oh, I see what you mean, if you don’t know what T is, you can’t assume it can be moved. Is that it?\$\endgroup\$CommentedJul 21, 2023 at 14:11
  • 1
    \$\begingroup\$@jdav22 Questions answered above. See end of text above.\$\endgroup\$CommentedJul 21, 2023 at 20:32
4
\$\begingroup\$

Overall you did a great job already, implementing containers is a very non trivial task. The other review already points out many things, I will add what I think is still missing.


In the list of typedefs you forgot the following:

using const_pointer = value_type const*; using reverse_iterator = std::reverse_iterator<iterator>; using const_reverse_iterator = std::reverse_iterator<const_iterator>; 

And note that a const_iterator is not a const iterator but rather an iterator to const. The way you declared it, it is simply an iterator that is const but you want it to be an iterator that iterates over a range of constant elements, so you could declare it this way:

using const_iterator = value_type const*; 

This constructor is fine, but you might want to change the parameter name to count or something similar.

explicit Vector(size_type count) : Vector() { resize(count); } 

template<typename InputIt> Vector(InputIt first, InputIt last) : Vector() { // (*) assign(first, last); } 

This constructor is a bit tricky. Cppreference says this this about it:

This overload participates in overload resolution only if InputIt satisfies LegacyInputIterator, to avoid ambiguity with the overload (3).

Where "overload (3)" is Vector(size_type count, const T& value). You need to use SFINAE or a C++20 requires clause to restrain the argument types. If you don't do this, this will happen:

Vector<int> v(10, 1); // We want to construct a Vector holding 10 ints with value 1 // But overload (*) will be selected because // it's a better match and compilation will // fail (if you're lucky) 

The pre C++20 iterator concept is tricky to implement, especially with SFINAE. A preliminary solution could be to implement the pre C++11 solution and to require InputIt to not be integral using SFINAE:

template<typename InputIt, typename std::enable_if<!std::is_integral<T>::value, int>::type = 0> Vector(InputIt first, InputIt last) : Vector() { assign(first, last); } 

or using a requires clause:

template<typename InputIt> requires (!std::integral<InputIt>) Vector(InputIt first, InputIt last) : Vector() { assign(first, last); } 

In the resize method you don't need the if (count < m_size) and if (count > m_size) checks. Simply do this:

void resize(size_type count, const_reference value = T()) { for (size_type i = count; i < m_size; i++) { data[i].~T(); } if (count > m_capacity) { reserve(count); } for (size_type i = m_size; i < count; i++) { new (data + i) T(value); } m_size = count; } 

The for loops already perform both checks so they are redundant.


In the reserve and the shrink_to_fit method you need to call the destructor of the elements in the old buffer after calling std::uninitialized_move Also consider using guard clauses to have shallower indentation.

void reserve(size_type new_cap) { if (m_capacity >= new_cap { return; } pointer newData = allocate(new_cap); std::uninitialized_move(data, data + m_size, newData); std::destroy(data, data + m_size); deallocate(data); data = newData; m_capacity = new_cap; } 

If you don't want to traverse the elements twice write a for loop and move and destroy manually.

Note that a conforming implementation of shrink_to_fit can be a no-op:

void shrink_to_fit() {} 

push_back looks fine but to add an overload for r-values you can do this to make use of perfect forwarding:

void push_back(value_type const& item) { push_back_impl(item); } void push_back(value_type&& item) { push_back_impl(std::move(item)); } template <typename U> void push_back_impl(U&& item) { reserve_if_needed(); ::new (&data[m_size++]) value_type(std::forward<U>(item)); } 

Note that you could do this:

void push_back(value_type item) { reserve_if_needed(); ::new (&data[m_size++]) value_type(std::move(item)); } 

Now the implementation is very simple, but it can perform two moves/copies if invoked like this:

Vector<MyType> v; MyType t; v.push_back(t); 

t will be copied into push_back's argument and then copied or moved into the vector. Depending on how expensive a move/copy of value_type is, this can be fine. Unfortunately we don't know what value_type is going to be, so this is suboptimal.

But you don't need any SFINAE tricks here as suggested by the other answer.

Also I would like to add, that in any other (non-template) context, I would always do what the other answer suggested and just accept the potential double-move. But in a template context you can't do it because you have to consider all possible potentially rogue types as value_type.


Your implementation of pop_back is fine if you don't mind the added overhead of the check for empty().


Your insert method has a bug. If index + count is less than m_size, std::uninitialized_move will construct new objects over existing ones. I can add a working implementation later, but this is tricky to implement. Make sure to add unit tests that check if all elements that have been constructed also get destroyed.


Make all your helper methods private.


Rename the buffer pointer to m_data. The name data is reserved for the .data() method:

pointer data() { return m_data; } const_pointer data() const { return m_data; } 

I think there are more issues, I will edit this later.

\$\endgroup\$
4
  • \$\begingroup\$Thanks, super helpful! Wish I could accept multiple answers, but have upvoted yours.\$\endgroup\$
    – jdav22
    CommentedJul 23, 2023 at 22:44
  • 1
    \$\begingroup\$If I understand correctly, you would not overload on lvalue vs rvalue reference and use the by-value operator if you knew the type and knew that a move was not too expensive, but in a templated container you choose to overload because the type could be non-movable or expensive to move?\$\endgroup\$
    – jdav22
    CommentedJul 23, 2023 at 22:45
  • 1
    \$\begingroup\$@jdav22 Yes, exactly\$\endgroup\$
    – chrysante
    CommentedJul 24, 2023 at 15:47
  • \$\begingroup\$Sorry to revive an old thread, but just noticed the comment about the bug in insert(). As I understand it, the only issue here is not all elements in the vector get destroyed, correct? I'm not sure how to write a test to determine whether an element has been destroyed or not, do you have any suggestions? Also, to implement this correctly and destroy all the elements, the only strategy I can think of is to copy the overlapping elements into some other container, destroy what's in the overlap, then copy back, but there must be something more efficient. Any thoughts? Thanks! @chrysante\$\endgroup\$
    – jdav22
    CommentedAug 23, 2023 at 23:58

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.