Looks good.
Couple of things I would do differently (not that your way is wrong).
Rather than pass references to the containers around I would pass iterators into the containers. This allows your sort algorithm to be container agnostic:
void merge_sort(std::vector<int>& numbers) {} // My version looks like this template<typename I> // Notice the template void merge_sort(I begin, I end) // Just means I don't care what type {} // of iterator is used.
Same applies to merge()
.
Rather than creating sub arrays in merge_sort()
I would do it inside merge()
. With your current implementation you have a space complexity of \$O(N^2)\$. If you do it inside the merge
you just need to allocate enough space to merge the current two ranges which is at most \$O(2N)\$ => \$O(N)\$.
Where you have:
std::vector<int>::size_type middle = numbers.size() / 2; std::vector<int> left(numbers.begin(), numbers.begin() + middle); std::vector<int> right(numbers.begin() + middle, numbers.end()); merge_sort(left); merge_sort(right); // My version looks like this: std::size_t mid = length/2; I midPoint = std::next(begin, mid); // Merge in place. mergeSort(begin, midPoint); mergeSort(midPoint, end);
I could not work out how to merge without using a temporary (and I don;t have my copy of Knuth here). So my version of merge()
merges the two sorted sub vectors into a temporary then copies back over the original.
Looking at your merge code it's slightly hard to follow (but I groked it). I personally prefer a simpler version.
// In this loop: // l: current position in left sub-array // r: current position in right sub-array // i: current position into merged array. // Note because we are merging in-place. // begin/midPoint/end are iterators to the input arrays that // split it into two parts. while(l < midPoint && r < end) { if (*l < *r) { *i = *l; ++i; ++l; } else { *i = *r; ++i; ++r; } } // One of the ranges is empty at this point. // So only one of the loops will execute. while(l < midPoint) { *i = *l; ++i; ++l; } while(r < end) { *i = *r; ++i; ++r; }
A slight variation on this that I use; Where you use if () {} else {}
I prefer to use the Condition Operator
=>Test ? <TrueWork> : <FalseWork>
. I also have done this a few times and can safely compress ++
operations onto the same lines (Note I don't compress all of them; this is just personal preferences as I think it makes it easier to read this way). Which leaves me with:
while(l < midPoint && r < end) { *i = std::move((*l < *r) ? *l++ : *r++); ++i; } while(l < midPoint) { *i = std::move(*l++); ++i; } while(r < end) { *i = std::move(*r++); ++i; }
Notice: I use std::move()
here. This is because my sort works on generic containers (not just integer containers). So I may be sorting an array of strings.
Final result is:
#include <vector> #include <iostream> #include <algorithm> #include <iterator> template<typename I> void doMerge(I begin, I midPoint, I end) { typename std::vector<typename std::iterator_traits<I>::value_type> TmpVec; TmpVec tmp(std::make_move_iterator(begin), std::make_move_iterator(end)); TmpVec::iterator beginAlt = std::begin(tmp); TmpVec::iterator endAlt = std::end(tmp); TmpVec::iterator midAlt = std::next(beginAlt, std::distance(begin, midPoint)); TmpVec::iterator l = beginAlt TmpVec::iterator r = midAlt; I i = begin; while(l < midAlt && r < endAlt) { *i = std::move((*l < *r) ? *l++ : *r++); ++i; } while(l < midAlt) { *i = std::move(*l++); ++i; } while(r < endAlt) { *i = std::move(*r++); ++i; } } template<typename I> void mergeSort(I begin, I end) { std::size_t length = std::distance(begin, end); if (length <= 1) { return; } std::size_t mid = length/2; I midPoint = std::next(begin, mid); mergeSort(begin, midPoint); mergeSort(midPoint, end); doMerge(begin, midPoint, end); } int main() { std::vector<int> data {{ 5,12,45,2,67,8}}; mergeSort(std::begin(data), std::end(data)); std::copy(std::begin(data), std::end(data), std::ostream_iterator<int>(std::cout, ", ")); std::cout << "\n"; }