Leetcode two sum

Question

I'm currently learning c++ coming from a python background, so I'll include a solution in python and in c++ for the following problem statement:

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target. You may assume that each input would have exactly one solution, and you may not use the same element twice. You can return the answer in any order.

Example 1:

Input: nums = [2,7,11,15], target = 9

Output: [0,1]

Example 2:

Input: nums = [3,2,4], target = 6

Output: [1,2]

I would like to hear your feedback / suggestions for performance improvements / other suggestions. Here's the link

two_sum.py

def two_sum(nums: list, target: int): for i, n in enumerate(nums): match = target - n if match in (rest := nums[i + 1:]): match_at = rest.index(match) return i, match_at + i + 1 if __name__ == '__main__': if result := two_sum([2, 7, 11, 15], 22): print(f'Indices:\n{result}') else: print('No matches found')

Leetcode stats:

Runtime: 772 ms, faster than 36.98% of Python online submissions for Two Sum. Memory Usage: 14.4 MB, less than 49.82% of Python online submissions for Two Sum.

two_sum.h

#ifndef LEETCODE_TWO_SUM_H #define LEETCODE_TWO_SUM_H #include <iostream> #include <vector> using std::vector; using std::cout; using std::endl; vector<int> two_sum_solution(vector<int> &nums, int target) { vector <int> results; for (int i = 0; i < nums.size(); ++i) { int match = target - nums[i]; for (int j = i + 1; j < nums.size(); ++j) { if (nums[j] == match) { for (int index_match : { i, j }) results.push_back(index_match); } } } return results; } #endif //LEETCODE_TWO_SUM_H

main.cpp

#include <vector> #include "two_sum.h" using std::vector; int main() { vector<int> v1{2, 7, 11, 15}; vector<int> v = two_sum_solution(v1, 22); if (!v.empty()) { cout << "Indices:" << endl; for (auto i: v) cout << i << " "; } else (cout << "No matches found"); }

Leetcode stats:

Runtime: 384 ms, faster than 34.03% of C++ online submissions for Two Sum. Memory Usage: 9.3 MB, less than 12.99% of C++ online submissions for Two Sum.

Marc · Accepted Answer · 2020-11-05 05:17:04Z

11

I am not an expert in C++ but I can give a feedback about the Python solution.

Your current solution runs in \$O(n^2)\$. Basically, for each number n of the input nums, find target - n in nums. How to improved it?

The second part of the algorithm can be improved from \$O(n)\$ to \$O(1)\$. Instead of looking up target - n in a list, you can use a dictionary:

def two_sum(nums: list, target: int): num_index = {} for i, n in enumerate(nums): match = target - n if match in num_index: return num_index[match], i num_index[n] = i return -1

Results:

Original: Runtime: 772 ms. Memory Usage: 14.4 MB Improved: Runtime: 48 ms. Memory Usage: 15.5 MB

edited Nov 5, 2020 at 5:17

answered Nov 1, 2020 at 7:07

Marc

5,6842 gold badges14 silver badges35 bronze badges

\$\begingroup\$If performance is of importance, I would suggest using get() to avoid duplicate dict-lookup. That is, replace match = target - n with match = num_index.get(target - n) , and then replace the if-statement with if match is not None: , and finally, return match, i. I believe there might be some gain here, if the nums param is a rather long list. Please let me know if you disagree or prove me wrong. Cheers.\$\endgroup\$
– Wololo
CommentedNov 2, 2020 at 12:58
\$\begingroup\$@magnus I think that with your suggestion there should be a little improvement but LeetCode gives similar results. Maybe with a proper benchmark would be possible to notice the difference.\$\endgroup\$
– Marc
CommentedNov 2, 2020 at 14:28
\$\begingroup\$@magnus So you think making every check slower just to avoid one lookup will make it overall faster?\$\endgroup\$
– superb rain
CommentedNov 5, 2020 at 8:34
\$\begingroup\$@superbrain Yes, I believed so, at least for (very) large dicts. However, according to this SO post, it is not: stackoverflow.com/questions/39582115/… . My assumption was based on the accepted answer of this CR post: codereview.stackexchange.com/questions/231409\$\endgroup\$
– Wololo
CommentedNov 5, 2020 at 9:17
\$\begingroup\$@magnus Hmm, you said it's to avoid duplicate lookup, though. So the "for large list/dict" doesn't make sense, as the larger they get, the less that single avoided duplicate lookup matters.\$\endgroup\$
– superb rain
CommentedNov 5, 2020 at 11:42

| Show 4 more comments

Rish · Accepted Answer · 2020-11-01 07:14:30Z

Only include the header files that you need

In your two_sum.h file, you don't need iostream, since you're not using any of its functionality. Remember that #include literally copy-pastes the file, so if you're including this header file in multiple files, it might potentially slow down your compilation times.

Split declarations and definitions

Typically, you would split your files into two parts: the header file (normally ending with *.h, *.hpp, *.hh) and the source file (normally ending with *.cpp, *.cc). The header file only consists of the declarations and the source file contains the implementation.

So in your case, your header file will look like this:

two_sum.h

#ifndef LEETCODE_TWO_SUM_H #define LEETCODE_TWO_SUM_H #include <vector> std::vector<int> two_sum_solution(std::vector<int> &nums, int target); #endif // LEETCODE_TWO_SUM_H

and your source file will look like this:

two_sum.cpp

#include "two_sum.h" std::vector<int> two_sum_solution(std::vector<int> &nums, int target) { ... }

In fact, if you try to include your two_sum.h (with the implementation) into multiple files, you would be breaking the One-Definition Rule. Your source files would contain multiple definitions of the same function, and the linker will spit out an error. One way to get around is to mark the functions inline, but you most likely want to do the former.

No `using namespace` in the header files

Don't do using namespace or any of its variant in a header file. Since the header file is copy pasted throughout multiple source files, it has a potential to cause annoying errors. See here

Use const reference

Since two_sum_solution is not modifying the nums vector, pass it by const reference.

size_t vs int for array indices

Consider using size_t instead of int for array indices

Use `auto` as much as possible

There are a couple of instances in your code where you can use auto instead of specifying the type. Examples:

auto match = target - nums[i];auto v = two_sum_solution(v1, 22);

The inner-most loop is pointless

Simply do

results.push_back(i); results.push_back(j);

Also, once you've found the solution, you might want to return the result immediately.

Thanks, that's very informative, i have a few questions: what if i'm going to include the header file that will only contain declarations, in which .cpp file should the definition be given that i need to include it in several files? And regarding using const, auto and size_t whenever relevant, does this improve performance? I'm asking because as you must know, in python these things do not exist so i'm learning the hows and whys — watch-this, CommentedNov 1, 2020 at 7:58
Your header file should be included in all the source files that will want to your functions (for e.g. your main.cpp will include two_sum.h). Your functions should be defined in a separate source file (two_sum.cpp). During compilation, the linker will resolve all references. They don't improve performance. Since you're passing by reference, it makes sense to use const so you don't do anything that will accidentally modify the vector. auto is just a directive for the compiler to deduce the type, and size_t is the canonical type for array indices. — Rish, CommentedNov 1, 2020 at 8:44

hjpotter92 · Accepted Answer · 2020-11-01 07:01:54Z

You can perhaps improve the performance by creating a map of value -> index in the first iteration over the given array.

Currently, Your program does the following (time complexity):

iterate over all index, value pairs of the array (\$ O(n) \$)
search for target - value in the array (\$ O(n) \$)
lookup index of target - value (\$ O(n) \$)

And since these are all nested, you get to \$ O(n^2) \$ (it isn't \$ n^3 \$ because last lookup is not being done for each iteration).

My proposed solution:

Create a map/dict of {value: index} (\$ O(n) \$)
Iterate over index, value of array (\$ O(n) \$)
Lookup and return index from the map/dict (\$ O(1) \$)

def two_sum(numbers: list[int], target: int): lookup: dict = { value: index for index, value in enumerate(numbers) } for index, value in enumerate(numbers): match = target - value if search_index := lookup.get(match): return index, search_index return None

That would fail the cases where the target is double the number ex: [1,3,4,2] and 6, will output (1, 1) which is incorrect — watch-this, CommentedNov 1, 2020 at 7:10
Adding another if-clause to verify search_index != index is still faster than iterating over the array again. — hjpotter92, CommentedNov 1, 2020 at 7:12
@HeapOverflow I was referring to the iteration in OP's code: match in (rest := nums[i + 1:]) or int j = i + 1; j < nums.size(); ++j — hjpotter92, CommentedNov 2, 2020 at 10:41
@HeapOverflow Marc's rewrite is obviously better. Although it uses the same style as my proposed solution above. Mine was a verbatim iteration of the 3 points. — hjpotter92, CommentedNov 2, 2020 at 13:33
You can combine the two loops into one, which has the additional advantage of terminating as soon as the second item is found. — Mad Physicist, CommentedNov 2, 2020 at 20:48

user1717828 · Accepted Answer · 2020-11-01 18:30:07Z

This is interesting to me because I come from a C background and started using Python the past few years for work, so I've had the reverse path as you. When I started Python, I greatly preferred solutions like yours because looping through lists is so explicit and clear.

However, I since learned that more proficient Python programmers at work understand my code better when I use the standard library. Once I began to invest in learning those tools, it had the double effect of 1) making my code more succinct and 2) being more efficient in time and/or space.

In this case, I would solve the problem with combinations from the itertools package:

from itertools import combinations def two_sum(nums, target): pairs_with_indices = combinations(enumerate(nums), 2) # result is a generator comprehension. winning_pairs = ((index_i, index_j) for (index_i, i), (index_j, j) in pairs_with_indices if sum((i, j)) == target) # Insert as much error checking as you need... return next(winning_pairs)

There's probably an even better more succinct and clear solution using Numpy, which is effectively standard library in my line of work (data science) but that's not true everywhere.

One thing that's different than your code: there is no room for off-by-one-errors. In my experience, code like this

if match in (rest := nums[i + 1:]): match_at = rest.index(match) return i, match_at + i + 1

is easy for me to write, hard to read and maintainability spans the whole gambit from easy to impossible. In other words, managing indices manually in Python gives me just enough rope to hang myself with, and standard library functions have been a great alternative.

I tried the very same solution at first and despite itertools being very efficient, the solution for some reason is rejected "time limit exceeded" but it's what i think is the most readable solution but may not scale well because it's O(N^2) — watch-this, CommentedNov 1, 2020 at 18:01
@bullseye, Good point. I made the list comprehension a generator comprehension and only return the first value, so we don't continue calculating after we find something. Out of curiosity, could you share the input nums/target so I could try the time test? — user1717828, CommentedNov 1, 2020 at 18:31
The input nums are already provided on the leetcode platform, if you want you may test there directly. The examples were Input: nums = [2,7,11,15], target = 9 and Input: nums = [3,2,4], target = 6 — watch-this, CommentedNov 1, 2020 at 18:34
This adds unnecessary complexity to the problem. A dictionary would be a much more useful tool here. — Mad Physicist, CommentedNov 2, 2020 at 20:47

Toby Speight · Accepted Answer · 2020-11-04 17:09:04Z

Know your containers

std::unordered_map is your friend in this problem. Whenever you've never previously seen a number, simply use the operator[] or insert function to add the number and its index. When using find, it will return an iterator, which is a key-value pair.

eg: auto location = m.find(numToFind);

location->first is your key, and location->second is your value

When you return, don't use push_back

You can simply return an initializer list like: {i,j}.

Stack Exchange Network

Leetcode two sum

5 Answers 5

Only include the header files that you need

Split declarations and definitions

two_sum.h

two_sum.cpp

No `using namespace` in the header files

Use const reference

size_t vs int for array indices

Use `auto` as much as possible

The inner-most loop is pointless

Know your containers

When you return, don't use push_back

Linked

Hot Network Questions

Leetcode two sum

5 Answers 5

Only include the header files that you need

Split declarations and definitions

two_sum.h

two_sum.cpp

No using namespace in the header files

Use const reference

size_t vs int for array indices

Use auto as much as possible

The inner-most loop is pointless

Know your containers

When you return, don't use push_back

Linked

Related

Hot Network Questions

No `using namespace` in the header files

Use `auto` as much as possible