6
\$\begingroup\$

I just had this question in an interview. I had to write some code to find all the common elements in two arrays. This is the code I wrote. I could only think of a 2-loop solution, but something tells me there must be a way to accomplish this with only 1 loop. Any ideas?

public List<Integer> findCommonElements(int[] arr1, int[] arr2) { List<Integer> commonElements = new ArrayList<>(); for (int i = 0; i < arr1.length; i++) { for (int j = 0; j < arr2.length; j++) { if (arr1[i] == arr2[j]) { commonElements.add(arr1[i]); break; } } } return commonElements; } 
\$\endgroup\$

    3 Answers 3

    4
    \$\begingroup\$

    You use a hashset, this means you have 2 loops, but not nested. You have a boolean hashset in this case where all values start at false of size k where k (this is typically what is used in Big O'notation) is the the number of possible integer values. You loop over your first array and for each value you go hashset[firstArray[i]] = true; once you have done this you loop over your second array, going if(hashset[secondArray[i]]) commonElements.add(secondArray[i]);.

    This is O(2n) which then becomes simply O(n) due to getting rid of the constants, your solution was O(n^2). Although it should be noted the storage required for using a hashset is considerably more.

    \$\endgroup\$
    2
    • \$\begingroup\$In this case, the size of the boolean array should be the maximum element for both arrays. You should have mentioned in your post, that in order to determine the k value, you would have to make additional loops. But something that really concerns me about this solution is that the maximum value could be 2 ^ 31 which results in wasting lots of memory.\$\endgroup\$
      – nullbyte
      CommentedMar 14, 2018 at 1:07
    • \$\begingroup\$@nullbyte You wouldn't need a second loop to determine k since you can simply set k to the max integer size in this case (2.147E9 4sf). You'r description of k is misleading, since you wouldn't set it to the max value but simply the number of different possible values in the input arrays. To properly calculate k in any circumstance you would need to use a hashset. The alternative solution that uses less memory is a linear search combined with a binary search which is O(nlog(n)). For the specific task I gave the best solution, there is no universally best applicable solution.\$\endgroup\$CommentedMar 15, 2018 at 16:58
    5
    \$\begingroup\$

    There is a reason why "Data Structures and Algorithms" has the data structures part added in, especially first. The reason why is data structures should be the first thing you think about even before writing any kind of algorithm.

    Lets take the specification of the code that you are given:

    Write some code to find all the common elements in two arrays.

    Now first thing is first this isn't specific enough, what do you mean by "common elements", does this include repeating elements so for example [1,1,2] and [1,1,3] would that be [1,1] or just [1]? For this I'm going to assume you mean elements non repeating.

    Now after we have established the specification which is

    Given two arrays of numbers find the common unique elements.

    I'd say these arrays are sounding a hell of a lot like a set data structure, and this set data structure, because we want to find the intersection between these two sets. In java this would be:

    public static void main(String[] args) { List<Integer> alist = Arrays.asList(new Integer[] {1,1,2}); List<Integer> blist = Arrays.asList(new Integer[] {1,1,3}); HashSet<Integer> aset = new HashSet<>(alist); aset.retainAll(blist); System.out.println(aset); } 

    It's very important to consider two things when someone gives you a spec, first think through is it clear enough, and secondly, after you have got an idea of what they want you want to strongly consider the data structure as these are structures that have been tried and tested to be efficient at specific tasks.

    In the program above, let A be the size of alist and B be the size of blist the time complexity is O(A + B) as it's O(1) to add an element to a hashset which we do for each element in A, then we loop though all elements in B because of the retainAll function needs to do a contains operation on each element, and that is O(1) for a HashSet. This is much more efficient than O(AB) ~ or O(n^2).

    edit:

    Thanks to nullbyte for pointing out that its more efficient to make one hashset and do ratainAll for just that single set. The code above has been adjusted.

    \$\endgroup\$
    7
    • \$\begingroup\$Where do you see the benefit of creating another hash set? You can just traverse the second list without allocating additional memory (as in the solution below). In your case, you allocate memory for the second hash set, traverse the second list in order to populate the second hash set, and then you call the retainAll() method which also requires O(n) time.\$\endgroup\$
      – nullbyte
      CommentedMar 14, 2018 at 1:27
    • \$\begingroup\$It depends on what you are optimising for, space and time have a give and take relationship. I'm sacrificing space for more time.\$\endgroup\$
      – James
      CommentedMar 14, 2018 at 1:29
    • \$\begingroup\$How can you get more time? In order to create a set, it takes O(n). Then you call the retainAll() method which takes O(n) as well. How do you get more time? From what I can see, it takes O(3 x n), doesn't it? But you can easily optimize it to O(2 x n) without creating the second hash set.\$\endgroup\$
      – nullbyte
      CommentedMar 14, 2018 at 1:34
    • \$\begingroup\$Lets pretend your list is now [1,1,1,1,1,1,...x10000000,2] you need to loop through all of the values in that list, to check to see if it is in the set. Wasted computation.\$\endgroup\$
      – James
      CommentedMar 14, 2018 at 1:35
    • \$\begingroup\$Sorry it's 1:40am here I'm a bit potato when it comes to my brain right now. So yes I think it would make sense to not make the second set and just feed it into the retainAll\$\endgroup\$
      – James
      CommentedMar 14, 2018 at 1:39
    2
    \$\begingroup\$

    So, the point of this solution is to put the first array into a hash set, and then traverse the second array checking if an element from the second array is present in the set.

    This solution requires O(n) time and O(n) space if the arrays have the same length.

     public static List<Integer> findCommon(int[] a, int[] b) { final Set<Integer> set = new HashSet<>(Arrays.stream(a).boxed().collect(Collectors.toList())); final List<Integer> result = new LinkedList<>(); for (int element : b) { if (set.contains(element)) { result.add(element); } } return result; } 

    Or a bit more concise solution:

     public static List<Integer> findCommon(int[] a, int[] b) { final Set<Integer> set = new HashSet<>(Arrays.stream(a).boxed().collect(Collectors.toList())); set.retainAll(Arrays.stream(b).boxed().collect(Collectors.toList())); return new ArrayList<>(set); } 
    \$\endgroup\$
    1

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.