3
\$\begingroup\$

This is the "Maximum sub-array" problem from CodeChef:

Find out the maximum sub-array of non negative numbers from an array.

The sub-array should be continuous. That is, a sub-array created by choosing the second and fourth element and skipping the third element is invalid.

Maximum sub-array is defined in terms of the sum of the elements in the sub-array. Sub-array A is greater than sub-array B if sum(A) > sum(B).

NOTE 1: If there is a tie then compare the segment length's and return segment which has maximum length.

NOTE 2: If there is still a tie, then return the segment with minimum starting index.

This is my solution:

def maxset(array): """Returns maximum sub array of non-negative numbers""" max_sum = 0 current_sum = 0 start = 0 stop = 0 max_array = array[0] for i in range(len(array)): if array[i] >= 0: current_sum += array[i] stop += 1 if max_sum < current_sum: max_sum = current_sum if max_array < array[start:stop]: max_array = array[start:stop] elif max_sum == current_sum: if len(max_array)< len(array[start:stop]): max_array = array[start:stop] else : start = i+1 stop = start current_sum = 0 return max_array 

Test cases :

def main(): print maxset( [ 1, 2, 5, -7, 2, 5 ] ) print maxset( [ 1,-1, 1, -1, 1, -1]) print maxset( [ 1, 2, 5, 7, 2, 5 ] ) print maxset( [ 222]) print maxset( [ 1, 1]) print maxset( [ 3, -1, 1, 1, 1, -1, 3]) print maxset( [ 5, -1, 1, 1, 1, 1, -1, 5 ]) print maxset([ 3, -1, 1, 1, 1, -1, 2 ]) print maxset([ 3,-3, -1, 1, 1, 1, -1, 2 ]) if __name__ == '__main__': main() 

How can I improve this code ?

\$\endgroup\$
0

    2 Answers 2

    9
    \$\begingroup\$

    1. Bugs

    1. maxset returns the wrong answer for some inputs. For example:

      >>> maxset([2, 1, -1, 1, 3]) [2, 1] 

      The correct answer is [1, 3]. The cause of this bug is the line:

      if max_array < array[start:stop]: 

      which causes the sub-array with the higher sum (here, [1, 3]) to be ignored because it lexicographically precedes max_array (here, [2, 1]).

    2. maxset goes wrong when all the elements of the input are negative:

      >>> maxset([-1, -2]) -1 

      The number -1 is not a valid result: it's not the maximum sub-array, it's just a number.

      It also goes wrong when the input is empty:

      >>> maxset([]) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "cr193902.py", line 7, in maxset max_array = array[0] IndexError: list index out of range 

      To fix these two problems, instead of this initialization:

      max_array = array[0] 

      start with an empty list:

      max_array = [] 

    2. Review

    1. The code in the post uses the print statement and so only runs on Python 2.7. But support for Python 2.7 is scheduled to end in 2020. It is long past the time when it was a good idea to start writing new projects in Python 2.7. If you really can't use Python 3 then use:

      from __future__ import print_function 

      so that your code is portable to Python 3.

    2. The name maxset is misleading since it does not return a set.

    3. When you take a slice of a list, for example array[start:stop], Python has to copy out all the elements of the slice in order to construct a new list. This is a waste unless you are actually going to need the new list. So instead of:

      len(array[start:stop]) 

      (which creates a list and then throws it away again) you could write:

      stop - start 
    4. This line copies out a slice of the list each time you find a new maximum sub-array:

      max_array = array[start:stop] 

      but in many cases you are never going to need this list, because later on you will find an even bigger sub-array. So instead of copying out a slice, you could remember the location of the maximum sub-array found so far:

      max_start = start max_stop = stop 

      and at the end of the function

      return array[max_start:max_stop] 
    5. The code has a loop over the indexes into array:

      for i in range(len(array)): 

      and then within the loop it looks up the element at that index using array[i]. Whenever you see this pattern, you can simplify it using the built-in function enumerate. In this case the loop becomes:

      for i, element in enumerate(array): 

      and wherever you had array[i] you now have element.

    6. The two branches of the if: ... elif: ... both have the line:

      max_array = array[start:stop] 

      This duplication could be avoided by combining the two conditions into one:

      if (max_sum < current_sum or (max_sum == current_sum and max_stop - max_start < stop - start)): 

      and this can be further simplified by comparing tuples:

      if (max_sum, max_stop - max_start) < (current_sum, stop - start): 
    7. There's no need to maintain a running variable stop since it always has the value i + 1.

    8. The test cases don't check the results. When I run it, I get the output:

      >>> main() [1, 2, 5] [1] [1, 2, 5, 7, 2, 5] [222] [1, 1] [1, 1, 1] [5] [1, 1, 1] [1, 1, 1] 

      Is this right? The usual way to write a test case is to raise an exception if the result is incorrect. There are various ways to do this. You could use the assert statement and write:

      assert maxset([ 1, 2, 5, -7, 2, 5 ]) == [1, 2, 5] 

      or you could use the unittest module from the Python standard library.

    3. Revised code

    def max_sub_array(array): """Return maximum sub-array of non-negative numbers.""" start = 0 # Start of current sub-array current_sum = 0 # Sum of current sub-array max_start = 0 # Start of maximum sub-array max_stop = 0 # Stop of maximum sub-array max_sum = 0 # Sum of maxium sub-array for stop, element in enumerate(array, start=1): if element >= 0: current_sum += element if (max_sum, max_stop - max_start) < (current_sum, stop - start): max_sum = current_sum max_start = start max_stop = stop else: start = stop current_sum = 0 return array[max_start:max_stop] import unittest class TestMaxSubArray(unittest.TestCase): CASES = [ # array, expected result ([], []), ([-1, -2], []), ([2, 1, -1, 1, 3], [1, 3]), ([1, 2, 5, -7, 2, 5], [1, 2, 5]), ([1,-1, 1, -1, 1, -1], [1]), ([1, 2, 5, 7, 2, 5] , [1, 2, 5, 7, 2, 5]), ([222], [222]), ([1, 1], [1, 1]), ([3, -1, 1, 1, 1, -1, 3], [1, 1, 1]), ([5, -1, 1, 1, 1, 1, -1, 5], [5]), ([3, -1, 1, 1, 1, -1, 2], [1, 1, 1]), ([3, -3, -1, 1, 1, 1, -1, 2], [1, 1, 1]), ] def test_max_sub_array(self): for array, expected in self.CASES: self.assertEqual(expected, max_sub_array(array)) 

    4. A different approach

    The single responsibility principle say that each piece of code should do just one thing. By splitting code into pieces with single responsibilities, the individual pieces become easier to understand, check and test.

    So for this problem we could split the code into two parts:

    1. find all the sub-arrays of non-negative numbers in array;
    2. find the maximum sub-array in the results of step 1.

    For part 1 the Python standard library has a useful function itertools.groupby for splitting an iterable into groups based on a key function. In this case the key function needs to be "is the element non-negative". Then for part 2 we can use the built-in max function, again using a key function.

    from itertools import groupby def max_sub_array(array): """Return maximum sub-array of non-negative numbers.""" sub_arrays = [list(group) for non_negative, group in groupby(array, lambda e: e >= 0) if non_negative] if sub_arrays: return max(sub_arrays, key=lambda a: (sum(a), len(a))) else: return [] 

    In Python 3.4 or later, max has a default keyword argument to handle the case where there are no items, and so we can combine this into a single expression:

    from itertools import groupby def max_sub_array(array): """Return maximum sub-array of non-negative numbers.""" return max((list(group) for non_negative, group in groupby(array, lambda e: e >= 0) if non_negative), key=lambda a: (sum(a), len(a)), default=[]) 
    \$\endgroup\$
    4
    • \$\begingroup\$You say that the left operator in if max_array < array[start:stop]: is a number, but unless I am mistaken, it is a list (the best subarray found so far). The (lexicographical?) comparison still makes no sense there.\$\endgroup\$
      – Martin R
      CommentedMay 8, 2018 at 11:04
    • \$\begingroup\$Solid answer. Every time I read one of your answers I find something I myself miss. One day I'll remember groupby...\$\endgroup\$
      – Peilonrayz
      CommentedMay 8, 2018 at 11:06
    • \$\begingroup\$@MartinR It starts off as a number, max_array = array[0]. But is changed to a list just after the check, max_array = array[start:stop].\$\endgroup\$
      – Peilonrayz
      CommentedMay 8, 2018 at 11:08
    • \$\begingroup\$@MartinR: The lexicographic comparison is in fact a bug — see §1.1 in my revised answer. Thanks for the correction.\$\endgroup\$CommentedMay 8, 2018 at 11:11
    6
    \$\begingroup\$

    Your code is splitting numbers by negatives whilst performing the max sum code. You can simplify this, if you move the splitting out of the function.

    def split_negatives(array): ret = tuple() for item in array: if item >= 0: ret += (item,) elif ret: yield ret ret = tuple() if ret: yield ret 

    After this you can just use max. Since you're using Python 2, there's no default keyword if sections is an empty generator, and so we would need to use a tryexcept to deal with this.

    def maxset(array): sections = split_negatives(array) try: return max(sections, key=lambda i: (sum(i), len(i))) except ValueError: return 0 
    \$\endgroup\$
    0

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.