2
\$\begingroup\$

Problem

Given a list and an integer chunk size, divide the list into sublists of at most chunk-size elements; only the last sublist may be shorter.

Code

For practice, I've solved the problem in JavaScript and Python; PHP has a built-in function for it (array_chunk). If you'd like to review the code and recommend any changes or improvements, please do so; I'd really appreciate it.

Python

    """
    # Problem

    # Given a list and chunk size, divide the list into sublists
    # where each sublist is of chunk size

    # --- Examples: (Input List, Chunk Size) => Output List
    # ([1, 2, 3, 4], 2) => [[ 1, 2], [3, 4]]
    # ([1, 2, 3, 4, 5], 2) => [[ 1, 2], [3, 4], [5]]
    # ([1, 2, 3, 4, 5, 6, 7, 8], 3) => [[ 1, 2, 3], [4, 5, 6], [7, 8]]
    # ([1, 2, 3, 4, 5], 4) => [[ 1, 2, 3, 4], [5]]
    # ([1, 2, 3, 4, 5], 10) => [[ 1, 2, 3, 4, 5]]
    """

    from typing import TypeVar, List
    import math

    T = TypeVar('T')


    def chunk_with_slice(input_list: List['T'], chunk_size: int) -> List['T']:
        if chunk_size <= 0 or not isinstance(input_list, list):
            return False

        start_index = 0
        end_index = chunk_size
        iteration = len(input_list) // chunk_size
        chunked_list = input_list[start_index:end_index]
        output_list = []
        output_list.append(chunked_list)

        if (len(input_list) % chunk_size != 0):
            iteration += 1

        for i in range(1, iteration):
            start_index, end_index = end_index, chunk_size * (i + 1)
            chunked_list = input_list[start_index:end_index]
            output_list.append(chunked_list)

        return output_list


    def chunk_slice_push(input_list: List['T'], chunk_size: int) -> List['T']:
        if chunk_size <= 0 or not isinstance(input_list, list):
            return False

        output_list = []
        start_index = 0
        while start_index < len(input_list):
            chunked_list = input_list[start_index: int(start_index + chunk_size)]
            output_list.append(chunked_list)
            start_index += chunk_size

        return output_list


    if __name__ == '__main__':
        # ---------------------------- TEST ---------------------------
        DIVIDER_DASH = '-' * 50
        GREEN_APPLE = '\U0001F34F'
        RED_APPLE = '\U0001F34E'

        test_input_lists = [
            [10, 2, 3, 40, 5, 6, 7, 88, 9, 10],
            [-1, -22, 34],
            [-1.3, 2.54, math.pi, 4, '5'],
            [math.pi, 2, math.pi, 4, 5, 'foo', 7, 8, 'bar', math.tau, math.inf, 12, math.e]
        ]

        test_expected_outputs = [
            [
                [10, 2],
                [3, 40],
                [5, 6],
                [7, 88],
                [9, 10]
            ],
            [
                [-1],
                [-22],
                [34]
            ],
            [
                [-1.3, 2.54, math.pi],
                [4, '5']
            ],
            [
                [math.pi, 2, math.pi, 4, 5],
                ['foo', 7, 8, 'bar', math.tau],
                [math.inf, 12, math.e]
            ]
        ]

        test_chunk_sizes = [2, 1, 3, 5]

        count = 0
        for test_input_list in test_input_lists:
            test_output_form_chunk_slice = chunk_with_slice(
                test_input_list, test_chunk_sizes[count])
            test_output_form_chunk_slice_push = chunk_slice_push(
                test_input_list, test_chunk_sizes[count])

            print(DIVIDER_DASH)
            if test_expected_outputs[count] == test_output_form_chunk_slice and test_expected_outputs[count] == test_output_form_chunk_slice_push:
                print(f'{GREEN_APPLE} Test {int(count + 1)} was successful.')
            else:
                print(f'{RED_APPLE} Test {int(count + 1)} failed.')

            count += 1

JavaScript

    function chunk_with_push(input_list, chunk_size) {
        const output_list = [];

        for (let element of input_list) {
            const chunked_list = output_list[output_list.length - 1];

            if (!chunked_list || chunked_list.length === chunk_size) {
                output_list.push([element]);
            } else {
                chunked_list.push(element);
            }
        }

        return output_list;
    }

    function chunk_with_slice(input_list, chunk_size) {
        var start_index, end_index, chunked_list;
        const output_list = [];

        start_index = 0;
        end_index = chunk_size;
        chunked_list = input_list.slice(start_index, end_index);
        output_list.push(chunked_list);

        iteration = Math.floor(input_list.length / chunk_size);
        if (input_list.length % chunk_size != 0) {
            iteration += 1;
        }

        for (i = 1; i < iteration; i++) {
            start_index = end_index;
            end_index = chunk_size * (i + 1);
            chunked_list = input_list.slice(start_index, end_index);
            output_list.push(chunked_list);
        }

        return output_list;
    }

    function chunk_slice_push(input_list, chunk_size) {
        const output_list = [];
        let start_index = 0;

        while (start_index < input_list.length) {
            output_list.push(input_list.slice(start_index, parseInt(start_index + chunk_size)));
            start_index += chunk_size;
        }

        return output_list;
    }

    // ---------------------------- TEST ---------------------------
    divider_dash = '-----------------'
    green_apple = '🍏'
    red_apple = '🍎'

    const test_input_lists = [
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        [1, 2, 3],
        [1, 2, 3, 4, 5],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
    ];

    const test_expected_outputs = [
        [
            [1, 2],
            [3, 4],
            [5, 6],
            [7, 8],
            [9, 10]
        ],
        [
            [1],
            [2],
            [3]
        ],
        [
            [1, 2, 3],
            [4, 5]
        ],
        [
            [1, 2, 3, 4, 5],
            [6, 7, 8, 9, 10],
            [11, 12, 13]
        ]
    ];

    const test_chunk_sizes = [2, 1, 3, 5];

    var count = 0;
    for (let test_input_list of test_input_lists) {
        const test_output_form_chunk_push = chunk_with_push(test_input_list, test_chunk_sizes[count]);
        const test_output_form_chunk_slice = chunk_with_slice(test_input_list, test_chunk_sizes[count]);
        const test_output_form_chunk_slice_push = chunk_slice_push(test_input_list, test_chunk_sizes[count]);

        console.log(divider_dash);
        if (JSON.stringify(test_expected_outputs[count]) === JSON.stringify(test_output_form_chunk_push) &&
            JSON.stringify(test_expected_outputs[count]) === JSON.stringify(test_output_form_chunk_slice) &&
            JSON.stringify(test_expected_outputs[count]) === JSON.stringify(test_output_form_chunk_slice_push)
        ) {
            console.log(green_apple + ' Test ' + parseInt(count + 1) + ' was successful.');
        } else {
            console.log(red_apple + ' Test ' + parseInt(count + 1) + ' failed.');
        }

        count++;
    }

PHP

    // Problem

    // Given a list and chunk size, divide the list into sublists
    // where each sublist is of chunk size

    // --- Examples: (Input List, Chunk Size) => Output List
    // ([1, 2, 3, 4], 2) => [[ 1, 2], [3, 4]]
    // ([1, 2, 3, 4, 5], 2) => [[ 1, 2], [3, 4], [5]]
    // ([1, 2, 3, 4, 5, 6, 7, 8], 3) => [[ 1, 2, 3], [4, 5, 6], [7, 8]]
    // ([1, 2, 3, 4, 5], 4) => [[ 1, 2, 3, 4], [5]]
    // ([1, 2, 3, 4, 5], 10) => [[ 1, 2, 3, 4, 5]]

    function builtInChunk($input_list, $input_size)
    {
        if ($input_size <= 0 or !is_array($input_list)) {
            return false;
        }

        return array_chunk($input_list, $input_size);
    }

    // ---------------------------- TEST ---------------------------
    $divider_dash = PHP_EOL . '-----------------' . PHP_EOL;
    $green_apple = '🍏';
    $red_apple = '🍎';

    $test_input_lists = [
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        [1, 2, 3],
        [1, 2, 3, 4, 5],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
    ];

    $test_expected_outputs = [
        [
            [1, 2],
            [3, 4],
            [5, 6],
            [7, 8],
            [9, 10],
        ],
        [
            [1],
            [2],
            [3],
        ],
        [
            [1, 2, 3],
            [4, 5],
        ],
        [
            [1, 2, 3, 4, 5],
            [6, 7, 8, 9, 10],
            [11, 12, 13],
        ],
    ];

    $test_chunk_sizes = [2, 1, 3, 5];

    foreach ($test_input_lists as $key => $test_input_list) {
        $test_output_form_chunk_slice = builtInChunk($test_input_list, $test_chunk_sizes[$key]);

        print($divider_dash);
        if ($test_expected_outputs[$key] === $test_output_form_chunk_slice) {
            print($green_apple . ' Test ' . intval($key + 1) . ' was successful.');
        } else {
            print($red_apple . ' Test ' . intval($key + 1) . ' failed.');
        }
    }

Output

    --------------------------------------------------
    🍏 Test 1 was successful.
    --------------------------------------------------
    🍏 Test 2 was successful.
    --------------------------------------------------
    🍏 Test 3 was successful.
    --------------------------------------------------
    🍏 Test 4 was successful.
\$\endgroup\$
1
  • 1
    \$\begingroup\$Rather than posting redundant advice for the PHP snippet, apply AJNeufeld's advice about throwing and catching. An off-topic question: Why not merge this account with your Stack Overflow account?\$\endgroup\$ Commented Oct 16, 2019 at 5:26

3 Answers

3
\$\begingroup\$

Argument Checking and Duck-typing

Two problems with this code:

    if chunk_size <= 0 or not isinstance(input_list, list):
        return False

If the arguments are the correct type, the function returns a list of lists. If not, it returns False? This is going to complicate the code which uses the function. It is better to raise an exception:

    if chunk_size <= 0:
        raise ValueError("Chunk size must be positive")
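To illustrate what this changes for the caller, here is a minimal sketch. The name `chunk_or_raise` and its slicing body are assumptions made for the example, not the code under review.

```python
def chunk_or_raise(items, chunk_size):
    """Hypothetical helper: split items into chunks, raising on a bad size."""
    if chunk_size <= 0:
        raise ValueError("Chunk size must be positive")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


# The caller can iterate over the result without first checking for False.
for chunk in chunk_or_raise([1, 2, 3, 4, 5], 2):
    print(chunk)

# A bad argument surfaces at the call site instead of travelling on as False.
try:
    chunk_or_raise([1, 2, 3], 0)
except ValueError as error:
    print(f"Rejected: {error}")
```

With a `False` return value, every caller would need an `if result is False:` branch before looping; with the exception, only callers that can actually handle the problem need to mention it.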

Secondly, you're requiring the input list to actually be a list. There are many Python types which can behave like a list, but are not instances of list. Tuples and strings are two easy examples. It is often better to just accept the type provided, and try to execute the function with the data given. If the caller provides something that looks like a duck and quacks like a duck, don't complain that a drake is not a duck.
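The contrast can be sketched as follows. `strict_chunk` mirrors the guard from the question, while `duck_chunk` is a hypothetical duck-typed variant that only assumes the argument supports `len()` and slicing.

```python
def strict_chunk(input_list, chunk_size):
    """Mirrors the question's guard: anything that is not a list is rejected."""
    if chunk_size <= 0 or not isinstance(input_list, list):
        return False
    return [input_list[i:i + chunk_size] for i in range(0, len(input_list), chunk_size)]


def duck_chunk(sequence, chunk_size):
    """Hypothetical duck-typed variant: only needs len() and slicing."""
    if chunk_size <= 0:
        raise ValueError("Chunk size must be positive")
    return [sequence[i:i + chunk_size] for i in range(0, len(sequence), chunk_size)]


print(strict_chunk((1, 2, 3), 2))  # False, although a tuple slices just fine
print(duck_chunk((1, 2, 3), 2))    # [(1, 2), (3,)]
print(duck_chunk("abcde", 2))      # ['ab', 'cd', 'e']
```

Note that each chunk keeps the type of the input: tuples come back as tuples and strings as strings.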

Oddity

In this statement ...

chunked_list = input_list[start_index: int(start_index + chunk_size)] 

... what is that int() for? It seems unnecessary.

Return Type

You are breaking the list into multiple lists. Your return is not just a simple list; it is a list of lists. List[List[T]].
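As a sketch, the question's `chunk_slice_push` with the corrected annotation could look like this. The `ValueError` guard and the dropped `int()` call are assumptions carried over from the other points in this review.

```python
from typing import List, TypeVar

T = TypeVar("T")


def chunk_slice_push(input_list: List[T], chunk_size: int) -> List[List[T]]:
    """Same loop as in the question; the return type is now a list of lists."""
    if chunk_size <= 0:
        raise ValueError("Chunk size must be positive")

    output_list: List[List[T]] = []
    start_index = 0
    while start_index < len(input_list):
        # Slicing past the end is safe: the last chunk is simply shorter.
        output_list.append(input_list[start_index:start_index + chunk_size])
        start_index += chunk_size
    return output_list


print(chunk_slice_push([1, 2, 3, 4, 5], 4))  # [[1, 2, 3, 4], [5]]
```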

Itertools

Python comes with many tools built in. Check out the itertools module. In particular, look at the grouper() recipe. It reduces your "chunking" code to two statements.

    >>> from itertools import zip_longest
    >>>
    >>> input_list = [10, 2, 3, 40, 5, 6, 7, 88, 9, 10, 11]
    >>> chunk_size = 2
    >>>
    >>> args = [iter(input_list)] * chunk_size
    >>> chunks = list(zip_longest(*args))
    >>>
    >>> chunks
    [(10, 2), (3, 40), (5, 6), (7, 88), (9, 10), (11, None)]
    >>>

As demonstrated above, it produces slightly different output. The last chunk is padded with None values until it is the proper length. You could easily filter out these extra values.

    >>> chunks[-1] = [item for item in chunks[-1] if item is not None]
    >>> chunks
    [(10, 2), (3, 40), (5, 6), (7, 88), (9, 10), [11]]
    >>>

If your list can contain None values, then a sentinel object could be used instead. Revised code:

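    from itertools import zip_longest
    from typing import List, TypeVar

    T = TypeVar('T')

    def chunk_itertools(input_list: List[T], chunk_size: int) -> List[List[T]]:
        sentinel = object()
        args = [iter(input_list)] * chunk_size
        # Pad the final group with the sentinel rather than None
        chunks = list(zip_longest(*args, fillvalue=sentinel))
        if chunks:
            # Strip the padding from the last chunk (skipped for empty input)
            chunks[-1] = [item for item in chunks[-1] if item is not sentinel]
        return chunks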
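To see why the sentinel matters, here is a small self-contained sketch with `None` values in the data. `chunk_with_sentinel` is a hypothetical variant of the recipe that strips padding from every group, so each chunk comes back as a list rather than a tuple.

```python
from itertools import zip_longest
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def chunk_with_sentinel(items: Iterable[T], chunk_size: int) -> List[List[T]]:
    """Hypothetical grouper variant that tolerates None in the input."""
    if chunk_size <= 0:
        raise ValueError("Chunk size must be positive")
    sentinel = object()  # unique object, so it cannot collide with real data
    args = [iter(items)] * chunk_size  # the same iterator, repeated
    return [
        [item for item in group if item is not sentinel]
        for group in zip_longest(*args, fillvalue=sentinel)
    ]


print(chunk_with_sentinel([1, None, 3, None, 5], 2))
# [[1, None], [3, None], [5]]
```

Filtering on `None` instead would have dropped the genuine `None` entries from the data; the identity check against a private object only removes the padding.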

JavaScript & PHP reviews left to others

\$\endgroup\$
    3
    \$\begingroup\$

    You have some odd stuff going on in chunk_with_slice; specifically these lines:

        def chunk_with_slice(input_list: List['T'], chunk_size: int) -> List['T']:
            if chunk_size <= 0 or not isinstance(input_list, list):
                return False

    First, you're using strings to represent type vars (List['T']). As mentioned in the comments, strings are used to reference types that can't be referred to directly, such as needing a forward reference, or referencing something in a separate file that can't be imported. You don't need a forward reference here though. I'd just use T:

    def chunk_with_slice(input_list: List[T], chunk_size: int) -> List[T]: 

    Second, think about those first two lines. You're basically saying "input_list is a list. But if it's not a list, return false". If you say that input_list is a list, you shouldn't be allowing other types in. You've told the type checker to help you catch type errors, so ideally you shouldn't also be trying to recover from bad data at runtime. If you expect that other types may be passed in, specify that using Union (such as Union[List, str] to allow a List and str in).


    I also agree with @AJ's "It is often better to just accept the type provided, and try to execute the function with the data given." comment. I'd specify input_list to be a Sequence instead. A Sequence is a container capable of indexing/slicing and len. This covers lists, tuples, strings, and others; including user-defined types. Forcing your caller to use a specific container type is unnecessary in most circumstances.
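A minimal sketch of a `Sequence`-typed signature, including a user-defined container, might look like this. `chunk_sequence` and the `Playlist` class are hypothetical names invented for the example.

```python
from collections import abc
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk_sequence(items: Sequence[T], chunk_size: int) -> List[Sequence[T]]:
    """Hypothetical helper: accepts anything that supports len() and slicing."""
    if chunk_size <= 0:
        raise ValueError("Chunk size must be positive")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


class Playlist(abc.Sequence):
    """Hypothetical user-defined sequence wrapping a list of track names."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)

    def __getitem__(self, index):
        # Slices return a new Playlist; integer indexes return a single track.
        if isinstance(index, slice):
            return Playlist(self._tracks[index])
        return self._tracks[index]


print(chunk_sequence((1, 2, 3), 2))  # [(1, 2), (3,)]
print([list(part) for part in chunk_sequence(Playlist(["a", "b", "c"]), 2)])
# [['a', 'b'], ['c']]
```

The function body never mentions `list`, so the type checker and the runtime behavior agree: any container that implements the `Sequence` protocol is welcome.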

    \$\endgroup\$
    2
    • \$\begingroup\$PEP484 allows type-hints to be given as strings, to allow forward references.\$\endgroup\$
      – AJNeufeld
      Commented Oct 16, 2019 at 13:32
    • \$\begingroup\$@AJNeufeld Ahh, thanks. A forward reference isn't necessary here though. I'll update my answer.\$\endgroup\$ Commented Oct 16, 2019 at 14:01
    2
    \$\begingroup\$

    For sequences (list, str, tuple, etc.), this is rather straightforward:

        def chunks(sequence, size):
            return [sequence[index:index+size] for index in range(0, len(sequence), size)]
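As a quick check, the one-liner can be run against the examples listed in the question's problem statement. The `examples` table below is copied from those comments; the loop around it is just scaffolding for this sketch.

```python
def chunks(sequence, size):
    return [sequence[index:index + size] for index in range(0, len(sequence), size)]


# (input list, chunk size) => expected output, as listed in the question
examples = [
    (([1, 2, 3, 4], 2), [[1, 2], [3, 4]]),
    (([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]]),
    (([1, 2, 3, 4, 5, 6, 7, 8], 3), [[1, 2, 3], [4, 5, 6], [7, 8]]),
    (([1, 2, 3, 4, 5], 4), [[1, 2, 3, 4], [5]]),
    (([1, 2, 3, 4, 5], 10), [[1, 2, 3, 4, 5]]),
]

for (sequence, size), expected in examples:
    assert chunks(sequence, size) == expected

print("All examples pass.")
```

One thing to be aware of: a size of 0 makes `range()` raise a `ValueError` on its own, but a negative size quietly yields an empty list, so an explicit check is still worth adding if negative sizes are possible.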
    \$\endgroup\$
