9
\$\begingroup\$

The problem given was: "You will be given an array of numbers. You have to sort the odd numbers in ascending order while leaving the even numbers at their original positions."

I developed a solution that works, but I want to know if I'm redundant or if I'm engaging in some bad habit. I'm relatively new to python. I see a lot of people talk about the idea of 'time complexity' and I'm curious if I should worry about that at a beginner level?

Below is my solution:

def sort_array(source_array): z,e = [],[] o = list(enumerate(source_array)) [z.append(i) if i[1] % 2 == 1 else e.append(i) for i in o] z.sort(key = lambda z: z[1]) [z.insert(i[0],i) for i in e] return [l[1] for l in z] 

The way I see what is happening:

  1. Create two lists initially to serve as a landing space for the sorting I'll do.
  2. Create a set of Tuples to correspond each value to the original index
  3. Using list comprehension, add tuples to z and e depending on if they are odd or even
  4. Sort z (odd values list) targeting the source_array value
  5. Insert values from e into z based off the index from source_array
  6. return answer list by extracting target values from tuples.

It works, but I don't know enough to know if I'm displaying some bad habit?

\$\endgroup\$
0

    3 Answers 3

    14
    \$\begingroup\$

    Naming: avoid misleadingly generic names. If a function does something unusual (like sorting only the odd values in a sequence) don't be satisfied with a name implying that it does something ordinary (like sorting an array).

    Naming: when feasible, be specific and accurate. Your code relies on three cryptically named lists: z, e, and o. What do those names communicate to a reader? Not much. Meanwhile, during iteration, you use the variable name i, which is a conventional name in looping contexts for a sequence index -- but in this case the variable does not refer to an index but to an (INDEX, NUMBER) tuple.

    Don't use list comprehensions for their side effects. A list comprehension creates a new list. If that's your goal, using a comprehension might be a good idea, and people reading your code won't be surprised. But if your goal is something else -- for example, iterating over one list in order to append-to or insert-into some other list -- use a regular for-loop.

    Avoid situations where the algorithm relies on mystery indexes. Your code revolves around the index-number tuples mentioned above. That means the reader must read extra carefully to notice whether the index 0 or 1 is used when addressing one of those tuples. Why make us work so hard? Why create a situation prone to misreadings and mistakes in the first place? Python makes it easy to unpack those tuples into more communicative names: for example, return [n for i, n in tups].

    A simpler approach to consider. Your current approach has more moving parts than necessary. Here's a simpler plan: (1) extract the odd numbers, (2) sort them, (3) put them back into the spots previously occupied by an odd number.

    def sorted_odds_only(nums): ''' Takes a sequence of numbers. Returns a list with the odd numbers sorted and the even numbers left in their original positions. ''' odds = [n for n in nums if n % 2] odds.sort(reverse = True) return [ odds.pop() if n % 2 else n for n in nums ] 
    \$\endgroup\$
    1
    • 2
      \$\begingroup\$Thank you very much for the comprehensive and useful feedback!\$\endgroup\$CommentedJun 24, 2023 at 17:06
    9
    \$\begingroup\$

    Your solution is correct, however it is a little hard to read. You work with indices, which you hold in a tuple with the value. Such working with indices is very common in C-style languages, in Python you want to avoid this. Also your variable names are far from self-explaining.

    Algorithm

    I simply extract all odd numbers and sort them. Then I compose a list to return element wise from the input, if even, or from the sorted odd numbers otherwise.

    def sort_array(source_array): odds = sorted([n for n in source_array if n%2]) iter_odd = iter(odds) return [next(iter_odd) if n%2 else n for n in source_array] 

    Some remarks on your code line by line

    def sort_array(source_array):

    The trailing _array in the names is redundant and shall be omitted. The function name may have a trailing descriptive element and read e.g. sort_odd.

    z,e = [],[]

    These are non-descriptive short names in a bigger context. Names should read e.g. odds, evens. Also do a whitespace afte the comma. Initialize two variables in a single line is ok here, as both are clearly related and complementary. Both are not required in the next line - move the line down.

    o = list(enumerate(source_array))

    again bad name - write e.g. indexed

    [z.append(i) if i[1] % 2 == 1 else e.append(i) for i in o]

    comprehension

    This ins not a regular list comprehension, the resulting list holding the return values of the different append() calls is dropped, only the side effects persist. Write explicit loops in that case.

    tuples

    When working with tuples try to unpack to named vars

    [z.append((i, n)) if n % 2 == 1 else e.append((i, n)) for i, n in indexed] 

    or use namedtuple.

    from collections import namedtuple IndexedNumber = namedtuple("IndexedNumber", "i, n") indexed = [IndexedNumber(*x) for x in enumerate(l)] [z.append(inum) if inum.n % 2 == 1 else e.append(inum) for inum in indexed] 

    Avoid tuple indexing whenever possible, it is error prone and hard to maintain.

    z.sort(key = lambda z: z[1])

    Here you use z for the outer list and also for the lambda. A short name for the lambda is okay, but do not hide an outer var. This should read odds.sort(key=lambda e: e[1]) or with namedtuple odds.sort(key=lambda inum: inum.n)

    [z.insert(i[0],i) for i in e]

    Here you insert the even numbers at the correct positions into the list of odd numbers. Again you do this in a perverted comprehension. Use an explicit loop for "discarding" comprehensions. Also note that inserting is a costly operation. An algorithm "zipping" into a new list would be much better.

    return [l[1] for l in z]

    Again tuple indexing, this should read like return [num for _, num in z]

    Considering all that beautifying

    Your code reads now

    def sort_array(source_array): indexed = list(enumerate(source_array)) odds, evens = [], [] for idx, num in indexed: if num % 2 == 1: odds.append((idx, num)) else: evens.append((idx, num)) odds.sort(key = lambda e: e[1]) for idx, num in evens: odds.insert(idx, (idx, num)) return [num for _, num in odds] 

    For algorithmic improvement see the top of my answer

    \$\endgroup\$
    4
    • \$\begingroup\$Thank you very much for the comprehensive and useful feedback!\$\endgroup\$CommentedJun 24, 2023 at 17:05
    • \$\begingroup\$@Reinderien I appreciate the performance investigation, but you did not measure stephan's best implementation: the first one, using next(). I don't know if that approach will be fast or slow, but it's a nice technique to know about relative to the old-school (and counter-intuitive) reverse-then-pop approach that I learned back in my Perl days.\$\endgroup\$
      – FMc
      CommentedJun 25, 2023 at 18:48
    • \$\begingroup\$@FMc see edit: the first method presented is faster\$\endgroup\$CommentedJun 26, 2023 at 2:32
    • 1
      \$\begingroup\$@FMc, Reinderien - I'm late to the party :-). To me the pop reversed trick was new. That's why we are her, aren't we?\$\endgroup\$
      – stefan
      CommentedJun 26, 2023 at 13:03
    2
    \$\begingroup\$

    Let's talk about performance. There is a threshold in the neighbourhood of ~30-100 input size (depending on method and density of odd numbers) before which overhead of Numpy makes it slower, and beyond which pure Python code will never be faster than vectorized code.

    If you need to use pure Python as you are now, replace your approach with that of stefan. If you don't need to use pure Python, and especially if you expect long input, use Numpy. To compare,

    from timeit import timeit from typing import Sequence import matplotlib.pyplot as plt import numpy as np import pandas as pd import seaborn as sns from numpy.random import default_rng def orig(source_array: Sequence[int]) -> list[int]: z,e = [],[] o = list(enumerate(source_array)) [z.append(i) if i[1] % 2 == 1 else e.append(i) for i in o] z.sort(key=lambda z: z[1]) [z.insert(i[0],i) for i in e] return [l[1] for l in z] def fmc(nums: Sequence[int]) -> list[int]: odds = [n for n in nums if n % 2] odds.sort(reverse=True) return [ odds.pop() if n % 2 else n for n in nums ] def stefan(source_array: Sequence[int]) -> list[int]: odds = sorted(n for n in source_array if n%2) iter_odd = iter(odds) return [next(iter_odd) if n%2 else n for n in source_array] def with_array_index(source_array: np.ndarray) -> np.ndarray: output = source_array.copy() odds_idx, = (source_array % 2).nonzero() output[odds_idx] = np.sort(output[odds_idx]) return output METHODS = orig, stefan, fmc, with_array_index def test() -> None: slybot_input = np.array([0, 1, 9, 5, 4, 8, 2, 4, 1, 2, 3, 2, 10, 11, 25, 25, 24, 35, 1, 2, 3, 4, 5]) expected = np.array([0, 1, 1, 1, 4, 8, 2, 4, 3, 2, 3, 2, 10, 5, 5, 9, 24, 11, 25, 2, 25, 4, 35]) for method in METHODS: orig_input = slybot_input.copy() output = method(slybot_input) assert np.all(output == expected) assert np.all(orig_input == slybot_input) def time() -> None: rand = default_rng(seed=0) sizes = (10**np.linspace(start=1, stop=4, num=30)).astype(int) measurements = [] for size in sizes: for odd_density in (0.1, 0.5, 0.9): np_inputs = 2*rand.integers(low=-1000, high=1000, size=size) np_inputs[rand.random(size) < odd_density] += 1 list_inputs = np_inputs.tolist() all_inputs = (list_inputs,) * (len(METHODS)-2) + (np_inputs,)*2 reps = 2 * sizes.max() // size for _ in range(reps): for method, inputs in zip(METHODS, all_inputs): def run(): return method(inputs) dur = timeit(stmt=run, number=1) measurements.append((size, odd_density, method.__name__, dur)) df = pd.DataFrame( data=measurements, columns=('size', 'odd_density', 'method', 'duration'), ) fig, ax = plt.subplots() sns.lineplot(ax=ax, data=df, x='size', y='duration', hue='method', style='odd_density') ax.set_xscale('log') ax.set_yscale('log') plt.show() if __name__ == '__main__': test() time() 

    comparison

    \$\endgroup\$

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.