4
\$\begingroup\$

I have implemented a radix sort algorithm in Python 3. It first finds the maximum number of digits in the list, then changes them to strings and adds 0's. For example, [7, 23, 107, 1, 53] to ['007', '023', '107', '001', '053']. I then make a new matrix, [[] * 10]. I append the numbers with last digit of 0 in lst[0], numbers with last digit of 1 in lst[1], etc. I then flatten the matrix into a list, and recurse with the lst, and with the position one less than the one before (in this case, the second to last digit). Can I get advice on how to improve it? Here is my code:

def radix_sort(lst): '''list -> list''' #finds maximum number of digits in list num_of_digits = max([len(str(x)) for x in lst]) #Adds 0s to the front of the number if necessary and changes it to a string for x in range(len(lst)): lst[x] = '0' * (num_of_digits - len(str(lst[x]))) + str(lst[x]) #helper function to sort based on the position def helper(lst, pos): '''list, int -> list''' #places numbers with 0 in the position in the ans[0], 1 in the position in ans[1], etc. ans = [[] for _ in range(10)] #adding numbers to the position in ans for x in lst: ans[int(x[pos])].append(x) #flattened list, reusing lst to save memory lst = [] for x in ans: for y in x: lst.append(y) #we have sorted the whole list if pos == 0: return lst #recurse again with smaller position return helper(lst, pos - 1) #changing the strings to integers and returning return [int(x) for x in helper(lst, num_of_digits - 1)] 
\$\endgroup\$

    2 Answers 2

    1
    \$\begingroup\$
     #flattened list, reusing lst to save memory lst = [] 

    That doesn't save memory. The caller of the current helper call still has a reference to the "old" list, so it'll remain in memory in addition to this new one. You'll have one list per call level. To actually save memory, you could do lst.clear() instead, so that all levels use the same single list.

    To further save memory, do del ans, x, y after you rebuilt lst from them before you call helper recursively, otherwise they'll remain in memory as well, again once per call level.

    For example for lst = ['1' * 800] * 1000, these improvements changed the peak memory usage from over 15 MB to less than 0.5 MB in this test of mine:

    import tracemalloc lst = ['1' * 800] * 1000 tracemalloc.start() radix_sort(lst) peak = tracemalloc.get_traced_memory()[1] print(peak) 

    Or just do it iteratively instead of recursively, then you likely wouldn't have these issues in the first place and you wouldn't have to worry about recursion limit errors due to large numbers.

    Here's an iterative one with a few other changes:

    def radix_sorted(lst): '''list -> list''' lst = list(map(str, lst)) width = max(map(len, lst)) lst = [s.rjust(width, '0') for s in lst] for i in range(width)[::-1]: buckets = [[] for _ in range(10)] for s in lst: buckets[int(s[i])].append(s) lst = [s for b in buckets for s in b] return list(map(int, lst)) 
    \$\endgroup\$
    2
    • \$\begingroup\$Oh, thanks! I thought it was saving memory, I guess I was completely wrong. This will help my memory usage a lot. By the way, is there really a chance that I'll get a recursion limit error? The recursion depth is 1,000 which means it'll only pass recursion depth for 1000+ digit numbers, am I wrong?\$\endgroup\$
      – Sola Sky
      CommentedMar 23, 2021 at 23:04
    • \$\begingroup\$@SolaSky Yes, roughly at 1000 digits you'd hit the limit. Whether there's a chance that you have such numbers, I don't know, only you do :-). Btw I added an iterative one now.\$\endgroup\$CommentedMar 24, 2021 at 0:04
    1
    \$\begingroup\$

    It isn't necessary to convert the numbers to strings and zero-pad them.

    BASE = 10 def radix_sort(lst, base=BASE): biggest_number = max(lst) place = 1 while place <= biggest_number: buckets = [[] for _ in range(BASE)] for number in lst: index = number // place % BASE buckets[index].append(number) lst = [] for bucket in buckets lst.extend(bucket) place *= BASE return lst 
    \$\endgroup\$
    1
    • \$\begingroup\$You have presented an alternative solution, but haven't reviewed the code. Please explain your reasoning (how your solution works and why your solution is better than the original) so that the author and other readers can learn from your thought process.\$\endgroup\$
      – Peilonrayz
      CommentedMar 31, 2021 at 18:19

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.