3
\$\begingroup\$

I'm trying to find all possible paths made of 0 from bottom left to top right, only by moving either up or to the right. The input file contains a grid of characters, where 1 represents an obstacle and 0 represents a vacant cell.

The problem I have is the code not finishing/ taking too long. The code works correctly with smaller inputs like this, but on a large file with 5000 rows and 5000 characters in each row it never finishes even after 20 minutes.

import sys sys.setrecursionlimit(900000000) file_e = open('file.txt','r') file = file_e.readlines() file_e.close() file.reverse() for index,line in enumerate(file): file[index] = line.split(' ') print(len(file)-1, len(file[0])-1) def end(file,y,x): if y == len(file)-1 and x == len(file[0])-1: return 1 try: if file[y+1][x] == '0': up = True except: pass try: if file[y][x+1] == '0': right = True except: pass try: if up == True and right == True: return end(file,y+1,x) + end(file,y,x+1) except: pass try: if up == True: return end(file,y+1,x) except: pass try: if right == True: return end(file,y,x+1) except: pass if file[y+1][x] == '1' or file[y][x+1] == '1': return 0 print(end(file,0,0)) 
\$\endgroup\$
8
  • \$\begingroup\$I don't see any comments about why this is broken, and at a glance I don't see anything broken, so voting to leave open.\$\endgroup\$CommentedMay 23, 2018 at 20:54
  • \$\begingroup\$Does the code work fine on smaller inputs?\$\endgroup\$
    – Phrancis
    CommentedMay 23, 2018 at 20:55
  • \$\begingroup\$yes it works with smaller inputs but with really large inputs like the one mentioned above doesn't stop running even after 20 minutes\$\endgroup\$
    – user170389
    CommentedMay 23, 2018 at 21:22
  • \$\begingroup\$Would this be an example of a smaller input??\$\endgroup\$CommentedMay 23, 2018 at 21:35
  • \$\begingroup\$yes, it would be\$\endgroup\$
    – user170389
    CommentedMay 23, 2018 at 21:38

3 Answers 3

3
\$\begingroup\$

No optimization will make this run effectively for large inputs because fundamentally the "all paths problem" is hard. In short, as the size of the input increases incrementally, the number of paths increases exponentially. Your best option may be to ask "what is a similar question I can ask that will give me a useful answer for less computational expense."

In terms of code quality, using try-catch structures for branching is often considered bad form. In many languages this would be a performance issue, but from my 30 seconds of googling it appears the performance impact isn't huge in python.

One adjustment you could make is to parse the file into an adjacency matrix and then using a known algorithm to solve for your lists of paths.

\$\endgroup\$
    3
    \$\begingroup\$

    Any solution using recursion is doomed to fail for a 5000×5000 grid. A path from one corner to the opposite corner would take 10000 steps, and thus the algorithm end up with a stack that is 10000 function calls deep. It will surely crash with a stack overflow error before that, if you have the patience.

    Furthermore, without , you will be doing repeated work for multiple paths that lead to the same intermediate point.

    \$\endgroup\$
      2
      \$\begingroup\$

      Several issues with your code which is causing you issues (this won't give you the final code but it will help you get there).

      Firstly, never hide issues by modifying the stack recursion, and never wrap operations in try: except: pass. The errors are there because you are doing operations the computer does not like (and you shouldn't be doing). As an analogy, you could drive sometimes on the wrong side of the road to get places faster - but you don't because it breaks convention.

      Next, update your code to be more python'ish. For example:

      import sys sys.setrecursionlimit(900000000) file_e = open('file.txt','r') file = file_e.readlines() 

      at the top of your code becomes:

      # coding=utf-8 def end(file, y, x): up = None right = None if y == len(file) - 1 and x == len(file[0]) - 1: 

      and the bottom of your code goes from:

       if file[y+1][x] == '1' or file[y][x+1] == '1': return 0 print(end(file,0,0)) 

      into:

      if __name__ == "__main__": with open('data.txt', 'r') as f: file = f.readlines() file.reverse() for index, line in enumerate(file): file[index] = line.split(' ') print(f"File data shows: {len(file) - 1} and {len(file[0]) - 1}") print(end(file, 0, 0)) 

      As for the point I made about about removing the try: except: pass: code, your statements should go from:

      try: if up == True and right == True: return end(file,y+1,x) + end(file,y,x+1) except: pass 

      into:

      if up and right: return end(file, y + 1, x) + end(file, y, x + 1) 

      because up==True is the same as up etc.

      When I run your code after making those changes, I do want to ask why you have the file "split" happening. When you perform that, (if I use the smaller example data on pastebin you linked to) I get the following structure as file:

      <class 'list'>: [['00000011\n'], ['11100001\n'], ['11100000\n'], ['00000001\n'], ['00000000\n']] 

      Is this an intentional design? If I look at the file data before the split happens, it is:

      <class 'list'>: ['00000011\n', '11100001\n', '11100000\n', '00000001\n', '00000000\n'] 

      which is easier to work with.

      Anyway, hope these code suggestions help you to write better code and update the design of your code to make the solution work for all data sets.

      Good luck!

      \$\endgroup\$