This post is based on the 0-1 Knapsack problem. I came across this problem in Assignment #4 of Professor Tim Roughgarden's course Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming on Coursera.
Question 1:
In this programming problem and the next you'll code up the knapsack algorithm from lecture.
Let's start with a warm-up. Download the text file below.
This file describes a knapsack instance, and it has the following format:
[knapsack_size][number_of_items]
[value_1] [weight_1]
[value_2] [weight_2]
...
For example, the third line of the file is "50074 659", indicating that the second item has value 50074 and size 659, respectively.
You can assume that all numbers are positive. You should assume that item weights and the knapsack capacity are integers.
In the box below, type in the value of the optimal solution.
My program:
def knapSack(W , wt , val , n): # Base Case if n == 0 or W == 0 : return 0 # If weight of the nth item is more than Knapsack of capacity # W, then this item cannot be included in the optimal solution if (wt[n-1] > W): return knapSack(W , wt , val , n-1) # Check for required value in lookup table first if (lookup[n][W]!=-1): return lookup[n][W] # Add to lookup table and return the maximum of two cases: # (1) nth item included # (2) not included else: x = max(val[n-1] + knapSack(W-wt[n-1] , wt , val , n-1), knapSack(W , wt , val , n-1)) lookup[n][W] = x return x #End of function knapSack #To test above function val = [] wt = [] with open("knapsack.txt") as f: first_line = f.readline().split() W = int(first_line[0]) n = int(first_line[1]) for line in f: words = line.split() val.append(int(words[0])) wt.append(int(words[1])) lookup = [[-1 for i in range(W+1)] for j in range(n+1)] print knapSack(W, wt, val, n)
This program gave me the output as 2493893, which matches with the official solution. So I presume that my code is correct. So far so good. The next question in the same assignment is as follows.
Question 2:
This problem also asks you to solve a knapsack instance, but a much bigger one.
Download the text file below.
This file describes a knapsack instance, and it has the following format:
[knapsack_size][number_of_items]
[value_1] [weight_1]
[value_2] [weight_2]
...
For example, the third line of the file is "50074 834558", indicating that the second item has value 50074 and size 834558, respectively. As before, you should assume that item weights and the knapsack capacity are integers.
This instance is so big that the straightforward iterative implemetation uses an infeasible amount of time and space. So you will have to be creative to compute an optimal solution. One idea is to go back to a recursive implementation, solving subproblems --- and, of course, caching the results to avoid redundant work --- only on an "as needed" basis. Also, be sure to think about appropriate data structures for storing and looking up solutions to subproblems.
In the box below, type in the value of the optimal solution.
My program (same as before):
def knapSack(W , wt , val , n): # Base Case if n == 0 or W == 0 : return 0 # If weight of the nth item is more than Knapsack of capacity # W, then this item cannot be included in the optimal solution if (wt[n-1] > W): return knapSack(W , wt , val , n-1) # Check for required value in lookup table first if (lookup[n][W]!=-1): return lookup[n][W] # Add to lookup table and return the maximum of two cases: # (1) nth item included # (2) not included else: x = max(val[n-1] + knapSack(W-wt[n-1] , wt , val , n-1), knapSack(W , wt , val , n-1)) lookup[n][W] = x return x #End of function knapSack #To test above function val = [] wt = [] with open("knapsack_big.txt") as f: first_line = f.readline().split() W = int(first_line[0]) n = int(first_line[1]) for line in f: words = line.split() val.append(int(words[0])) wt.append(int(words[1])) lookup = [[-1 for i in range(W+1)] for j in range(n+1)] print knapSack(W, wt, val, n)
The second question had mentioned that the ordinary iterative approach would not suffice and that we'd have to get back to the recursive approach and use appropriate caching. Fair enough. I had already used the recursive approach in my initial program and also implemented a lookup table for memoization purposes.
It worked perfectly fine on the smaller data set i.e. knapsack.txt
and took less than a second to execute. However, when I execute the program on the second file knapsack_big.txt
(which is considerably larger than knapsack.txt
) it takes forever to execute and in most cases ends up getting killed.
So, any idea how to speed up the program so that it can be executed on larger data sets? I suspect that there might be more efficient data structures (than a simple 2D list) for this problem, but not sure which.
knapsack.txt
as it's relatively short.\$\endgroup\$val
andwt
, you keep a list of tuples containing(val, wt)
? It appears you only use them for same-index values, so it might be faster on both the append and the look-up.\$\endgroup\$