No need to compute permutations
As you are provided all the b1,b2,…,bN−1
values, the problem boils down to finding the first value a1
and everything can be computed from there. In particular, you have only N
different values to try. We can try see them like this:
n = 5 b_array = [4, 6, 7, 6] for a in range(1, n+1): a_array = [a] for b in b_array: a = b - a a_array.append(a) print(a_array)
which gives
[1, 3, 3, 4, 2] # Duplicate 3 [2, 2, 4, 3, 3] # Duplicate 2 [3, 1, 5, 2, 4] # Looks good [4, 0, 6, 1, 5] # 6 is out of range [5, -1, 7, 0, 6] # -1, 7, 6, 0 are out of range
Because we go with increasing values of a1
, the first candidate that does match is the expected solution (the "lexicographically minimum").
Thus, you could just try to get rid of the impossible solution and you'll find the correct one:
def get_solution(n, b_array): for a in range(1, n+1): a_array = [a] for b in b_array: a = b -a if 0 < a <= n and a not in a_array: # To be performed with a set for O(1) check a_array.append(a) else: break else: # No break return a_array n = 5 b_array = [4, 6, 7, 6] print(get_solution(n, b_array))
This solution is O(n*n): n attempts (or maybe n/2 on average) which iterate over n elements (or maybe n/2 on average).
It would be nice to find an O(n) solution (which is the best expectable solution).
Mathematical observations
We have: bi = ai + ai+1
which means ai+1 = bi - ai
which is the property we've used previously. From there, we can also compute: ai+2 = bi+1 - ai+1 = bi+1 - bi + ai
. It doesn't look very interesting at first but it means that when if you try to increase/decrease ai
from a fixed value, the value ai+2
will vary accordingly (and so will the value of ai+4
, ai+6
, etc).
This can be observed in the options we generated previously:
[1, 3, 3, 4, 2] # Duplicate 3 [2, 2, 4, 3, 3] # Duplicate 2 [3, 1, 5, 2, 4] # Looks good [4, 0, 6, 1, 5] # 6 is out of range [5, -1, 7, 0, 6] # -1, 7, 6, 0 are out of range ^______^_________always +2 from column 1 to 3 ^____________^___always +1 from column 1 to 5 ^_____^______always +1 from column 2 to 4
Note: this property hold when we take every other column but not when we take 2 consecutives values because ai+2 - ai = bi+1 - bi
is a fixed value (which is convenient) while ai+1 - a1 = bi - 2*ai
depends on the value of ai
.
How could this help us ?
Well, in your case, we can see that a3 = a1 + 2 which tells us straightaway that a1 can't be 4 or above. With a bigger list, this would provide us a smaller range to check for a1
(both from the starting limit and the ending limit).
Similarly, we can get stronger constraints on a2
.
My final version of the code
In case it can be useful, here is my implementation. It doesn't use all the excellent ideas from hjpotter92's answers:
# https://codereview.stackexchange.com/questions/250061/efficient-permutation-checking-for-usaco-photoshoot-python # http://www.usaco.org/index.php?page=viewproblem2&cpid=988 import random def generate_b_array(a_array): """Generate 'b' array from the 'a' array.""" return [previous + current for previous, current in zip(a_array, a_array[1:])] def generate_problem(n): """Generate a problem of size n.""" # TODO: It is a bit trickier than this, we may generate solutions which # are not the lexicographical minimum: # For instance, both [4, 3, 2, 1] and [3, 4, 1, 2] lead to b being [7, 5, 3] # But only the second one is an acceptable solution # This issue gets less and less common as n increases. a = list(range(1, n+1)) random.shuffle(a) return a, generate_b_array(a) def get_solution_naive(b_array): """This is the most straight-forward solution: generate and check.""" n = len(b_array) + 1 expected_set = set(range(1, n+1)) for a in range(1, n+1): a_array = [a] for b in b_array: a = b -a a_array.append(a) if set(a_array) == expected_set: return a_array def get_solution_naive_early_stop(b_array): """Smarter (?) solution: trying to stop when generated data is not valid.""" n = len(b_array) + 1 for a in range(1, n+1): a_array = [a] for b in b_array: a = b -a if 0 < a <= n and a not in a_array: a_array.append(a) else: break else: # No break return a_array def get_solution_naive_early_stop_set(b_array): """Smarter (?) solution: trying to stop when generated data is not valid, using the most appropriate data structure.""" n = len(b_array) + 1 for a in range(1, n+1): a_array = [a] a_set = set(a_array) for b in b_array: a = b -a if 0 < a <= n and a not in a_set: a_array.append(a) a_set.add(a) else: break else: # No break return a_array def get_solution(b_array): """Optimised solution: rely on mathematical observation to limit the number of values checked.""" n = len(b_array) + 1 # Use properties on alternate indices to restrict values to look for for a1 b_diff = [current - previous for previous, current in zip(b_array, b_array[1:])] delta_from_a1 = [0] for delta in b_diff[::2]: delta_from_a1.append(delta + delta_from_a1[-1]) min_delta_a1, max_delta_a1 = min(delta_from_a1), max(delta_from_a1) delta_from_a2 = [0] for delta in b_diff[1::2]: delta_from_a2.append(delta + delta_from_a2[-1]) min_delta_a2, max_delta_a2 = min(delta_from_a2), max(delta_from_a2) # We want: 0 < a_n <= N for all n # With a_n = a_1 + delta_n, we want: -delta_n < a_1 <= N - delta_n for all n # In particular: # -min(delta) < a_1 <= N - max(delta) a1_min1 = 1 - min_delta_a1 a1_max1 = n - max_delta_a1 # We want: 0 < a_n <= N for all n # With a_n = a_2 + delta_n and a2 = b1 - a1, we want: # 0 < b_1 - a_1 + delta_n <= N for all n # - delta_n - b_1 < - a_1 <= N - delta_n - b_1 # delta_n + b_1 > a_1 >= delta_n + b_1 - N # In particular : # max(delta) + b-1 - N <= a_1 < min(delta) + b_1 a1_min2 = max_delta_a2 + b_array[0] - n a1_max2 = min_delta_a2 + b_array[0] - 1 a1_min = max(1, a1_min1, a1_min2) a1_max = min(n, a1_max1, a1_max2) # print("min", a1_min, 1, a1_min1, a1_min2) # print("max", a1_max, n, a1_max1, a1_max2) expected_set = set(range(1, n+1)) for a in range(a1_min, a1_max+1): a_array = [a] for b in b_array: a = b -a a_array.append(a) if set(a_array) == expected_set: return a_array def check_solution(a, b): """Test all implementations on a given problem.""" # TODO: Some performance benchmarking could be added here. a1 = get_solution_naive(b) a2 = get_solution_naive_early_stop(b) a3 = get_solution_naive_early_stop_set(b) a4 = get_solution(b) if a1 != a: print("a1", a, a1, b) if a2 != a: print("a2", a, a2, b) if a3 != a: print("a3", a, a3, b) if a4 != a: print("a4", a, a3, b) def perform_tests(): """Perform tests (hardcoded & randomly generated).""" a, b = [3, 1, 5, 2, 4], [4, 6, 7, 6] check_solution(a, b) a, b = [6, 3, 10, 4, 2, 1, 8, 7, 5, 9], [9, 13, 14, 6, 3, 9, 15, 12, 14] check_solution(a, b) a, b = [12, 5, 9, 6, 14, 10, 15, 13, 2, 7, 1, 8, 19, 11, 16, 20, 4, 18, 17, 3], [17, 14, 15, 20, 24, 25, 28, 15, 9, 8, 9, 27, 30, 27, 36, 24, 22, 35, 20] check_solution(a, b) # This case leads to more iterations than most other cases a = [137, 187, 70, 179, 198, 106, 156, 10, 144, 105, 196, 171, 89, 164, 186, 41, 165, 55, 114, 138, 101, 8, 75, 185, 174, 81, 158, 126, 181, 30, 178, 85, 180, 76, 28, 56, 62, 119, 193, 45, 94, 109, 172, 169, 149, 79, 189, 100, 125, 131, 57, 32, 139, 14, 27, 152, 47, 15, 147, 95, 39, 112, 118, 151, 38, 26, 175, 11, 200, 1, 161, 120, 63, 115, 12, 60, 129, 80, 7, 33, 192, 166, 123, 13, 23, 84, 116, 6, 163, 16, 52, 5, 18, 71, 154, 78, 69, 3, 49, 121, 199, 159, 72, 86, 36, 54, 92, 97, 35, 34, 150, 173, 168, 130, 190, 141, 99, 17, 87, 134, 20, 155, 111, 153, 170, 102, 31, 135, 9, 194, 50, 4, 128, 140, 143, 184, 40, 110, 191, 98, 44, 195, 58, 24, 65, 74, 64, 22, 197, 188, 113, 142, 136, 107, 103, 83, 19, 145, 53, 68, 167, 59, 177, 21, 182, 2, 122, 176, 117, 51, 48, 67, 183, 61, 82, 42, 124, 37, 93, 162, 73, 148, 127, 66, 146, 29, 91, 108, 160, 157, 104, 25, 132, 43, 96, 88, 46, 77, 133, 90] b = [324, 257, 249, 377, 304, 262, 166, 154, 249, 301, 367, 260, 253, 350, 227, 206, 220, 169, 252, 239, 109, 83, 260, 359, 255, 239, 284, 307, 211, 208, 263, 265, 256, 104, 84, 118, 181, 312, 238, 139, 203, 281, 341, 318, 228, 268, 289, 225, 256, 188, 89, 171, 153, 41, 179, 199, 62, 162, 242, 134, 151, 230, 269, 189, 64, 201, 186, 211, 201, 162, 281, 183, 178, 127, 72, 189, 209, 87, 40, 225, 358, 289, 136, 36, 107, 200, 122, 169, 179, 68, 57, 23, 89, 225, 232, 147, 72, 52, 170, 320, 358, 231, 158, 122, 90, 146, 189, 132, 69, 184, 323, 341, 298, 320, 331, 240, 116, 104, 221, 154, 175, 266, 264, 323, 272, 133, 166, 144, 203, 244, 54, 132, 268, 283, 327, 224, 150, 301, 289, 142, 239, 253, 82, 89, 139, 138, 86, 219, 385, 301, 255, 278, 243, 210, 186, 102, 164, 198, 121, 235, 226, 236, 198, 203, 184, 124, 298, 293, 168, 99, 115, 250, 244, 143, 124, 166, 161, 130, 255, 235, 221, 275, 193, 212, 175, 120, 199, 268, 317, 261, 129, 157, 175, 139, 184, 134, 123, 210, 223] check_solution(a, b) lengths = [3, 4, 5, 10, 15, 20, 100, 200, 500] lengths = [10, 10, 10, 10, 10, 15, 15, 20, 100, 200, 500] for i in lengths: a, b = generate_problem(i) check_solution(a, b) perform_tests()
The code review itself
Here are a few comments which could help you:
- write functions
- write tests for your functions: this is particularly easy here for 2 reasons: * you are provided a valid test case * new test cases are fairly easy to generate
Writing functions will help you to write easier to maintain code. Also, for the algorithmic challenges, it will help you to focus on the algorithm itself instead of dealing with input/output. Also, it helps to test various implementations, to test them, to compare them (both in correctness and in performances)