I have been working on this question, https://open.kattis.com/problems/sequences.
You are given a sequence, in the form of a string with characters ‘0’, ‘1’, and ‘?’ only. Suppose there are n ‘?’s. Then there are 2ⁿ ways to replace each ‘?’ by a ‘0’ or a ‘1’, giving 2ⁿ different 0-1 sequences (0-1 sequences are sequences with only zeroes and ones).
For each 0-1 sequence, define its number of inversions as the minimum number of adjacent swaps required to sort the sequence in non-decreasing order. In this problem, the sequence is sorted in non-decreasing order precisely when all the zeroes occur before all the ones. For example, the sequence 11010 has 5 inversions. We can sort it by the following moves: 11010 → 11001 → 10101 → 01101 → 01011 → 00111.
Find the sum of the number of inversions of the 2ⁿ sequences, modulo 1000000007 (10⁹+7).
Input
The first and only line of input contains the input string, consisting of characters ‘0’, ‘1’, and ‘?’ only, and the input string has between 1 to 500000 characters, inclusive.Output
Output an integer indicating the aforementioned number of inversions modulo 1000000007
In summary, I am given a string of 1s, 0s, and ?s. All question marks are basically variables that can become 1s and 0s. In each case, I have to sum up the minimum number of inversions (a swap of two adjacent characters), needed to sort the string into non-decreasing order, i.e, all 0s before 1s. For example, 0?1? has four cases: 0010, 0011, 0110, and 0111, and need 1, 0, 2, and 0 inversions respectively, summing up to 3. I am able to get the answer right however my code is too slow:
string = input() zero = 0 # is the number of 0's total = 0 # total number of inversions questions = 0 # number of questions marks count = 0 # sum of indexes of questions marks for i in range(len(string)): if string[i] == '?': count = count + i questions = questions + 1 elif string[i] == '0': zero = zero + 1 total = total + i if questions != 0: total = total * 2 ** questions + count * 2 ** (questions - 1) triangular = 0 for i in range (zero): triangular = triangular + i Choosy = 1 for i in range(questions + 1): total = total - triangular * Choosy triangular = triangular + zero + i Choosy = Choosy * (questions - i) // (i + 1) print(total)
Basically, the idea of my code is that for each case, the number of inversions needed is simply the sum of all indexes of 0 takeaway the (number of 0s -1)th triangular number. Therefore, all I need to do is add up all of the indexes of 0s, times by the number of cases, and add all indexes of questions marks, times the number of cases/2, since each question mark is a 0 exactly half the time. I then calculate the number of cases that within the question marks, there are 0, 1, 2 ..., (number of question mark) 0s using combinations, times by the (number of 0s -1)th triangular number, and minus this off the total. My code works kind of quickly but as I start putting around 20,000 question marks as my input, it starts to take more than a second. However, I need it to take 500,000 question marks as input and return a value within a second. Any ideas on how to make it faster?