I'm solving a linear programming problem made up of many small sub-problems that share some common constraints, which means they all have to be solved together.
Each individual unit has constraints unique to that unit, plus global constraints that are influenced by all units. I've implemented this by laying out the constraint matrix like:
```
+-------+-----------------------+-----------------------+-----+-----------------------+-----------------------+
|       | Constraint_set1       | Constraint_set2       | ... | Constraint_setN       | Global Constraints    |
+-------+-----------------------+-----------------------+-----+-----------------------+-----------------------+
| Unit1 | Component_Constraints | 0                     | ... | 0                     | Component_Constraints |
| Unit2 | 0                     | Component_Constraints | ... | 0                     | Component_Constraints |
| ...   | ...                   | ...                   | ... | 0                     | Component_Constraints |
| UnitN | 0                     | 0                     | 0   | Component_Constraints | Component_Constraints |
+-------+-----------------------+-----------------------+-----+-----------------------+-----------------------+
```
The reasoning for this layout is taken from here: https://webgol.dinfo.unifi.it/OptimizationModels/MultiPeriodModels.html
This serves as the A matrix in Ax <= b.
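For illustration, here is a minimal sketch of assembling that layout from per-unit blocks with scipy.linalg.block_diag and np.hstack; the shapes and values are made up for the demo and are not the real problem data:

```python
# Sketch only: hypothetical shapes, not the actual problem data.
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)
# Hypothetical per-unit constraint blocks (demo shapes only):
local_blocks = [rng.random((2, 2)) for _ in range(3)]
# Columns for the global constraints, touched by every unit:
global_block = rng.random((2 * 3, 2))

# Local constraint sets on the block diagonal, global constraints
# appended as extra columns on the right:
A = np.hstack([block_diag(*local_blocks), global_block])
print(A.shape)  # (6, 8)
```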
The following code adds further percentage limits on each of the components in each unit to the matrix above.
Setup code:
```python
import numpy as np
import pandas as pd
from IPython.display import display

# %%
num_of_comp_id_cols = 2  # Name and Group, but more possible
components_str = """Name,Group,OtherVals
Component0,ComponentGroup0,4
Component2,ComponentGroup1,2"""

limits_string = """Name,Component,Value
Item1,Component1,0.2
Group1,Component2,0.1
Item4,ComponentGroup1,0.2
Group2,ComponentGroup0,0.9"""

# grouped components and grouped units
num_of_unit_id_cols = 2  # Name and Group, but more possible
units_string = """Name,Group,OtherVals
Item1,Group1,3
Item2,Group2,1
Item3,Group1,4
Item4,Group2,1"""

components = pd.DataFrame([x.split(',') for x in components_str.split('\n')[1:]],
                          columns=components_str.split('\n')[0].split(','))
num_components = len(components)

limits = pd.DataFrame([x.split(',') for x in limits_string.split('\n')[1:]],
                      columns=limits_string.split('\n')[0].split(','))
num_limits = len(limits)
limits['Value'] = limits['Value'].astype(np.float32)

units = pd.DataFrame([x.split(',') for x in units_string.split('\n')[1:]],
                     columns=units_string.split('\n')[0].split(','))
num_units = len(units)
```
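As an aside, the same frames could be built with pandas' own CSV parser. Note that read_csv infers dtypes (Value becomes float64) instead of leaving everything as strings, so it is not a drop-in replacement for the explicit casts above:

```python
# Equivalent setup via pandas' CSV parser; dtypes are inferred here
# rather than kept as strings as in the original code.
import io

components = pd.read_csv(io.StringIO(components_str))
limits = pd.read_csv(io.StringIO(limits_string))
units = pd.read_csv(io.StringIO(units_string))
```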
That loads and sets up the data; the main code is:
```python
# This is per limit:
results = []
limits_T = limits.T
for limit in limits_T:  # iterate over each limit in limits
    # Find which units the limit applies to; they could be referenced
    # by name or by group.
    checks = np.tile(limits_T[limit].Name, (num_units, num_of_unit_id_cols))
    impacted_units = units[['Name', 'Group']] == checks
    impacted_units = impacted_units.any(axis=1)
    # impacted_units is now a length-num_units bool of which units are
    # selected for the current limit.

    # Each unit needs to be in a separate column, as in the linear
    # programming problem they are independent in this aspect.
    part_eye = impacted_units.values * np.eye(num_units)
    # Remove the 0 columns:
    part_eye = part_eye[:, ~(part_eye == 0).all(axis=0)]
    # The above two steps do something similar to block_diag, but not quite.

    # Now build the limits for each unit. For percentage limits a linear
    # programming row can be built with [-a, -a, -a, -a+1].dot(X) <= 0 to
    # constrain the last element to be less than a% of the total.
    # (Similarly, for groups, [-a+1, -a, -a, -a+1] would constrain the sum
    # of the first and last to be <= a% of the total.)
    checks2 = np.tile(limits_T[limit].Component,
                      (num_components, num_of_comp_id_cols))
    impacted_components = components[['Name', 'Group']] == checks2
    impacted_components = impacted_components.any(axis=1)
    percentage_vec = np.ones((num_components, 1)) * -limits_T[limit].Value
    percentage_vec = percentage_vec + impacted_components.values.astype(int)[:, None]
    # vec now built; use this limit vector for each of the impacted units:
    single_limit_result = np.kron(part_eye, percentage_vec)
    results.append(single_limit_result)

result = np.concatenate(results, axis=1)  # np.concat is a NumPy >= 2.0 alias
```
```python
# result
array([[-0.2, -0.1, -0. , -0. ,  0. ,  0. ],
       [-0.2,  0.9,  0. ,  0. , -0. , -0. ],
       [-0. , -0. , -0. , -0. ,  0.1,  0. ],
       [-0. ,  0. ,  0. ,  0. , -0.9, -0. ],
       [-0. , -0. , -0.1, -0. ,  0. ,  0. ],
       [-0. ,  0. ,  0.9,  0. , -0. , -0. ],
       [-0. , -0. , -0. , -0.2,  0. ,  0.1],
       [-0. ,  0. ,  0. ,  0.8, -0. , -0.9]])
```
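To make the percentage-row logic concrete, here is a quick standalone check with toy numbers (separate from the problem data above):

```python
# Standalone check of the [-a, -a, -a, -a+1] percentage-row construction.
import numpy as np

a = 0.2
row = np.full(4, -a)
row[-1] += 1                        # [-a, -a, -a, -a + 1]

x_ok = np.array([3., 3., 3., 1.])   # last element is 10% of the total
x_bad = np.array([1., 1., 1., 3.])  # last element is 50% of the total
print(row.dot(x_ok))                # -1.0 -> <= 0, constraint satisfied
print(row.dot(x_bad))               #  1.8 ->  > 0, constraint violated
```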
This works and gives the expected results, but looping over the limits is not elegant. Any feedback on the code in general would be great, as would any comments on whether there is a more elegant (and hopefully quicker at scale) way to build the same matrix.
For a more elegant approach, I currently have the following, which calculates the impacted units for all limits at once:
```python
(units[['Name', 'Group'] * num_limits] == np.tile(
    np.repeat(limits['Name'], num_of_unit_id_cols), (num_units, 1))
).astype(int)
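As a sanity check (a rough sketch of mine, not part of the original code), collapsing the Name/Group column pairs of that comparison reproduces the loop's impacted_units for all limits at once:

```python
# Sanity-check sketch: reduce each (Name, Group) pair with any(), giving a
# num_units x num_limits boolean matrix whose column i should match the
# loop's impacted_units for limit i.
mask = (units[['Name', 'Group'] * num_limits].values ==
        np.tile(np.repeat(limits['Name'].values, num_of_unit_id_cols),
                (num_units, 1)))
impacted_all = mask.reshape(num_units, num_limits,
                            num_of_unit_id_cols).any(axis=2)
```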
But I don't know of a good way to do something akin to block_diag from here and get each component constraint into its own column. In the for loop this is done by multiplying by the identity matrix I, but I don't see how that would work here.