Simplex method (linear programming) implementation

Question

The up-to-date code, along some documentation, can be found here.

We've implemented a version of the Simplex method for solving linear programming problems. The concerns I have are with the design we adopted, and what would be some refactorings that would improve it overall.

We defined two important global functions, simplex and simplex_core. The former is a wrapper that does a bunch of error checking and then solves phase I and phase II of the simplex method by calling simplex_core. The latter is the actual bare-bones algorithm; it takes the problem data alongside an initial basic feasible solution and iterates until it fins an optimal solution or identifies the problem as unlimited.

""" ~Mathematical Programming~ Simplex implementation. """ import numpy as np from numpy.linalg import inv # Matrix inverse from numpy.matlib import matrix # Matrix data type np.set_printoptions(precision=3, threshold=10, edgeitems=4, linewidth=120) # Prettier array printing epsilon = 10**(-10) # Global truncation threshold def simplex(A: matrix, b: np.array, c: np.array, rule: int = 0) -> (int, np.array, float, np.array): """ Outer "wrapper" for executing the simplex method: phase I and phase II. :param A: constraint matrix :param b: independent terms in constraints :param c: costs vector :param rule: variable selection rule (e.g. Bland's) This function prints the outcome of each step to stdout. """ m, n = A.shape[0], A.shape[1] # no. of rows, columns of A, respectively """Error-checking""" if n < m: raise ValueError("Incompatible dimensions " "(no. of variables : {} > {} : no.of constraints".format(n, m)) if b.shape != (m,): raise ValueError("Incompatible dimensions: c_j has shape {}, expected {}.".format(b.shape, (m,))) if c.shape != (n,): raise ValueError("Incompatible dimensions: c has shape {}, expected {}.".format(c.shape, (n,))) "Check full rank matrix" if not np.linalg.matrix_rank(A) == m: # Remove ld rows: A = A[[i for i in range(m) if not np.array_equal(np.linalg.qr(A)[1][i, :], np.zeros(n))], :] m = A.shape[0] # Update no. of rows """Phase I setup""" A[[i for i in range(m) if b[i] < 0]] *= -1 # Change sign of constraints b = np.abs(b) # Idem A_I = matrix(np.concatenate((A, np.identity(m)), axis=1)) # Phase I constraint matrix x_I = np.concatenate((np.zeros(n), b)) # Phase I variable vector c_I = np.concatenate((np.zeros(n), np.ones(m))) # Phase I c_j vector basic_I = set(range(n, n + m)) # Phase I basic variable set """Phase I execution""" print("Executing phase I...") ext_I, x_init, basic_init, z_I, _, it_I = simplex_core(A_I, c_I, x_I, basic_I, rule) # ^ Exit code, initial BFS, basis, z_I, d (not needed) and no. of iterations print("Phase I terminated.") assert ext_I == 0 # assert that phase I has an optimal solution (and is not unlimited) if z_I > 0: print("\n") print_boxed("Unfeasible problem (z_I = {:.6g} > 0).".format(z_I)) print("{} iterations in phase I.".format(it_I), end='\n\n') return 2, None, None, None if any(j not in range(n) for j in basic_init): # If some artificial variable is in the basis for the initial BFS, exit: raise NotImplementedError("Artificial variables in basis") x_init = x_init[:n] # Get initial BFS for original problem (without artificial vars.) print("Found initial BFS at x = \n{}.\n".format(x_init)) """Phase II execution""" print("Executing phase II...") ext, x, basic, z, d, it_II = simplex_core(A, c, x_init, basic_init, rule) print("Phase II terminated.\n") if ext == 0: print_boxed("Found optimal solution at x =\n{}.\n\n".format(x) + "Basic indexes: {}\n".format(basic) + "Nonbasic indexes: {}\n\n".format(set(range(n)) - basic) + "Optimal cost: {}.".format(z)) elif ext == 1: print_boxed("Unlimited problem. Found feasible ray d =\n{}\nfrom x =\n{}.".format(d, x)) print("{} iterations in phase I, {} iterations in phase II ({} total).".format(it_I, it_II, it_I + it_II), end='\n\n') return ext, x, z, d def simplex_core(A: matrix, c: np.array, x: np.array, basic: set, rule: int = 0) \ -> (int, np.array, set, float, np.array): """ This function executes the simplex algorithm iteratively until it terminates. It is the core function of this project. :param A: constraint matrix :param c: costs vector :param x: initial BFS :param basic: initial basic index set :param rule: variable selection rule (e.g. Bland's) :return: a tuple consisting of the exit code, the value of x, basic index set, optimal cost (if optimum has been found), and BFD corresponding to feasible ray (if unlimited problem) """ m, n = A.shape[0], A.shape[1] # no. of rows, columns of A, respectively assert c.shape == (n,) and x.shape == (n,) # Make sure dimensions match assert isinstance(basic, set) and len(basic) == m and \ all(i in range(n) for i in basic) # Make sure that basic is a valid base B, N = list(basic), set(range(n)) - basic # Basic /nonbasic index lists del basic # Let's work in hygienic conditions B_inv = inv(A[:, B]) # Calculate inverse of basic matrix (`A[:, B]`) z = np.dot(c, x) # Value of obj. function it = 1 # Iteration number while it <= 500: # Ensure procedure terminates (for the min reduced cost rule) r_q, q, p, theta, d = None, None, None, None, None # Some cleanup print("\tIteration no. {}:".format(it), end='') """Optimality test""" prices = c[B] * B_inv # Store product for efficiency if rule == 0: # Bland rule optimum = True for q in N: # Read in lexicographical index order r_q = np.asscalar(c[q] - prices * A[:, q]) if r_q < 0: optimum = False break # The loop is exited with the first negative r.c. elif rule == 1: # Minimal reduced cost rule r_q, q = min([(np.asscalar(c[q] - prices * A[:, q]), q) for q in N], key=(lambda tup: tup[0])) optimum = (r_q >= 0) else: raise ValueError("Invalid pivoting rule") if optimum: print("\tfound optimum") return 0, x, set(B), z, None, it # Found optimal solution """Feasible basic direction""" d = np.zeros(n) for i in range(m): d[B[i]] = trunc(np.asscalar(-B_inv[i, :] * A[:, q])) d[q] = 1 """Maximum step length""" # List of tuples of "candidate" theta an corresponding index in basic list: neg = [(-x[B[i]] / d[B[i]], i) for i in range(m) if d[B[i]] < 0] if len(neg) == 0: print("\tidentified unlimited problem") return 1, x, set(B), None, d, it # Flag problem as unlimited and return ray # Get theta and index (in basis) of exiting basic variable: theta, p = min(neg, key=(lambda tup: tup[0])) """Variable updates""" x = np.array([trunc(var) for var in (x + theta * d)]) # Update all variables assert x[B[p]] == 0 z = trunc(z + theta * r_q) # Update obj. function value # Update inverse: for i in set(range(m)) - {p}: B_inv[i, :] -= d[B[i]]/d[B[p]] * B_inv[p, :] B_inv[p, :] /= -d[B[p]] N = N - {q} | {B[p]} # Update nonbasic index set B[p] = q # Update basic index list """Print status update""" print( "\tq = {:>2} \trq = {:>9.2f} \tB[p] = {:>2d} \ttheta* = {:>5.4f} \tz = {:<9.2f}" .format(q + 1, r_q, B[p] + 1, theta, z) ) it += 1 # If loop goes over max iterations (500): raise TimeoutError("Iterations maxed out (probably due to an endless loop)") def print_boxed(msg: str) -> None: """ Utility for printing pretty boxes. :param msg: message to be printed """ lines = msg.splitlines() max_len = max(len(line) for line in lines) if max_len > 100: raise ValueError("Overfull box") print('-' * (max_len + 4)) for line in lines: print('| ' + line + ' ' * (max_len - len(line)) + ' |') print('-' * (max_len + 4)) def trunc(x: float) -> float: """ Returns 0 if x is smaller (in absolute value) than a certain global constant. """ return x if abs(x) >= epsilon else 0

My main concern is that the simplex_core function is a pretty large chunk of code, and most of it is just a big loop. So, would it be wiser to break it up in smaller methods? If so, should they be local or global? I don't see it as an obvious thing to do because each "part" is executed only once, so what would be the advantage of defining a local function?

On the other hand, should simplex_core be a local function of simplex? The main reason I made it global is because the name of the parameters would overshadow the parameters of simplex, and I need those parameters (instead of just using nonlocal variables) due to the distinctness of phase I and phase II.

Finally, my other concern is performance. Since I am a relative beginner, it is hard for me to spot any subtle (or not so subtle) bottlenecks in the code that could be avoided.

Usage example

Consider the following linear programming problem (in standard form):

$$ \begin{cases}\begin{aligned}\min &&&& -x_1 &&-x_2\\ \text{s.t.} &&&& 3x_1 &&+2x_2 &&+x_3 && && = 4\\ &&&& && x_2 && &&+x_4 && = 3\\ \\ &&&& x_1,&&x_2,&&x_3,&&x_4 &&\ge 0\\ \end{aligned}\end{cases} $$

To solve this problem from the Python console, I would write

>>> import numpy as np >>> from simplex import simplex >>> A = np.matrix([[3, 2, 1, 0], [0, 1, 0, 1]]) >>> b = np.array([4, 3]) >>> c = np.array([-1, -1, 0, 0]) >>> simplex(A, b, c)

The output would be:

Executing phase I... Iteration no. 1: q = 1 rq = -3.00 B[p] = 1 theta* = 1.3333 z = 3.00 Iteration no. 2: q = 2 rq = -1.00 B[p] = 2 theta* = 2.0000 z = 1.00 Iteration no. 3: q = 4 rq = -1.00 B[p] = 4 theta* = 1.0000 z = 0.00 Iteration no. 4: found optimum Phase I terminated. Found initial BFS at x = [0. 2. 0. 1.]. Executing phase II... Iteration no. 1: found optimum Phase II terminated. --------------------------------- | Found optimal solution at x = | | [0. 2. 0. 1.]. | | | | Basic indices: {1, 3} | | Nonbasic indices: {0, 2} | | | | Optimal cost: -2.0. | --------------------------------- 4 iterations in phase I, 1 iterations in phase II (5 total). (0, array([0., 2., 0., 1.]), -2.0, None)

To assist reviewers, could you include an example of how to call your code for a simple scenario with, say, three variables and a couple of constraints? — 200_success, CommentedNov 14, 2018 at 23:35

l0b0 · Accepted Answer · 2018-11-14 22:46:38Z

Some suggestions:

Basically, wherever you have a comment you should consider renaming or encapsulating to avoid the need for that comment. For example:
- Rename A to something like constraint_matrix or constraints.
- Rename m and n to something like row_count and column_count.
- Encapsulate assert ext_I == 0 in a method like assert_phase_1_limited_optimal_solution_exists.
Remove any unused parameter defaults such as simplex_core's rule.
rule seems to be an enumerated set of "magical" values. A minimal improvement on that would be to use constants (or actual enums) declared globally for those values. Then both the implementation and users can use the constants without ever wondering about their values. Even better would be to point those constants to the functions relevant for each rule, so that you can do something like rule_handler(…).
Rather than returning a tuple I would return an object corresponding to a calculation result.

Thank you for your suggestions! If we were to use enums, should we declare an enum like class Rule(Enum) and "require" the rule parameter to be a member of that enum? — Anakhand, CommentedNov 15, 2018 at 11:53
Currently you basically assert rule in [0, 1]; with an enum I would assert rule in [SOME_ENUM, OTHER_ENUM], notassert isinstance(rule, MyEnumClass). Basically, the latter isn't duck typey, and is a less strict check. — l0b0, CommentedNov 16, 2018 at 6:52

Stack Exchange Network

Simplex method (linear programming) implementation

Usage example

1 Answer 1

Hot Network Questions

Simplex method (linear programming) implementation

Usage example

1 Answer 1

Related

Hot Network Questions