Original Source Here
Building an iterative solver for linear optimization under constraints using geometry
A powerful tool for the data scientist
Optimization is a common tool in data science, and one of the most frequently used approach is the linear one.
Even though not every optimization problem can be formulated in a linear way, in many cases, it’s possible to rewrite it using a linear formulation.
In this post, we are going to write from scratch a solver for linear optimization under constraints, using an iterative method.
There are three main things to consider when formalizing an optimization under constraints:
- A set of variables: those are the quantities that we want to determine during the optimization.
- An objective: it’s a formula that combines the variables and expresses some value that we want to either maximize or minimize. As we have restricted ourselves to the case of linear optimization, this objective must be a linear combination of the variables.
- A set of constraints: These constraints will restrict the possible values for the variables. Once again, these constraints must be expressed using linear formulas.
As an example, the variables can be the height
W, and the length
L of a rectangular cuboid. The objective can be the sum of the dimension of this solid :
W + H + L , and we can add some constraints on the variables, like
W < 3 and
L < 4.
Mathematically speaking, this can be written as:
Solving this problem is pretty easy. It can be done manually. The solution consists in maximizing
L, and deduce
H, i.e. :
W = 3,
L = 4 and
H = 4, hence the sum of the cuboid dimensions is
S =3 + 4 + 4 = 11.
Visualizing the problem
Before going into the mathematical details, let’s try to graphically visualize our problem so that we get some insight.
As we are facing a 3-dimensional problem, i.e. we have three variables :
W, L and H , we can use 3D geometry.
The first two constraints are easy to visualize. The first one, restrict the solution space to the 3D space where all 3D points are below the plane defined by
W = 3. The second one restricts the solution space to all the 3D points that are below the plane defined by
L = 4.
The third one is a bit more complex to handle, but if we rewrite it using the following mathematical formula:
<(W, L, H), (1, 1, 0)> <= 8, where the operator
<., .> is the dot product, it appears that this constraint projects the dimension of the cuboid onto the vector
(1, 1,0), and that the sum of the component of this projection must be below 8.
Another way to interpret this projection with a minimal distance is to consider this vector
(1, 1, 0) as the (unnormalized) normal vector to a plan. Hence, once again, this third constraint can be interpreted geometrically as a plan.
The figure below illustrates this point, by drawing the plans as a circle with a normal vector.
In the figure above,
A = (3, 0, O) and the normal vector
u = (1, 0, 0)materializes the first constraint
W <= 3;
B = (0, 4, 0) and
v = (0, 1, 0) materialize the second constraint, whereas
C = (0, 4, 4).
Solving the problem
An iterative approach
Let’s first imagine that we are trying to optimize the objective
W + H + L without constraint. Even though this objective is linear with respect to
W, H and L, we can nonetheless use the steepest descent method.
In a linear case, without constraint, using this kind of method is not relevant, all the more that there is no solution as the system is unbounded. However, we will see in the next section how to integrate constraints in this iterative approach.
This steepest descent method is simple and purely geometric: the idea is to compute the gradient of a function, and move slightly in the direction of the gradient. Here the gradient is constant and is the vector
(1, 1, 1). The code below illustrates this method:
In this code,
weights contains the gradient, as in a linear formula, the gradient is simply the weights of the linear combination.
x_0 is the initial value, and
delta defines the size of the step.
Note that the stopping criterium is based on convergence. In this case, without constraint, the system will never converge. Objective will infinitely increase.
Detect unsatisfied constraints
At this step, what we need to be able to do is determine if a point is below a plane. A plane can be defined with a point
A and a normal vector
n. By convention, we’ll say that a point is above a plane if it’s located in the half-space pointing in the direction of the normal vector. On the opposite, a point is considered under a plane if it’s in the other half-space.
The figure below illustrates that with the plane defined by
W = 3:
Given these two pieces of information, it’s possible to know whether a point is below or above a plane with calculations involving the dot product.
Let’s consider the two points
Ab in the figure below :
Remember that when doing a dot product between a vector and a normalized vector, the resulting scalar will be positive if both vectors point in the same direction, whereas it will be negative if they point in opposite direction.
Hence, in order to know whether
Ab is below the plane or not, it is only necessary to perform a dot product between the normal vector
n and the vector
AAb that connect
As can be seen in the figure above, these two vectors point in the opposite direction, hence
Ab is below the plane as defined here.
On the opposite
Aa is above the plane.
The snipped below explains how to do that in python using NumPy:
With this formal definition of linear constraints as a plane, defined by a point and a normal vector, we have the necessary tool to detect unsatisfied constraints.
What we need now is a way to enforce constraints. Geometrically speaking, this means that we need a way to move the point considered onto the plane. I.e. we need to project the point on the plane.
Once again, the dot product can help us. As mentioned above, the sign of the dot product between the normal vector
nand the vector connecting
A and the point of interest
P inform us of the position of
P relatively to the plane.
But more importantly, if the normal vector
n has been normalized, this dot product gives us the distance between the plane and the point
P. Knowing that, projecting
P onto the plane simply boils down to move
P in the opposite direction of the vector
n by the quantity given by the dot product.
A picture is worth a thousand words, so let’s have a look at the figure below:
P' has been obtained by projection
P on the normal vector. Projecting
P to get its projection
Proj is the simply done by translating P using the vector
Written in python, this gives:
Please note this function
project is really a projector mathematically speaking, as
project(project(P)) is equal to
project(P). This is the mathematical definition of a projector.
Putting it all together
We now have three tools at our disposal:
- An iterative method to converge to an optimum
- A function to detect that a constraint is well respected
- A projector to force the respect of the constraint if necessary.
The algorithm to maximize our objective under constraints is pretty simple:
- We use the steepest descent to move toward the optimum, by following the gradient
- We ensure that constraints are respected
- If not, enforce the constraint
- Iterate until we have converged, i.e. until the point stops moving.
In python, this gives:
To ease the description of problems, we have introduced a
Constraint object, that defines a constraint with a vector, a point, and a gap. The gap indicates the distance of the constraint with respect to the plane.
We also introduced a function,
normalize_constraint, to ensure that the normal vector is unitary, i.e. we ensure that its norm is 1.0.
As you can see, this basic solver converges toward the optimal objective and propose the expected solution.
Ok, this works in 3D, but what about higher dimensions?
Up to now, we have been using 3D geometry, to ease understanding. However, everything we have done above can be generalized immediately to any n dimensional spaces, where
n ≥ 3.
This can be achieved by seeing that constraints can be seen as hyper plans of the
n-dimensional space, i.e. as
n-1 dimensional objects.
If for instance, we work with an hypercube of dimension 4, instead of our 3D cuboid, we get the following code, where only the problem definition has changed:
We have added another dimension:
Z , and put a simple constraint on it, i.e.
Z <= 5. Hence the maximal objective is now
3 + 4 + 4 + 5 = 16 , which the solver does find.
We have presented in this article one geometrically and easily understandable method for solving a linear problem under constraint.
If you look at the literature on the subject, you’ll see that there are many other ways to solve this kind of problem. In particular, there is another class of methods, based on the simplex method that works completely differently.
This kind of method has some advantages, one of them being the accuracy, as on the opposite to the method presented here, it does not converge to the solution but finds it.
It’s also a class of methods that is less subject to ill-conditioned problems.
However, iterative methods such as the one presented here have the advantages of being more efficient and allowing to control a trade-off between accuracy and time consumption.
Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot