Lagrangian Multipliers

Lagrangian Multipliers are a powerful tool for optimization. Using the SymPy package, I've provided a few examples of how we can perform interactive optimizations with Julia.

Note that I say interactive because the code example below does not offer a dynamic way to plug in some initial values and yield some result, but rather is an example of using the Julia language to do real-time computations.

In the first example, we want to find the minimum and maximum values of the function, \(f(x, y) = x^2 + x + 2y^2\) on the unit circle, \(x^2 + y^2 = 1\).

With Lagrangian multipliers, we are evaluating the equation,

\[ \nabla f(x_1, x_2, \dots, x_n) = \lambda \nabla g(x_1, x_2, \dots, x_n)\,. \]

This will result in a system of equations where each equation has the form of \(f_k = \lambda g_k\).

Since we will be optimizing \(f(x, y)\) over the unit circle, \(g(x, y)\) will be defined as \(g(x, y) = x^2 + y^2\).

We do some initial setup, including the SymPy package, establishing some symbolic variables, and defining our functions and region.

using SymPy

@vars x y λ

f(x, y) = x^2 + x + 2y^2
g(x, y) = x^2 + y^2

unit_circle = Eq(x*x + y*y, 1)

The next step in setting up the problem will be to write out our system of equations. SymPy offers us the Eq object for defining equations. Since we're working with the derivatives, we use the diff function which takes a function and a variable that we will be differentiating with respect to.

Once we've setup the problem by preparing an array of equations, we can use the solve function to yield possible results.

eqns = [
    Eq(diff(f(x, y), x), λ * diff(g(x, y), x)),
    Eq(diff(f(x, y), y), λ * diff(g(x, y), y))]

sol = solve(eqns, [x, y, λ])

x1 = sol[1][1]
y1 = sol[2][2]

(sy1, sy2) = solve(unit_circle(x => x1), y);
(sx1, sx2) = solve(unit_circle(y => y1), x);

Above, we've used solve to find the solutions and then pulled out individual cases for \(x\) and \(y\). We subsequently plugged those values into our unit_circle region, substituting \(x\) or \(y\) and solving for the other. Each equation is second order, therefore each yields two solutions.

Using the solutions we've allocated, we can determine the critical points which we put into an array of tuples.

critical_points = [
    (x1, sy1), 
    (x1, sy2),
    (sx1, y1),
    (sx2, y1)
]

fx_cp = [ Float64(f(cp...)) for cp in critical_points ]

(minimum(fx_cp), maximum(fx_cp))

Leveraging a list comprehension, we plug in each tuple into our original \(f(x, y)\) function and transform the result from symbolic into a datatype of Float64.

Finally, we use the Julia functions minimum and maximum which take an array as input and result the minimum or maximum values of the array.

Putting everything together, here is the code we used to perform the optimization.

using SymPy

@vars x y λ

f(x, y) = x^2 + x + 2y^2
g(x, y) = x^2 + y^2

unit_circle = Eq(x*x + y*y, 1)

eqns = [
    Eq(diff(f(x, y), x), λ * diff(g(x, y), x)),
    Eq(diff(f(x, y), y), λ * diff(g(x, y), y))]
sol = solve(eqns, [x, y, λ])
x1 = sol[1][1]
y1 = sol[2][2]

(sy1, sy2) = solve(unit_circle(x => x1), y);
(sx1, sx2) = solve(unit_circle(y => y1), x);

critical_points = [
    (x1, sy1), 
    (x1, sy2),
    (sx1, y1),
    (sx2, y1)
]

fx_cp = [ Float64(f(cp...)) for cp in critical_points ]

(minimum(fx_cp), maximum(fx_cp))

A couple things worth mentioning! Whenever possible, use the functions made available by the Julia base or standard libraries. While it may seem simple enough to roll your own in this or that case, you will find that nine out of ten times, the provided Julia functions are significantly faster. You should only resort to custom-built tools when absolutely necessary – meaning that what you need doesn't currently exist.

The second thing is that we wouldn't just use the above code to solve any problem. If you're following along with the REPL, check out the results that are provided by each individual function call and try to understand why we are using those values in subsequent steps. Especially with optimization problems, it's just not possible to blindly apply some programmatic solution to every problem. For example, if we were looking for the maximum volume of a box given some constraint, we may find ourselves with a value of \(x = 0\) or \(y = 0\). Clearly we can toss out that result since it just doesn't make any sense. You can't have a box where one of the sides has a length of \(0\) – even a sheet of paper has some breadth, regardless of how thin it happens to be.

In closing, hopefully this short post shows how easy it is to perform computations and achieve results quickly with Julia. As you read through other posts available on this blog, the range of possible problem sets that can be addressed with Julia will become very apparent.

Here are two more examples without robust commentary.

Optimize \(f(x, y) = y^2 - x^2\) over the region defined by \(x^2 + 4y^2 \leq 4\). One comment is that since I couldn't programmatically apply the inequality of the region, I still used Eq and simply kept in mind what I was working with and that the points must be less than or equal to \(4\).

@vars x y λ

f(x, y) = y^2 - x^2
g(x, y) = x^2 + 4y^2

region_boundary = Eq(g(x, y), 4)

eqns = [
    Eq(diff(f(x, y), x), λ * diff(g(x, y), x)),
    Eq(diff(f(x, y), y), λ * diff(g(x, y), y))]

sol = solve(eqns, [ x, y, λ ])

# filter out x = 0 and y = 0 solution
sol = filter(x -> !(x[1] == 0 && x[2] == 0), sol)

x1 = sol[1][1]
y1 = sol[2][2]

(sy1, sy2) = solve(region_boundary(x => x1), y)
(sx1, sx2) = solve(region_boundary(y => y1), x)

critical_points = [
    (x1, sy1), (x1, sy2),
    (sx1, y1), (sx2, y1)
]

fx_cp = [ Float64(f(cp...)) for cp in critical_points ]

(minimum(fx_cp), maximum(fx_cp))

Find the maximum area of a rectangle with sides of length \(x\) and \(y\) where the perimeter is equal to \(14\) units.

@vars x y λ

f(x, y) = x*y
g(x, y) = 2x + 2y

region = Eq(g(x, y), 14)

eqns = [
    Eq(diff(f(x, y), x), λ * diff(g(x, y), x)),
    Eq(diff(f(x, y), y), λ * diff(g(x, y), y))]
sol = solve(eqns, [x, y, λ])

(λ1, ) = solve(region(x => sol[x], y => sol[y]), λ)

(y1, ) = solve(eqns[1](λ => λ1), y)
(x1, ) = solve(eqns[2](λ => λ1), x)

# there is no minimum
(nothing, f(x1, y1))