Chapter 19: Several Variable Optimization - CD

Previous chapter Next chapter

Chapter Summary
This chapter applies the differential approximation to max-min problems in several variables. It appears only on the CD.

The differential approximation in two variables is

where is small when and are small. Replacing the nonlinear change

by its linear differential

is equivalent to "looking in a powerful microscope" and working on the scale of the microscope. That is, when we hold fixed, the partials are constant and has the form of a linear function in the local variables dz=mdx+ndy. At magnification , the error term appears the size of and when this is smaller than a screen pixel, both graphs look like the same linear tangent graph.

Section Summary
A max or a min cannot occur at a sloping part of a graph unless you are stuck at the edge. This section rephrases this obvious remark and uses that to give a mathematical condition of where we might find a max or a min.

Suppose we are wondering whether a function f[x,y] has a maximum or minimum at a point (x,y). We have a way to "see" what the graph looks like in a powerful microscope. When we look in the microscope, the linear graph typically looks like a sloping plane. If we can move in any direction on a sloping plane, at least a little, then we can increase or decrease the function. In other words, we cannot be at either a maximum or a minimum. This is the proof of the Interior Critical Point Theorem, stated below. It is an "interior" result because if we are at a boundary where we cannot move in some directions, then we might not be able to make the quantity larger or smaller, depending on which way we cannot move. "Looking" in a microscope amounts to approximating the change in the nonlinear function corresponding to a perturbation vector by its differential

The differential is linear in the perturbation variables when (x,y) is fixed. The "slope" of the plane dz=mdx+ndy is and it tilts upward at this rate in the direction of and downward at this rate in the direction . (See Theorems CD-17.1.1 and CD-18.)

What have we learned? If the plane slopes and we can move, we are NOT at a max or a min. What else is there? The plane might not slope. This happens if the gradient is zero, so the slope is zero and the tangent plane is horizontal.

This is a negative result because it only rules out interior points where from being maxima or minima. It does not say where maxima or minima are located or even whether there are any.

Theorem CD-19.1 Interior Critical Points in 2 - D If z=f[x,y] is defined on a plane set of (x,y)-points D and f[x0,y0] is an extremum of f for D where there is at least a tiny rectangle about (x0,y0) inside D, , then

(We use the logical contrapositive, , in our statement of the result. We have shown that if () and we can move in any direction, then we are not at a max or a min (). This is logically equivalent to the implication , which is the way we state the theorem. See Section CD-11.2.)

Here are some examples and their graphs , z=3x-x3-3xy2, z=6xy2-2x3-3x4. Find the points where the tangent plane is horizontal, and notice that the horizontal tangents need not be maxima or minima.

19.1.1 Critical Points on a Contour Plot

The critical point result can also be understood in terms of contour graphs. First, let us look at contour plots of the four examples above to see how the places with horizontal tangents appear on the contour plots.

The contour graph of dz=mdx+ndy consists of the family of lines in (dx,dy)-space obtained by setting dz equal to various constants. That linear contour graph is a local approximation to the nonlinear contour graph of . We saw this in the Microscope program, but the degenerate case corresponding to a horizontal explicit tangent plane is a little harder to understand in contours.

If k, m, and n are constants, when does the equation

define a line in (dx,dy)-space? You might say, "It is always a line perpendicular to the vector and a distance from the origin." This is true unless m=n=0. Vector is the only "bad" case. [Note that the cases where one of m=0 or n=0 correspond to a vertical line dx=k/m () or a horizontal line dy=k/n ()].

When and both are zero, the tangent plane to the nonlinear explicit graph is a horizontal plane, the explicit graph of dz=mdx+ndy, namely dz=0. A horizontal plane has only one contour level. If we were to plot its contour map, it would either be blank or all black, depending on whether or not we plotted the contour level dz=0. We have dz=0dx+0dy, so the level dz=0 is all black and other levels are blank. This is highly degenerate but very interesting to our local analysis. It is a degenerate point where the function does not change "to first order."

If the vector is given by

By Theorem CD-18.5, we know that f increases fastest for small perturbations in the direction . We know f decreases fastest in the opposite direction, . Therefore, if we are allowed to change in these directions (by at least a small amount), the point where cannot be a max or a min. This is the gradient vector proof of the Critical Point Theorem.

Exercise set CD-19.1

Use the computer to plot the surface of z=4xy(x2-y2) for . You could even use the program FlyBySurfaces. Use the computer to check your computation of and as well as to solve the simultaneous equations

and see that the only flat point on the surface is at (0,0).

Where are the highest and lowest points on the dog saddle - just visually?

There are four depressions on this "saddle" for a dog's legs.

Section Summary
The next result is a "positive" result; it says a max and a min exist.

The Critical Point Theorem tells us where not to look, but the Extreme Value Theorem says that the thing we are looking for is there. Hence, we look at the places that are left, critical points, and noninterior points.

Theorem CD-19.2 Extreme Values in 2 - D Let be a smooth function defined on a "compact" domain D in the (x,y)-plane. Then there are vectors and so that for all in D

The function f attains its minimum at and attains its maximum at .

The big question is, "What does `compact' mean for a region in the plane?" In one variable, the analogous hypothesis was that we were seeking a max and a min over an interval that was bounded (or did not extend to infinity in either direction) and that contained its endpoints. In other words for two real numbers a and b.

Definition: Compact Planar Region Let D be a set of points in the plane, a "region." We need two conditions in order to have a compact region in the plane. First, we need a bound on the total size of the vectors in the region. Second, we need to "include the boundary" of the region. Technically, this means that if is a convergent sequence of vectors in the region D, then the limit point lies in D.

In practice, the problems we consider ask for maxima and a minima over sets like triangles, circular sectors, quadrants. These regions are defined by inequalities and their non-interior, or boundary, points are defined by equations.

Example CD-19.1 A Triangular Region

The inside of a triangle together with its edges is compact because sequences inside tend to points inside or on the edge and because the triangle is "finite in extent." In other words, we can find an upper limit on the size of all vectors that lie inside the triangle. See Figures CD-19.2 and CD-19.4. They are defined by inequalities such as , , .

Example CD-19.2 A Circular Region

The region in Figure CD-19.6 is compact. It is defined as the set of pairs (w,h) such that , and .

Example CD-19.3 A Noncompact Unbounded Region

A quadrant given by the inequalities and is not compact, even with its boundary edges because we can find arbitrarily large vectors in the quadrant - it extends to infinity in some directions. The function is smooth, but has no maximum on any region with no bound on the size of its points.

Example CD-19.4 A Noncompact Region with a Hole

The plane less a single point is not compact for two reasons. First, the region extends to infinity as you move farther and farther from zero. Second, a sequence of vectors tending toward the excluded point converges but not to a point in the region. Notice that the function is smooth but has no maximum on the plane less the origin.

Exercise set CD-19.2

1. Sketch the set of points

Explain why it is compact. What is an upper bound on the size of points in D? Why does a convergent sequence of points from D tend to a point already included in D? Sketch the set of points

Explain why it is NOT compact by finding one sequence of points in the region whose lengths tend to infinity and another sequence of points in the region whose limit exists but is not in the region.

2. (Computer Exercise) See the "Going to Extremes" section of the program MxMnThy.

Section Summary
The combined effect of the two previous theorems is a plan of attack.

Procedure Finding Maxima and Minima in Two Variables
To find the extrema of a smooth function z=f[x,y] on a closed and bounded or compact subset of the (x,y)-plane:
• 1. Sketch the region D, finding the interior and boundary.
• 2. Compute and and then find all simultaneous solutions in x and y to the pair of equations

such that (x,y) lies inside D. These are the interior critical points where the gradient vector is zero,

• 3. Find the max and a min of on the boundary of D. (This is often a new 1 variable max-min problem.)
• 4. Find the largest and smallest value of the points from (b) and (c).

Parts (b) and (c) of this procedure look pretty scary, but we can use the computer to help us solve simultaneous equations. The important thing is to have a clear idea of what we want to do rather than worry about how we are going to do it. We begin studying max-min theory with some simple examples for which we do not need the computer's help with the equations. The first example can be graphed and approximately maximized by visual inspection. The bothersome negative "volumes" that spoil our graphs are the key to making this optimization a compact region problem.

Example CD-19.5 An Interior Maximum

A post office regulation says that a package must have the sum of the length plus the girth less than or equal to 108 inches. What is the maximum volume for a rectangular box meeting these requirements? The volume of a rectangular solid is V=lwh, where l is the length, w is the width, and h is the height (which we will measure in inches). The postal regulation says

and since more is bigger, we will have 2(w+h)+l=108 at maximal volume. This means we can eliminate a variable and make this a 2 dimensional independent variable problem:

Figure CD-19.1: V=wh(108-2(w+h))

What is the region of the independent variables (w,h)? Clearly, and . If 2(w+h)=108, then there is no more room for length, so our compact region is the triangle:

A sketch of the line 2(w+h)=108 or w+h=54 shows that D is the triangle in the first quadrant of the (w,h)-plane below the line perpendicular to (1,1) and through the points (54,0) and (0,54).

Figure CD-19.2: Region and boundary

Our complete mathematical problem is

We may apply the compact region max-min Procedure CD-19.4:

Step a) This is the triangle sketched in Figure CD-19.2. The boundary consists of

The segment of the w axis,

The segment of the h axis,

The line segment

Step b) Compute the partial derivatives and find simultaneous zeros.

So we must find all interior solutions to the pair of equations

This is done in the the computer program SvVarMxMn but is simple enough to do ourselves. One solution is w=h=0. If h=0 but , then the bottom equation gives us the condition

so w=54. Similarly, if w=0 but , then h=54. We have found three solutions: . None of these is interior to D because they are the vertices of the triangle bounding D.

The final solution (and the only interior one) is when and , so we can divide the original equations by these variables, thereby obtaining the simultaneous linear equations

dividing the first by -2, we obtain

back substituting, we obtain

and our only interior solution is

It is tempting to jump to the correct conclusion that the maximum volume is attained when w=18, h=18, and l=(108-2(18+18))=36, so V=11664. But remember that the mathematical theory looks for both the max and the min. Steps (c) and (d) of our max-min procedure are easy in this case and they complete the reasoning.

Step c) The max and a min of on the boundary is easy to compute.

On the edge of the triangle, , V=0. On the edge of the triangle, , V=0 again. On the edge of the triangle, , l=0, so again V=0. The max and the min of zero is zero.

Step d) The two values we must compare for V are V=0 and V=11664.

Clearly, the max is 11664, and the min is 0. These values are easily seen on Figure CD-19.3.

Figure CD-19.3: V plotted only over D

Example CD-19.6 A Maximum on the Boundary

Let us change the previous mathematical problem to

so that the boundary conditions are no longer trivial. The critical point equations

still only have the solutions (0,0), (54,0), (0,54), (18,18). But now only (0,0) lies in D, and that point is on the boundary.

Figure CD-19.4:

We also still have V=0 on the w-axis and on the h-axis, so the boundary we need to check is

Substituting this in the formula for V, we obtain

along h=50-2w, for .

Finding extrema on this part of the boundary of D is the mathematical problem

At the endpoints of this one-variable extremum problem, we have Vb(0)=0 and Vb(25)=0. In the interior of the restricted problem, we have critical points when the derivative,

is zero. This only occurs when (for ), where . This is the maximum on the boundary and since there are no critical points inside this new region, this is also the overall global maximum. The minimum is still zero. The max and the min can be seen on the boundary of Figure CD-19.5.

Figure CD-19.5: V plotted over

Exercise set CD-19.3

1. Find the max and the min

Section Summary
Implicit differentiation often simplifies problems with constraints.

Implicit differentiation amounts to treating all variables equally at the differentiation stage. In Examples CD-7.2 and CD-7.3, we computed the tangent line to a circle using the implicit equation for the circle and implicit differentiation. (In Section CD-18.6, we found the tangent to an implicitly defined surface.)

When we use the implicit equation for the differential, the x, y, and z must satisfy the original implicit relation, g[x,y,z]=c. We will use this in constrained max-min problems.

Procedure Finding Critical Points with Constraints
To find the critical points of w=f[x,y,z] when z is constrained by g[x,y,z]=c, a constant:
• 1. Compute and .
• 2. Solve the constraint differential for dz.
• 3. Substitute this expression for dz into the expression for dw from part 1. Simplifying yields

but the partial derivatives may have the dependent variable z in the expressions. We cannot solve the equations and in this form because of the extra variable, so we add the constraint equation.
• 4. Solve the simultaneous equations

Example CD-19.7 Implicit Maximum

We might as well flog the poor post office example to death by considering the quarter-circle problem

Figure CD-19.6:

This is the same function, but it has a new outer boundary, the circle of radius 25 centered at (0,0). Because the vector (18,18) has length , there are again no interior critical points. V is still 0 on the axes and the new problem only requires us to extremize on the quarter-circle boundary.

Rather than solve for h in terms of w, we use implicit differentiation to compute the critical points .

because the total differential of V is

and the differential of the constraint equation is

Solving the constraint equation for dh gives , which we substitute into the expression for dV, yielding

so

The computer finds the simultaneous solutions to the equations

to be plus three real solutions outside the region and two complex solutions. At this critical point solution, we have . This is the max on the circular boundary and also on the whole region because V=0 on the rest of the boundary (including the ends of the quarter-circle). The max and the min are aparant over the boundary on Figure CD-19.7.

Figure CD-19.7: V plotted over

See the program SvVarMxMn for a computer solution of this problem.

The moral of the various parts of this example is that the boundary extrema are more difficult in two dimensions; in fact, they become difficult 1-dimensional extremization problems. All these variations had compact regions, D.

The quarter-circle boundary max can be done with explicit formulas or with parametric equations. The implicit method is the one we want to stress, but you can compare it to the others in Exercise CD-19.4.2.

Exercise set CD-19.4

1. (Computer Exercise) Find the max and the min subject to the constraints given.

See the program MxMnThy.

2. (Computer Exercise) In the Example CD-19.7 above, we used implicit differentiation to find the critical point on the circular boundary. Here are two alternative methods to find the boundary extrema:

Explicit Equation

Solve w2+h2=625 for and substitute this into the formula for V,

Now, solve the one-variable max-min problem

Parametric Equation

The vector parametric equation for the quarter circle is

Substitute this into the formula for V,

Now, solve the one-variable max-min problem

Which approach is easiest? What else would you have to do if we asked for the max and a min on the whole circle w2+h2=625? See the program SvVarMxMn.

Section Summary
When we seek a max or a min on a noncompact domain, we have to use additional reasoning. There may not even be maxima or minima.

The next example asks for a minimum of a function that is not defined on a bounded region. The Extreme Value Theorem does not apply, so there are no guarantees. There is a min but no max. You should be able to explain why there is no maximum area.

Example CD-19.8 A Min with No Max

We wish to make a rectangular box with an open top of unit volume (). To minimize cost, we wish to minimize the combined surface area of the sum of the three sides and the bottom. Let x be the width of the box, y be the length, and z be the height (all in meters.)

Figure CD-19.8: An open-top box

Unit volume means V=lwh=xyz=1, or . The area of the bottom is xy, one side xz, another side yz, and there are two of each type of side, thus the total area

and we have the mathematical problem

The region for minimization is neither closed nor bounded, so there is no mathematical guarantee that we will find the min. Intuitively, it is clear that there is a min, however, and it is also clear that there is no max. (Can you show this? Remember the math professor who tried farming in Exercise CD-11.3.1? What if you make a very large bottom but very, very short sides?)

The partials are

have the simultaneous solution

so,

WARNING: We need some sort of additional special reasoning to know that this is the minimum. In this problem, we know intuitively that we can build an efficient box. When we have only formulas, this may not be so clear, so we include the complete mathematical reasoning in Example CD-19.10, Example CD-19.13 and Problem 19.1. Sometimes a computer plot will help, but there could be scale problems in using a graphical min or max.

Finding the distance from a point to a surface is a noncompact domain minimization problem.

Example CD-19.9 The Distance to a Surface

Find the point on the surface z=x2+2y2 that is closest to (3,2,1). The vector from (3,2,1) to a general (x,y,z) is

and its length is . When (x,y,z) lies on the surface, z=x2+2y2, the distance becomes ; so this is the distance from (3,2,1) to a general point on the surface shown in Figure CD-19.9.

Figure CD-19.9: z=x2+2y2

We are asking to minimize the function over all values of x and y.

This is not a compact region for the (x,y)-domain since x and y are not bounded and a sketch will show that there is no maximum distance. No maximum also follows easily from the formula: if x=H is huge (even if y is small), the distance is huge.

Since the square function is increasing for positive values, it is easiest to minimize the squared distance

The total differential is

(as you should verify using small steps with the Chain Rule.) We use the computer to solve the simultaneous equations

obtaining the solution , and 4 complex solutions. (See the program SvVarMxMn.) This makes the minimum distance

It is clear that there is no maximum distance. A simple way to show this is to take a huge number for both x=H and y=H. Then, the squared distance is also huge.

Example CD-19.10 No Maximum Distance to the Paraboloid

We can use max-min theory to show that our critical point is a minimum. We concoct an artificially bounded max-min problem in which the max lies on the boundary of our concocted region and the min on the boundary is larger than our interior critical point. A sphere of huge radius H centered at (3,2,1) intersects our paraboloid at squared distance H2. The intersection is the curve satisfying both of the equations

So, we consider the problem

There is only one critical point interior to the region and the squared distance to the boundary is always the huge H2. This shows that the interior point is minimal for the artificially restricted problem. It is also minimal for the whole problem because points outside the concocted region are farther away.

The drawback to this solution is that the region given by cannot be solved and sketched so easily. Consider instead the simpler circular region of huge radius H given by

It is intuitively clear that the nearest point to (3,2,1) on the boundary x2+y2=H2 is still a huge distance from the point (3,2,1). Let us estimate it. The boundary minimum is more than the squared horizontal distance (x-3)2+(y-2)2, which occurs along the line through (3,2) in the x-y-plane, (3K)2+(2K)2=H2, so . This makes the minimal distance on this boundary more than the huge amount (K-3)2+(K-2)2. The Extreme Value Theorem applies to the problem:

which has its minimum at the interior critical point.

The next example is a geometric distance max - min problem that is over a compact region; but, as is typical of closed and bounded surfaces, the computations are best done implicitly. The implicit computations hide the closed bounded nature to some extent.

Example CD-19.11 An Implicit Closest Point

Find the points on the ellipsoid

that are nearest and farthest from the point (3,2,1).

Figure CD-19.10: 3x2+2y2+z2=20

The vector displacement from (3,2,1) to (x,y,z) is the same as in the previous example, but we cannot simply solve for z and substitute into the distance formula. Instead, we use the squared distance D with a general (x,y,z) and remember the constraint:

and by substituting

into the expression for dD, we have

The computer finds the solutions to the simultaneous equations

and and 4 complex solutions. These points yield the distances:

Example CD-19.12 Max-min of Volume with Fixed Surface Area

Implicit equations are very powerful, but it can sometimes be difficult to analyze compactness of the domain when implicit functions are used. We state the following symbolically and solve it before we explain a geometric way to look at it:

Let us use the previous procedure:

Now, substitute the constrained value of dz in the dw-equation:

Solution of the simultaneous equations, , , with the constraint is given by a computer command such as the following:

Solve[{y z (x + y) == x y (y + z),

x z (x + y) == x y (x + z),

x y + y z + x z == 1}]

Yielding only the one positive solution , where . Hummm. Is this the max or the min?

Along the boundaries x=0 or y=0, w=xyz=0, so is not the minimum. Is it the maximum? Some theory can help us. Consider the problem restricted to a huge rectangle.

This problem still does not satisfy the hypotheses of the Extreme Value Theorem, but the reason is disguised. The constraint xy+yz+zx=1 cannot be satisfied if both x=0 and y=0. More generally, z=(1-xy)/(x+y) is undefined on the line x+y=0 or y=-x. This line removes the point (0,0). Cut out a tiny corner near zero,

Example CD-19.13 Complete Reasoning for Max-Min of Volume with Fixed Surface Area

Figure CD-19.11: Region

This problem satisfies the hypotheses of the Extreme Value Theorem, so we know that it has a max and a min. The only interior critical point is , so we examine the edges.

We know that w=0 on the edges x=0 and y=0.

If and , then , and , so . In other words, w is small on the edges of the small corner.

Intuitively, if x=H is huge, y and z must be very small so that xy+yz+zx=1. The question is, "How big is w=xyz, when x=H is huge?" This requires some estimation:

Since Hy and Hz are positive and no more than 1, both y and z must be tiny. This also makes

because Hy is at most 1 and finite times tiny is tiny. (Alternately, let H=106, a million. Then Hy and Hz are no more than 1, so neither y nor z is more than one one millionth. This makes Hyz less than a millionth.) Similarly, x and z must be small if y=H, and the computation shows that . What have we shown? The extrema of w=xyz must be either at or along the boundary where . There must be both a max and a min and the value along the boundary is only tiny, so

when x and y are in the cut off square and xy+yz+zx=1. Outside this set, w is even smaller than it is on the large edges, so w is maximized for all positive x and y at the values . There is a geometric way to understand this problem. Consider a rectangular box with width x, length y, and height z. The area of the faces are xy, yz, and zx, and there are two of each of these; in other words,

so our constraint xy+yz+zx=1 means that the box has a fixed area of 1/2. We also have

and our question is

We have shown that a cube has maximum volume for an area of 1/2. The huge edge argument amounts to showing that when one side is very long, the volume of the box is small because you must keep the other two edges so small in order to hold the area fixed. Perhaps you can explain the geometry problem more intuitively, but there is a point to the max-min theory that we illustrate with the next subsection.

19.5.1 L. C. Young's Sailboat with No Fast or Slow Path

This is an example of a max-min problem with no max and no min. We wish to sail a sailboat up stream and against the wind on a river. We can sail at and make fast progress, but we must "come about" at the shore and sail on the opposite tack. Fine, this works; we make progress up stream. However, the current in the river is strongest near the middle and slowest near the shore. We would make faster progress if we came about before we reached the strong current in the middle, for example, suppose we sailed on a third of the river near the bank. Conversely, we would make slower progress if we stayed in the middle third of the river sailing against the fast current. There is no end to this. It would be even faster to sail on the ninth of the river nearest the shore and slower to sail in the ninth nearest the fastest current. The fastest path is to sail right against the shore and come about infinitely often. The slowest is to stay at the strongest current and come about infinitely often. Unfortunately, you can't come about infinitely often and this problem has no max and no min.

Exercise set CD-19.5

1. Find the max and a min subject to the constraints given.

2. Find the minimum surface area of a rectangular solid of fixed volume V.

Interpret your solution as the inequality

For the volume, V, of any rectangular solid of area A.

The maximal volume of a rectangular solid of fixed surface area A follows from this general inequality. Why?

3. Find the point on the surface z=x2-y2/3 nearest to .

4. A surface is given implicitly by the equation

Find the point on the surface that is closest to

Is there a point farthest from ?

5. When is the speeding target of Exercise CD-16.5.1 closest to you?

6. Find the volume of the largest rectangular solid with its faces parallel to the coordinate planes that can be inscribed in the ellipsoid . (HINT: Try the sphere first, if you are having trouble.)

We know that the square maximizes the area of all rectangles with a fixed perimeter. Which triangle maximizes the area of all triangles of fixed perimeter? Heron's Formula says the area for a triangle with sides a, b, c. We know p=a+b+c and all three of a, b, and c must be positive to make a triangle. In addition, if c is the long side, .

7. Find the triangle of maximum area among those with fixed perimeter.

HINTS for the triangle problem: Eliminate the variable c, using c=p-a-b. Since p is fixed, the area A=A(a,b) is a function of two variables once we substitute the expression for c into Heron's Formula above. Show that the geometry requires 0<a, 0<b and for a maximum. Sketch this (a,b) region.

Problem CD-19.1 Complete Solution to Example  This problem has you complete the mathematical reasoning for the minimum and lack of a maximum in the problem of finding a rectangular box with unit volume and a minimum area.

Example CD-19.8 shows that the geometric problem can be phrased symbolically as:

Introduce an artificial boundary by moving slightly inside the first quadrant and cutting it off at a very large value:

where and H is huge. Show that A is huge on the whole boundary by considering cases: (1) either or A is huge. (2) if 0<y and x=H, A is huge. etc...

Reason that the value of A at the interior critical value is minimum of the artificially bounded problem and that the maximum is huge. Why does this make the noncompact problem have no max?

The Mathematical Background project on least squares fit of a straight line to data uses max-min in two variables to find the fit. There is also a section on geometric distance problems that can be phrased as minimization problems.

Previous chapter Next chapter Close this window