Calculus: Project 32

Project 32: Least Squares Fit and Max-Min

Previous project Next project

These questions have you complete the development of the least squares fit of data.

32.1 Introductory Example

The basic idea of least squares fit is this. We are given data points

for example, (1,6), (2,9), and (3,10). We know that these represent the output of a linear function

with errors introduced by measurement. We do not know the parameters m and b but wish to estimate them so that they produce the minimal sum of squared errors. What are the squared errors?

Suppose we had chosen a value for m and a value for b. If we compute values using the formula mx_i+b and subtract the measured value of y_i, we obtain the error at one point,

At the input x₂=2, we would compute output

and compare that with y₂=9; the difference is

. In this example, the sum of all these actual errors is

This straight algebraic sum has the disadvantage that a huge positive error and a huge negative error cancel, so that the sum does not represent the "total error." (We do not want two huge canceling errors to be considered small.) The square of an error, like

, is always positive and in addition favors small errors,

so the sum of the squared errors:

is a measure of overall error that does not cancel errors of opposite sign.

We want to consider the parameters m and b as the unknowns of our problem, so we seek to

and specifically, we wish to minimize

Figure 32.1: Least Squares Fit

In this problem we want the minimum, but mathematically, we might also ask for the maximum. Does the function E[m,b] have a maximum value over all real values of m and b? Why? Show that if either or is a huge positive or negative number, then E is huge.

32.2 The Critical Point

The exercise above shows that E[m,b] has no maximum. It does have a minimum, but before we examine the reason, we seek possible locations for the min, that is, critical values or places where the gradient vector is zero.

This is a system of two linear equations in the unknowns m and b,

or in the more compact matrix notation

where a=(1²+2²+3²)=14, c=d=(1+2+3)=6, e=3,

and g=6+9+10=25. The unknowns are m and b.

We have just shown that if and only if , with the particular matrices,

The unique solution of these particular linear equations is . Verify this by a hand computation.

Why is the solution of the critical point equations the parameter values of minimum total square error? In other words, how do you know that there is a minimum pair of values and not just better and better choices? Why is this point the minimum one? Use the Extreme Value Theorem and Critical Point Theorem, for example, to answer the question.

32.3 The General Critical Equations

We want you to derive the general formulas for minimizing m and b. Suppose that n pairs of (x,y) values are given,

Show that if and only if the unknowns (m,b) satisfy the equations ,

where a=(x₁²+x₂²+x₃²+...+x_n²), c=d=(x₁+x₂+x₃+...+x_n), e=n, , and g=y₁+y₂+y₃+...+y_n

Notice the n-dimensional norm and dot product in these formulas!

Write a computer program to find the least squares fit of the data (1,6), (1.2,7.1), (1.3,7.2), (1.35,7.2), (1.41,7.5), (1.5,8), (1.53,8.2), (1.6,8.9). Use the program LeastSquares on our website for help.

In the example with the data (1,6), (2,9), and (3,10) and least squares fit , show that the sum of the signed errors is zero:

In other words, the amount above and below the fit line is the same.

Show that the sum of the signed errors is zero in the general case.

Previous project Next project Close this window