MATH 240

Mon. December 2nd, 2019


Least Squares (cont.)

From last Monday:
The set of least squares solutions of $Ax=b$ coincides with the non-empty solution set of $A^TAx=A^Tb$.

Theorem. Let $A$ be an $m\times n$ matrix. The following statements are logically equivalent:

a. The equation $Ax=b$ has a unique least squares solution for each $b$ in $\R^m$.
b. The columns of $A$ are linearly independent.
c. The matrix $A^TA$ is invertible.

Note that we can’t say for sure that $A$ is invertible because it’s not necessarily a square matrix.

When these statements are true, the least squares solution is given by $\hat{x}=(A^TA)^{-1}A^Tb$.
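
As a quick numerical check, here is a small sketch in NumPy (the matrix $A$, the vector $b$, and all the numbers are made up purely for illustration; they are not from the lecture):

```python
import numpy as np

# A 3x2 system Ax = b with no exact solution (illustrative, made-up numbers).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: A^T A x = A^T b. The columns of A are linearly
# independent, so A^T A is invertible and the least squares solution is unique.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

print(x_hat)                                  # approximately [0.667, 0.5]
print(np.linalg.lstsq(A, b, rcond=None)[0])   # same answer from NumPy's built-in solver
```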


Theorem. Given an $m\times n$ matrix $A$ with linearly independent columns, let $A=QR$ be a $QR$ factorization of $A$. Then for each $b$ in $\R^m$, the equation $Ax=b$ has a least squares solution given by $\hat{x}=R^{-1}Q^Tb$.
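
A minimal sketch of the same computation via a QR factorization, again with made-up numbers (NumPy's `np.linalg.qr` returns the reduced factorization, which is what the theorem needs):

```python
import numpy as np

# Same illustrative A and b as before; the columns of A are linearly independent.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Reduced QR factorization: Q is m x n with orthonormal columns,
# R is n x n, upper triangular, and invertible.
Q, R = np.linalg.qr(A)

# x_hat = R^{-1} Q^T b. Solving R x = Q^T b gives the same result
# without forming the inverse explicitly.
x_hat = np.linalg.solve(R, Q.T @ b)
print(x_hat)   # matches the normal-equations answer above
```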


Application of Least Squares

Line of Best Fit

If we have a set of data points $(x_1,y_1), (x_2,y_2), \dots, (x_n,y_n)$, we want to be able to find a line of best fit for this data set of the form $y=\beta_0+\beta_1x$.
To do this, we have to solve for $\beta_0$ and $\beta_1$ so that the sum of the squares of the vertical distances (the residuals) from each data point to our line is minimized.

The input of the $i$th data point is $x_i$, the predicted value from our line is $\beta_0+\beta_1x_i$, and the actual value is $y_i$. We can format this idea as a matrix equation by using the matrices

$$X=\begin{pmatrix} 1&x_1\\ 1&x_2\\ \vdots&\vdots\\ 1&x_n \end{pmatrix},\quad \beta=\begin{pmatrix} \beta_0\\ \beta_1 \end{pmatrix},\quad y=\begin{pmatrix} y_1\\ y_2\\ \vdots\\ y_n \end{pmatrix}.$$

Now we just need to find a vector $\beta$ such that $X\beta$ is as close to $y$ as possible, i.e. a least squares solution to the equation $X\beta=y$.

(Note that we could only find an exact solution to this equation if all the data points were already on one line.)

As we showed before, this is equivalent to finding the exact solution $\beta$ of the equation $X^TX\beta=X^Ty$, i.e. $\beta=(X^TX)^{-1}X^Ty$. (Here $X^TX$ is invertible as long as at least two of the $x_i$ are distinct, since that makes the columns of $X$ linearly independent.)
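
Putting the whole application together, here is a short sketch that fits a line to a handful of hypothetical data points (the data values are invented for illustration; they are not from the lecture):

```python
import numpy as np

# Hypothetical data points (x_i, y_i).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8])

# Design matrix X: a column of ones (for beta_0) and a column of the x_i (for beta_1).
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations X^T X beta = X^T y for beta = (beta_0, beta_1).
beta_0, beta_1 = np.linalg.solve(X.T @ X, X.T @ y)
print(f"line of best fit: y = {beta_0:.2f} + {beta_1:.2f} x")   # y = 0.15 + 0.94 x
```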