From last Monday:
The set of least squares solutions of $Ax = b$ coincides with the non-empty solution set of the normal equations $A^T A x = A^T b$.
Theorem. Let $A$ be an $m \times n$ matrix. The following statements are equivalent:
(a) The equation $Ax = b$ has a unique least squares solution for each $b$ in $\mathbb{R}^m$.
(b) The columns of $A$ are linearly independent.
(c) The matrix $A^T A$ is invertible.
Note that we can’t say for sure that $A$ itself is invertible, because it’s not necessarily a square matrix.
When these statements are true, the least squares solution is unique and is given by $\hat{x} = (A^T A)^{-1} A^T b$.
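To make this concrete, here is a minimal NumPy sketch (the matrix $A$ and vector $b$ are made-up illustrative data, not an example from class). Numerically it is better to solve the system $A^T A x = A^T b$ directly than to form the inverse:

```python
import numpy as np

# Made-up data: A is 3x2 with linearly independent columns,
# so A^T A is invertible and the formula applies.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Solve the normal equations A^T A x = A^T b
# (equivalent to (A^T A)^{-1} A^T b, but without forming the inverse).
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Sanity check: x_hat satisfies the normal equations.
assert np.allclose(A.T @ A @ x_hat, A.T @ b)
print(x_hat)
```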
Theorem. Given an $m \times n$ matrix $A$ with linearly independent columns, let $A = QR$ be a QR factorization of $A$. Then for each $b$ in $\mathbb{R}^m$, the equation $Ax = b$ has a unique least squares solution, given by $\hat{x} = R^{-1} Q^T b$.
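A sketch of the QR route, again with made-up data; note that in code one solves $Rx = Q^T b$ by back substitution rather than computing $R^{-1}$:

```python
import numpy as np

# Made-up data: a tall matrix with linearly independent columns.
A = np.array([[1.0, 2.0],
              [1.0, 5.0],
              [1.0, 7.0],
              [1.0, 8.0]])
b = np.array([1.0, 2.0, 3.0, 3.0])

Q, R = np.linalg.qr(A)               # reduced QR: Q is 4x2, R is 2x2 upper triangular
x_hat = np.linalg.solve(R, Q.T @ b)  # solve R x = Q^T b instead of forming R^{-1}

# Agrees with NumPy's built-in least squares solver.
assert np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0])
print(x_hat)
```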
If we have a set of data points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, we want to be able to find a line of best fit for this data set of the form $y = \beta_0 + \beta_1 x$.
To do this, we have to solve for $\beta_0$ and $\beta_1$ so that the sum of the squares of the vertical distances from each data point to our line is minimized.
The input of the $j$th data point is $x_j$, the predicted value from our line is $\beta_0 + \beta_1 x_j$, and the actual value is $y_j$. We can format this idea as a matrix equation $X\beta = y$ by using the matrices
$$X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}.$$
Now we just need to find a vector $\hat{\beta}$ such that $X\hat{\beta}$ is as close to $y$ as possible, i.e. a least squares solution to the equation $X\beta = y$.
(Note that we could only find an exact solution to this equation if all the data points were already on one line.)
As we showed before, this is equivalent to finding the exact solution of the normal equations $X^T X \beta = X^T y$, i.e. $\hat{\beta} = (X^T X)^{-1} X^T y$.
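Putting the whole recipe together, here is a short NumPy sketch that fits a line to a handful of made-up sample points by solving the normal equations:

```python
import numpy as np

# Made-up sample points (x_j, y_j).
xs = np.array([2.0, 5.0, 7.0, 8.0])
ys = np.array([1.0, 2.0, 3.0, 3.0])

# Design matrix X: a column of ones (for beta_0) next to the inputs (for beta_1).
X = np.column_stack([np.ones_like(xs), xs])

# Solve the normal equations X^T X beta = X^T y for beta_hat = (beta_0, beta_1).
beta0, beta1 = np.linalg.solve(X.T @ X, X.T @ ys)
print(f"line of best fit: y = {beta0:.4f} + {beta1:.4f} x")
```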