If $A$ is an $m\times n$ matrix with linearly independent columns, then $A$ can be factored as $A=QR$, where $Q$ is an $m\times n$ matrix whose columns form an orthonormal basis for $\text{Col}\,A$, and $R$ is an $n\times n$ upper triangular invertible matrix with positive entries on its diagonal.
Ex. $A=\begin{pmatrix}1&0&0\\1&1&0\\1&1&1\\1&1&1\end{pmatrix}$
Step 1: Use Gram-Schmidt to find an orthogonal basis for $\text{Col}\,A$. (We already did this for this set of vectors last Friday.)
$\text{Col}\,A=\text{span}\{v_1,v_2,v_3\}$ where $v_1=\begin{pmatrix}1\\1\\1\\1\end{pmatrix},\ v_2=\begin{pmatrix}-3\\1\\1\\1\end{pmatrix},\ v_3=\begin{pmatrix}0\\-\frac{2}{3}\\\frac{1}{3}\\\frac{1}{3}\end{pmatrix}$
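As a quick check, the Gram-Schmidt computation above can be sketched in NumPy. (A sketch only; note the raw Gram-Schmidt output for the second vector is $(-\frac{3}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4})$, which the notes rescale by 4 to clear fractions — any nonzero multiple preserves orthogonality.)

```python
import numpy as np

# Columns of A from the example above.
a1 = np.array([1.0, 1, 1, 1])
a2 = np.array([0.0, 1, 1, 1])
a3 = np.array([0.0, 0, 1, 1])

def proj(u, v):
    """Orthogonal projection of v onto u."""
    return (v @ u) / (u @ u) * u

# Gram-Schmidt: subtract from each vector its projections
# onto the previously constructed orthogonal vectors.
v1 = a1
v2 = a2 - proj(v1, a2)                 # (-3/4, 1/4, 1/4, 1/4)
v3 = a3 - proj(v1, a3) - proj(v2, a3)  # (0, -2/3, 1/3, 1/3)
```

The three outputs are pairwise orthogonal, and `4 * v2` recovers the rescaled $v_2$ used in the notes.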
Step 2: Normalize these vectors and collect them into a matrix to form $Q$.
$Q=\begin{pmatrix}\frac{1}{2}&-\frac{3}{\sqrt{12}}&0\\\frac{1}{2}&\frac{1}{\sqrt{12}}&-\frac{2}{\sqrt{6}}\\\frac{1}{2}&\frac{1}{\sqrt{12}}&\frac{1}{\sqrt{6}}\\\frac{1}{2}&\frac{1}{\sqrt{12}}&\frac{1}{\sqrt{6}}\end{pmatrix}$
Because the columns of $Q$ are orthonormal, $Q^TQ=I$.
Thus $Q^TA=Q^T(QR)=(Q^TQ)R=R$.
Step 3: Calculate $Q^TA$.
$R=Q^TA=\begin{pmatrix}2&\frac{3}{2}&1\\0&\frac{3}{\sqrt{12}}&\frac{2}{\sqrt{12}}\\0&0&\frac{2}{\sqrt{6}}\end{pmatrix}$
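The factorization from Steps 2 and 3 can be verified numerically. A minimal NumPy sketch, entering the $Q$ computed above and recovering $R=Q^TA$:

```python
import numpy as np

# The matrix A from the example.
A = np.array([[1.0, 0, 0],
              [1, 1, 0],
              [1, 1, 1],
              [1, 1, 1]])

# Q from Step 2 (orthonormal columns).
s12, s6 = np.sqrt(12), np.sqrt(6)
Q = np.array([[1/2, -3/s12,  0    ],
              [1/2,  1/s12, -2/s6],
              [1/2,  1/s12,  1/s6],
              [1/2,  1/s12,  1/s6]])

# Step 3: R = Q^T A, which is upper triangular with positive diagonal.
R = Q.T @ A
```

As expected, `Q.T @ Q` is the identity and `Q @ R` reproduces $A$. (NumPy's built-in `np.linalg.qr(A)` yields an equivalent factorization, though it does not guarantee positive diagonal entries in $R$, so individual columns of $Q$ and rows of $R$ may differ by a sign.)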
Idea. Given an inconsistent system $Ax=b$ (i.e. no solutions), find $\hat{x}$ such that $A\hat{x}$ is as close to $b$ as possible.
In other words, we want to find the point $A\hat{x}$ in the column space of $A$ that is closest to $b$.
Definition. If $A$ is an $m\times n$ matrix and $b$ is in $\mathbb{R}^m$, a least squares solution of $Ax=b$ is an $\hat{x}$ in $\mathbb{R}^n$ such that $\lVert b-A\hat{x}\rVert\leq\lVert b-Ax\rVert$ for all $x$ in $\mathbb{R}^n$.
Let $A$ be an $m\times n$ matrix.
Q1: What is the point in the column space of $A$ that is closest to $b$?
A1: The projection of $b$ onto $\text{Col}\,A$.
Take $\hat{b}=\text{proj}_{\text{Col}\,A}(b)$.
Q2: What space will the least squares solution $\hat{x}$ come from?
A2: $\mathbb{R}^n$.
Now all we have to do is solve $Ax=\hat{b}$, which we know must be consistent since $\hat{b}$ is in $\text{Col}\,A$. (However, there may be infinitely many solutions.)
Note that any column $a_j$ of $A$ is orthogonal to $b-\hat{b}=b-A\hat{x}$ (the component of $b$ orthogonal to $\text{Col}\,A$), so $a_j\cdot(b-A\hat{x})=0$.
$a_j^T(b-A\hat{x})=0$
$A^T(b-A\hat{x})=0$
Rearranging gives $A^TA\hat{x}=A^Tb$, the normal equations for the least squares solution.
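Since $A^T(b-A\hat{x})=0$ rearranges to $A^TA\hat{x}=A^Tb$, this gives a concrete recipe for computing $\hat{x}$. A minimal NumPy sketch (the system below is a made-up inconsistent example, not one from the notes):

```python
import numpy as np

# A made-up inconsistent system: b is not in Col A.
A = np.array([[1.0, 0],
              [1, 1],
              [1, 2]])
b = np.array([6.0, 0, 0])

# Solve the normal equations A^T A x_hat = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# The residual b - A x_hat is orthogonal to every column of A.
residual = b - A @ x_hat
```

Here $\hat{x}=(5,-3)$ and $A^T(b-A\hat{x})=0$, exactly as the derivation predicts; `np.linalg.lstsq(A, b, rcond=None)` returns the same $\hat{x}$.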