
Eigenvalue Decomposition

Motivation: Similarity

Consider a vector space $V$ over a field $F$ with two bases \(\mathcal{B}_1\) and \(\mathcal{B}_2\), and a linear map $T: V \to V$. Let $A$ be the matrix of $T$ with respect to \(\mathcal{B}_1\).

Now let $B$ be the matrix of $T$ with respect to the other basis \(\mathcal{B}_2\). By the change-of-basis formula for linear transformations, for some transition matrix $P$,

\[\begin{aligned} B = PAP^{-1} \text{ and } A = P^{-1} B P \end{aligned}\]

In other words, since $A$ and $B$ represent the same linear transformation in different bases, they can be regarded as the same map acting on the vectors of $V$. For this reason, we call the two matrices similar.
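To make the change-of-basis relation concrete, here is a minimal numerical sketch (NumPy, with arbitrary random matrices; the variable names are mine, not from the text): given the matrix $A$ of $T$ in \(\mathcal{B}_1\) and a transition matrix $P$, the matrix $B = PAP^{-1}$ acts on \(\mathcal{B}_2\)-coordinates exactly as $A$ acts on \(\mathcal{B}_1\)-coordinates.

```python
import numpy as np

# Illustrative sketch: the matrices below are random placeholders.
rng = np.random.default_rng(0)

A = rng.standard_normal((3, 3))   # matrix of T w.r.t. basis B1
P = rng.standard_normal((3, 3))   # transition matrix (B1-coordinates -> B2-coordinates)
B = P @ A @ np.linalg.inv(P)      # matrix of T w.r.t. basis B2

x1 = rng.standard_normal(3)       # coordinates of some vector v in B1
x2 = P @ x1                       # the same vector's coordinates in B2

# Both computations produce the coordinates of T(v), just in different bases.
print(np.allclose(B @ x2, P @ (A @ x1)))   # True
```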

$\mathbf{Definition.}$ Similar
Two matrices $A$ and $B$ are similar if there exists an invertible matrix $P$ such that $A = P^{-1} B P$.

As the name implies, similar matrices share characteristic information such as eigenvalues. Therefore, if a matrix can be represented by a simpler matrix through a change of basis, the underlying linear transformation becomes much easier to understand; the simplest such representative is a diagonal matrix.

$\mathbf{Thm\ 1.}$ Similarity Invariants
Similar matrices have the same

(a) determinant
(b) rank
(c) nullity
(d) trace
(e) characteristic polynomial & eigenvalues with the same algebraic, geometric multiplicity

$\mathbf{Proof.}$

(a) $\text{det}(B) = \text{det}(P^{-1}) \, \text{det}(A) \, \text{det}(P) = \text{det}(A)$, since $\text{det}(P^{-1})\,\text{det}(P) = 1$.

(b), (c) By the rank–nullity theorem, it suffices to show (c) only.
For invertible $P$, it is clear that $\text{null}(P^{-1}M) = \text{null}(M)$ for any matrix $M$, and $\text{null}(MP) = P^{-1}\,\text{null}(M)$, which has the same dimension as $\text{null}(M)$. Thus $\text{nullity}(B) = \text{nullity}(P^{-1}AP) = \text{nullity}(AP) = \text{nullity}(A)$.

(d) Since $\text{tr}(AB) = \text{tr}(BA)$ for square matrices $A$ and $B$, $\text{tr}(B) = \text{tr}(P^{-1} A P) = \text{tr}(APP^{-1}) = \text{tr}(A)$.

(e) We should show $\text{det}(\lambda I - B) = \text{det}(\lambda I - A)$, where $B = P^{-1} A P$. Since

\[\begin{aligned} \lambda I - B &= \lambda I - P^{-1} A P \\ &= \lambda P^{-1} P - P^{-1} A P \\ &= P^{-1} (\lambda P - AP) \\ &= P^{-1} (\lambda I - A) P, \end{aligned}\]

$\lambda I - B$ and $\lambda I - A$ are similar! Taking determinants shows the characteristic polynomials agree, hence the eigenvalues and their algebraic multiplicities agree. Moreover, for each eigenvalue the null spaces of $\lambda I - B$ and $\lambda I - A$ have the same dimension by (c), so the geometric multiplicities also agree.
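As a quick sanity check of $\mathbf{Thm\ 1}$ (a hedged numerical sketch; the matrices are arbitrary random ones and NumPy is assumed), one can build a similar pair and compare the listed invariants:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))            # invertible with probability 1
B = np.linalg.inv(P) @ A @ P               # a matrix similar to A

print(np.isclose(np.linalg.det(A), np.linalg.det(B)))           # (a) determinant
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))     # (b) rank, hence (c) nullity
print(np.isclose(np.trace(A), np.trace(B)))                     # (d) trace
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(B))))       # (e) eigenvalues
```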




Diagonalization

$\mathbf{Definition.}$ Diagonalizable
If a square matrix $A$ is similar to some diagonal matrix, we say $A$ is diagonalizable.


Now we prove the main theorem:

$\mathbf{Thm\ 2.}$ Diagonalizability
An $n \times n$ matrix $A$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors.

$\mathbf{Proof.}$

Suppose $A$ has $n$ linearly independent eigenvectors, \(\mathbf{p}_1, \cdots, \mathbf{p}_n\). Then, construct

\[\begin{aligned} P = [\mathbf{p}_1 \; | \; \cdots \; | \; \mathbf{p}_n] \end{aligned}\]

which is invertible as it has full column rank. Let $\lambda_1, \cdots, \lambda_n$ be the corresponding eigenvalues and $D = \text{diag}(\lambda_i)$. Then \(A \mathbf{p}_i = \lambda_i \mathbf{p}_i\) for all $i$ gives $AP = PD$, i.e., $P^{-1} A P = D$.

Conversely, suppose $P^{-1} A P = D = \text{diag}(\lambda_1, \cdots, \lambda_n)$. Then,

\[\begin{aligned} AP &= A[ \mathbf{p}_1 \; | \; \cdots \; | \; \mathbf{p}_n ] = [ A\mathbf{p}_1 \; | \; \cdots \; | \; A\mathbf{p}_n ] \\ PD &= [ \lambda_1 \mathbf{p}_1 \; | \; \cdots \; | \; \lambda_n \mathbf{p}_n ] \end{aligned}\]

Thus, \(A\mathbf{p}_i = \lambda_i \mathbf{p}_i\) for all $i = 1, \cdots, n$, and it means \(\mathbf{p}_i\)’s are eigenvectors of $A$ corresponding to $\lambda_i$. Since $P$ is invertible, column vectors of $P$ are linearly independent. Done.



Hence, if we find $n$ linearly independent eigenvectors of an $n \times n$ matrix $A$, the similarity of $A$ and $D = \text{diag}(\lambda_1, \cdots, \lambda_n)$ is ensured, i.e., $A = P D P^{-1}$.
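In practice, this is exactly what a numerical eigendecomposition gives us. A minimal sketch (the $2 \times 2$ matrix is an arbitrary example of mine; `np.linalg.eig` returns the eigenvalues and a matrix whose columns are eigenvectors):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])        # example matrix with distinct eigenvalues 5 and 2

eigvals, P = np.linalg.eig(A)     # columns of P are eigenvectors
D = np.diag(eigvals)

print(np.allclose(A @ P, P @ D))                    # A P = P D
print(np.allclose(A, P @ D @ np.linalg.inv(P)))     # A = P D P^{-1}
```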

Let’s look for further conditions besides this one. Instead of the eigenvectors, the eigenvalues can also confirm the diagonalizability of $A$.

$\mathbf{Thm\ 3.}$ An $n \times n$ matrix $A$ is diagonalizable if it has $n$ distinct eigenvalues. (The converse is false in general: the identity matrix $I_n$ is diagonal, hence diagonalizable, yet has only one distinct eigenvalue.)

$\mathbf{Proof.}$

It suffices to show: if \(\mathbf{v}_1, \cdots, \mathbf{v}_n\) are eigenvectors of $A$ that correspond to distinct eigenvalues $\lambda_1, \cdots, \lambda_n$, then the set \(\{ \mathbf{v}_1, \cdots, \mathbf{v}_n \}\) is linearly independent.

Suppose not. Let $j$ be maximal such that \(\mathbf{v}_1, \cdots, \mathbf{v}_j\) are linearly independent (so $1 \leq j < n$). Then, there exist scalars \(c_i \; (1 \leq i \leq j)\) such that

\[\begin{aligned} \mathbf{v}_{j+1} = c_1 \mathbf{v}_1 + \cdots + c_{j} \mathbf{v}_{j} \end{aligned}\]

Multiply both sides by $A$ (for the first equality below) and, separately, multiply the original relation by $\lambda_{j+1}$ (for the second):

\[\begin{aligned} \lambda_{j+1} \mathbf{v}_{j+1} &= c_1 \lambda_1 \mathbf{v}_1 + \cdots + c_{j} \lambda_j \mathbf{v}_{j} \\ &= c_1 \lambda_{j+1} \mathbf{v}_1 + \cdots + c_{j} \lambda_{j+1} \mathbf{v}_{j} \end{aligned}\]

Thus

\[\begin{aligned} \sum_{i=1}^j (\lambda_i - \lambda_{j+1}) c_i \mathbf{v}_i = \mathbf{0} \end{aligned}\]

and since \(\mathbf{v}_1, \cdots, \mathbf{v}_j\) are linearly independent and the eigenvalues are distinct, it implies $c_i = 0$ for all $1 \leq i \leq j$, hence \(\mathbf{v}_{j+1} = \mathbf{0}\): a contradiction, since eigenvectors are nonzero by definition.



We can also formulate diagonalizability in terms of the algebraic and geometric multiplicities of eigenvalues. The following theorem is useful for proving that:

$\mathbf{Thm\ 4.}$ Relationship between algebraic and geometric multiplicity
Let $A$ be a square matrix. The geometric multiplicity of an eigenvalue of $A$ is less than or equal to its algebraic multiplicity.

$\mathbf{Proof.}$

Suppose the geometric multiplicity of $\lambda_0$ is $k$. Let \(\{ \mathbf{v}_1, \cdots, \mathbf{v}_k \}\) be a basis for the eigenspace of $\lambda_0$.

Then, we can extend it to a basis \(\{ \mathbf{v}_1, \cdots, \mathbf{v}_k, \cdots, \mathbf{v}_n \}\) of the whole space. Let

\[\begin{aligned} P=[\mathbf{v}_1 \; | \; \cdots \; | \; \mathbf{v}_k \; | \; \mathbf{v}_{k+1} \; | \; \cdots \; | \; \mathbf{v}_n ] = [B_1 \; | \; B_2] \end{aligned}\]

Note that $P$ is invertible as it has full column rank. So

\[\begin{aligned} AP &= [ \lambda_0 \mathbf{v}_1 \; | \; \cdots \; | \; \lambda_0 \mathbf{v}_k \; | \; AB_2] \\ \end{aligned}\]

Now we construct a matrix $C$, similar to $A$, in which $\lambda_0$ has algebraic multiplicity at least $k$. Let

\[\begin{aligned} C=\left[\begin{array}{cc} \lambda_0 I_k & X \\ 0 & Y \end{array}\right] \end{aligned}\]

Then,

\[\begin{aligned} PC &= [ \lambda_0 \mathbf{v}_1 \; | \; \cdots \; | \; \lambda_0 \mathbf{v}_k \; | \; PZ] \\ \end{aligned}\]

where \(Z = \left[\begin{array}{l} X \\ Y \end{array}\right]\)

Since $P$ is invertible, we can choose $X$ and $Y$ so that $Z = P^{-1} A B_2$; then $PZ = AB_2$, so $AP = PC$, i.e., $C = P^{-1} A P$. By $\mathbf{Thm\ 1}$, $A$ and $C$ have the same characteristic polynomial, and $\text{det}(\lambda I - C) = (\lambda - \lambda_0)^k \, \text{det} (\lambda I_{n - k} - Y)$. Thus, the algebraic multiplicity of $\lambda_0$ is greater than or equal to $k$.
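To see that the inequality in $\mathbf{Thm\ 4}$ can be strict, here is a small numerical sketch (the defective $2 \times 2$ matrix below is a standard example of my choosing): for $\lambda_0 = 2$, the algebraic multiplicity is $2$ but the eigenspace $\text{null}(A - 2I)$ is only one-dimensional.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])       # lambda = 2 repeated, but only one eigenvector direction
lam = 2.0
n = A.shape[0]

geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n))    # dim null(A - lam I)
algebraic = int(np.isclose(np.linalg.eigvals(A), lam).sum())  # multiplicity as a root

print(geometric, algebraic)      # 1 2
print(geometric <= algebraic)    # True, as Thm 4 guarantees
```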



Finally, $A$ is diagonalizable if and only if the sum of the geometric multiplicities of its eigenvalues is $n$; by the previous theorem, this means $A$ is diagonalizable if and only if the geometric multiplicity of each eigenvalue of $A$ equals its algebraic multiplicity.

$\mathbf{Thm\ 5.}$
(a) $A$ is diagonalizable if and only if the sum of the geometric multiplicities of its eigenvalues is $n$.
(b) $A$ is diagonalizable if and only if the geometric multiplicity of each eigenvalue of $A$ equals its algebraic multiplicity.

$\mathbf{Proof.}$

Since the sum of the algebraic multiplicities is $n$ and each geometric multiplicity is at most the corresponding algebraic one ($\mathbf{Thm\ 4}$), (a) and (b) are equivalent, so it suffices to show (a) only.

Let $\lambda_1, \cdots, \lambda_k$ be the $k$ distinct eigenvalues of $A$, let $E_1, \cdots, E_k$ be the corresponding eigenspaces, and let $B_1, \cdots, B_k$ be any bases for these eigenspaces. Let $d_i = \text{dim}(E_i) = |B_i|$ be the geometric multiplicity of $\lambda_i$.

Denote \(B_i = \{ \mathbf{v}_{i1}, \cdots, \mathbf{v}_{i d_i} \}\) and set \(B = \{ \mathbf{v}_{11}, \cdots, \mathbf{v}_{1 d_1}, \cdots, \mathbf{v}_{k1}, \cdots, \mathbf{v}_{k d_k} \}\). By an argument analogous to that of $\mathbf{Thm\ 3}$, eigenvectors drawn from eigenspaces of distinct eigenvalues are linearly independent, so $B$ is a linearly independent set. Hence if $\sum_{i=1}^k d_i = n$, then $B$ consists of $n$ linearly independent eigenvectors and $A$ is diagonalizable by $\mathbf{Thm\ 2}$. Conversely, if $A$ is diagonalizable, it has $n$ linearly independent eigenvectors; since at most $d_i$ of them can lie in $E_i$, we get $\sum_{i=1}^k d_i \geq n$, and $\mathbf{Thm\ 4}$ gives $\sum_{i=1}^k d_i \leq n$, so equality holds.
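Following $\mathbf{Thm\ 5}$(a), one can test diagonalizability numerically by summing geometric multiplicities. This is only a hedged sketch: the helper name `is_diagonalizable` and the tolerance are ad hoc choices, and floating-point grouping of eigenvalues is inherently approximate.

```python
import numpy as np

def is_diagonalizable(A, tol=1e-8):
    """Check whether the geometric multiplicities of A's eigenvalues sum to n (Thm 5a)."""
    n = A.shape[0]
    # group numerically close eigenvalues into distinct values
    distinct = []
    for lam in np.linalg.eigvals(A):
        if not any(abs(lam - mu) < tol for mu in distinct):
            distinct.append(lam)
    geo_sum = sum(n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
                  for lam in distinct)
    return geo_sum == n

print(is_diagonalizable(np.array([[2.0, 1.0], [0.0, 2.0]])))   # False: defective
print(is_diagonalizable(np.eye(3)))                            # True: already diagonal
```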



Orthogonal Diagonalizability

So far, we have seen that linear independence of eigenvectors ensures the diagonalizability of $A$, i.e., $A = PDP^{-1}$ for an invertible matrix $P$ whose columns are eigenvectors. Although we know the form of $P$, it is sometimes annoying to find the inverse of a matrix. Fortunately, there is a special case of diagonalization, $A = P D P^\top$ with $P$ orthogonal, that saves us all this trouble; such an $A$ is called orthogonally diagonalizable. We will follow a similar direction as before and determine under what conditions $A$ can be decomposed in this way.

$\mathbf{Definition.}$ Orthogonally similar
Two matrices $A$ and $B$ are orthogonally similar if there exists an orthogonal matrix $P$ such that $A = P^{\top} B P$.

As with $\mathbf{Thm\ 2}$, the orthogonal diagonalizability of $A$ can be characterized by its eigenvectors. We omit the proof, as it mirrors that of $\mathbf{Thm\ 2}$:

$\mathbf{Thm\ 6.}$ Orthogonal Diagonalizability
An $n \times n$ matrix $A$ is orthogonally diagonalizable if and only if there exists an orthonormal set of $n$ eigenvectors of $A$.

$\mathbf{Thm\ 7.}$ Diagonalization of real symmetric matrices
An $n \times n$ matrix $A$ is orthogonally diagonalizable if and only if $A$ is real symmetric.

$\mathbf{Proof.}$

One direction is immediate: if $A = P D P^\top$ with $P$ orthogonal and $D$ diagonal, then $A^\top = P D^\top P^\top = P D P^\top = A$, so $A$ is symmetric. For the converse, let $\lambda_0$ be an eigenvalue of $A$ and let $k$ be the geometric multiplicity of $\lambda_0$. Let \(B_0 = \{ \mathbf{u}_1, \cdots, \mathbf{u}_k \}\) be an orthonormal basis for the eigenspace of $\lambda_0$.

Then, we can extend it to an orthonormal basis \(B = \{ \mathbf{u}_1, \cdots, \mathbf{u}_n \}\) for $\mathbb{R}^n$ by the Gram–Schmidt process. Let $P$ be the matrix whose columns are the \(\mathbf{u}_i\). Then

\[\begin{align*} AP = P \left[\begin{array}{cc} \lambda_0 I_k & X \\ 0 & Y \end{array}\right] = PC. \end{align*}\]

Hence $C = P^{-1} A P = P^\top A P$, so $C^\top = P^\top A^\top P = P^\top A P = C$, which implies $C$ is also symmetric and therefore $X = 0$. Also, since $A$ and $C$ are similar, their characteristic polynomials are the same: $\text{det}(\lambda I - C) = (\lambda - \lambda_0)^k \, \text{det}(\lambda I_{n-k} - Y) = (\lambda - \lambda_0)^k p_Y (\lambda)$.

Finally, we claim that $p_Y (\lambda_0) \neq 0$, which shows that the algebraic multiplicity of $\lambda_0$ is exactly $k$.

Suppose not. Then $\lambda_0$ is an eigenvalue of $Y$, so there exists a nonzero $\mathbf{y} \in \mathbb{R}^{n-k}$ such that $Y \mathbf{y} = \lambda_0 \mathbf{y}$. Let \(\mathbf{z} = \left[\begin{array}{c} \mathbf{0} \\ \mathbf{y} \end{array}\right]\) be the corresponding vector in $\mathbb{R}^n$.

\[\begin{aligned} C \mathbf{z}=\left[\begin{array}{cc} \lambda_0 I_{k} & 0 \\ 0 & Y \end{array}\right]\left[\begin{array}{c} \mathbf{0} \\ \mathbf{y} \end{array}\right]=\left[\begin{array}{c} \mathbf{0} \\ \lambda_0 \mathbf{y} \end{array}\right]=\lambda_0 \mathbf{z} \end{aligned}\]

So $\mathbf{z}$ is an eigenvector of $C$ for $\lambda_0$, and hence $P \mathbf{z}$ is an eigenvector of $A$ for $\lambda_0$. Also, since the upper-left block of $C$ is $\lambda_0 I_k$, the standard basis vectors \(\mathbf{e}_1, \cdots, \mathbf{e}_k\) are eigenvectors of $C$ for $\lambda_0$. Since $\mathbf{y} \neq \mathbf{0}$, the set \(\{ \mathbf{e}_1, \cdots, \mathbf{e}_k, \mathbf{z} \}\) is linearly independent.

That means \(\{ P\mathbf{e}_1, \cdots, P\mathbf{e}_k, P\mathbf{z} \}\) is a linearly independent set of eigenvectors of $A$ for $\lambda_0$, which contradicts the assumption that the geometric multiplicity of $\lambda_0$ is $k$.

Thus, for every eigenvalue of $A$, the algebraic multiplicity is exactly the same as the geometric multiplicity, so $A$ is diagonalizable by $\mathbf{Thm\ 5}$. Since eigenvectors of a symmetric matrix corresponding to distinct eigenvalues are orthogonal, choosing an orthonormal basis for each eigenspace yields an orthonormal set of $n$ eigenvectors, and $A$ is orthogonally diagonalizable by $\mathbf{Thm\ 6}$.
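Numerically, this is why `np.linalg.eigh` is the tool of choice for real symmetric matrices: it returns an orthonormal eigenvector matrix, so no inverse is needed. A small sketch with an arbitrary random symmetric matrix of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                        # an arbitrary real symmetric matrix

eigvals, P = np.linalg.eigh(A)           # columns of P are orthonormal eigenvectors
D = np.diag(eigvals)

print(np.allclose(P.T @ P, np.eye(4)))   # P is orthogonal: P^T = P^{-1}
print(np.allclose(A, P @ D @ P.T))       # A = P D P^T
```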



Summary

Let $A$ be an $n \times n$ real symmetric matrix. Then, as we saw, there exists an orthogonal matrix $P$ such that $P^\top A P = D$, where $D$ is a diagonal matrix.

Let

\[\begin{aligned} P = [ \mathbf{u}_1 \; | \; \cdots \; | \; \mathbf{u}_n ] \end{aligned}\]

and $D = \text{diag}(\lambda_1, \cdots, \lambda_n)$. Then, we can represent $A$ as a linear combination of rank-1 matrices:

\[\begin{aligned} A = PDP^\top = \lambda_1 \mathbf{u}_1 \mathbf{u}_1^\top + \cdots + \lambda_n \mathbf{u}_n \mathbf{u}_n^\top \end{aligned}\]

and it is called the spectral decomposition of $A$, or the eigenvalue decomposition of $A$.
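As a final check, the spectral decomposition can be reproduced directly in code; a minimal sketch (random symmetric matrix, NumPy assumed) that rebuilds $A$ from the rank-1 terms $\lambda_i \mathbf{u}_i \mathbf{u}_i^\top$:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3))
A = (M + M.T) / 2                        # arbitrary real symmetric matrix

eigvals, U = np.linalg.eigh(A)           # columns of U are the orthonormal u_i

# Sum of rank-1 matrices lambda_i * u_i u_i^T recovers A.
A_rebuilt = sum(lam * np.outer(u, u) for lam, u in zip(eigvals, U.T))
print(np.allclose(A, A_rebuilt))         # True
```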



