[Linear Algebra] Eigenvalue Decomposition
Eigenvalue Decomposition
Motivation: Similarity
Consider a vector space $V$ over a field $F$ with two bases $B_1$ and $B_2$, and consider a linear map $T: V \to V$. Let $A$ be the matrix of $T$ with respect to $B_1$.
Now let us find $B$, the matrix of $T$ with respect to the other basis $B_2$. By the change-of-basis formula for linear transformations, for some transition matrix $P$,
$$B = PAP^{-1} \quad\text{and}\quad A = P^{-1}BP.$$
In other words, since $A$ and $B$ represent the same linear transformation in different bases, they can be regarded as the same map on vectors of $V$, just expressed in different coordinates. For this reason, we call the two matrices similar.
Definition. Similar
Two matrices $A$ and $B$ are similar if there exists an invertible matrix $P$ such that $A = P^{-1}BP$.
As the name implies, similar matrices share characteristic information such as eigenvalues. Therefore, if a matrix can be represented by a simpler matrix through a change of basis, the underlying linear transformation becomes much easier to understand; the simplest such representative is a diagonal matrix.
Thm 1. Similarity Invariants
Similar matrices have the same
(a) determinant
(b) rank
(c) nullity
(d) trace
(e) characteristic polynomial and eigenvalues, with the same algebraic and geometric multiplicities
Proof.
(a) $\det(B) = \det(P^{-1})\det(A)\det(P) = \det(A)$, since $\det(P^{-1})\det(P) = 1$.
(b), (c) By the rank–nullity theorem, it suffices to show (c) only.
Since $A = P^{-1}BP$, we have $PA = BP$. For invertible $P$, it is clear that $\operatorname{null}(PA) = \operatorname{null}(A)$, so $\operatorname{nullity}(A) = \operatorname{nullity}(PA) = \operatorname{nullity}(BP)$. Again, since right multiplication by the invertible $P$ maps $\operatorname{null}(B)$ bijectively onto $\operatorname{null}(BP)$, we get $\operatorname{nullity}(BP) = \operatorname{nullity}(B)$, done.
(d) Since $\operatorname{tr}(MN) = \operatorname{tr}(NM)$ for square matrices $M$ and $N$, $\operatorname{tr}(B) = \operatorname{tr}(PAP^{-1}) = \operatorname{tr}(AP^{-1}P) = \operatorname{tr}(A)$.
(e) Writing $C = P^{-1}AP$ for a matrix similar to $A$, we should show $\det(\lambda I - C) = \det(\lambda I - A)$. Since
$$\lambda I - C = \lambda I - P^{-1}AP = \lambda P^{-1}P - P^{-1}AP = P^{-1}(\lambda I - A)P,$$
$\lambda I - C$ and $\lambda I - A$ are similar! So they have the same determinant, giving the same characteristic polynomial and algebraic multiplicities, and by (c) they also have the same nullity, giving the same geometric multiplicities.
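As a quick numerical sanity check (not part of the original text, just an illustrative NumPy sketch), we can build a random similar pair $B = PAP^{-1}$ and compare the invariants of Thm 1:

```python
# Illustrative sketch: similar matrices share determinant, rank, trace,
# and eigenvalues (hence the characteristic polynomial).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))          # invertible with probability 1
B = P @ A @ np.linalg.inv(P)             # a matrix similar to A

print(np.isclose(np.linalg.det(A), np.linalg.det(B)))        # (a) determinant
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))  # (b) rank
print(np.isclose(np.trace(A), np.trace(B)))                  # (d) trace
# (e) eigenvalues: compare the sorted spectra
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(B))))
```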
Diagonalization
Definition. Diagonalizable
If a square matrix $A$ is similar to some diagonal matrix, we say $A$ is diagonalizable.
Now we prove the main theorem:
Thm 2. Diagonalizability
An n×n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors.
Proof.
Suppose $A$ has $n$ linearly independent eigenvectors $p_1, \cdots, p_n$. Then construct
$$P = [\,p_1 \mid \cdots \mid p_n\,],$$
which is invertible as it has full column rank. Let $\lambda_1, \cdots, \lambda_n$ be the corresponding eigenvalues and $D = \operatorname{diag}(\lambda_1, \cdots, \lambda_n)$. It is obvious that $Ap_i = \lambda_i p_i$ gives $AP = PD$, i.e., $P^{-1}AP = D$.
Conversely, suppose $P^{-1}AP = D = \operatorname{diag}(\lambda_1, \cdots, \lambda_n)$. Then
$$AP = A[\,p_1 \mid \cdots \mid p_n\,] = [\,Ap_1 \mid \cdots \mid Ap_n\,], \qquad PD = [\,\lambda_1 p_1 \mid \cdots \mid \lambda_n p_n\,].$$
Thus $Ap_i = \lambda_i p_i$ for all $i = 1, \cdots, n$, which means the $p_i$ are eigenvectors of $A$ corresponding to $\lambda_i$. Since $P$ is invertible, the column vectors of $P$ are linearly independent. Done.
Hence, if we find $n$ linearly independent eigenvectors of an $n \times n$ matrix $A$, the similarity of $A$ and $D = \operatorname{diag}(\lambda_1, \cdots, \lambda_n)$ is ensured, i.e., $A = PDP^{-1}$ where the columns of $P$ are those eigenvectors.
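For instance, a short NumPy sketch (illustrative only, using `np.linalg.eig`) exhibits this factorization on a small matrix whose eigenvectors are independent:

```python
# Illustrative sketch of Thm 2: the columns of P returned by np.linalg.eig
# are eigenvectors; when they are linearly independent, A = P D P^{-1}.
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])                 # eigenvalues 5 and 2
eigvals, P = np.linalg.eig(A)            # columns of P are eigenvectors
D = np.diag(eigvals)

print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # A = P D P^{-1}
print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # equivalently P^{-1} A P = D
```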
Let’s look for more conditions besides this. Instead of the eigenvectors, the eigenvalues can also confirm the possibility of diagonalization of $A$.
Thm 3. An n×n matrix A is diagonalizable if it has n distinct eigenvalues. (Converse may be false.)
Proof.
It suffices to show: if $v_1, \cdots, v_n$ are eigenvectors of $A$ that correspond to distinct eigenvalues $\lambda_1, \cdots, \lambda_n$, then the set $\{v_1, \cdots, v_n\}$ is linearly independent.
Suppose not. Let $j$ be the maximal index such that $v_1, \cdots, v_j$ are linearly independent. Then there exist $c_i\ (1 \le i \le j)$ such that
$$v_{j+1} = c_1 v_1 + \cdots + c_j v_j.$$
Multiplying both sides by $A$ gives
$$\lambda_{j+1} v_{j+1} = c_1 \lambda_1 v_1 + \cdots + c_j \lambda_j v_j,$$
while multiplying both sides by $\lambda_{j+1}$ gives
$$\lambda_{j+1} v_{j+1} = c_1 \lambda_{j+1} v_1 + \cdots + c_j \lambda_{j+1} v_j.$$
Subtracting the two equations, we get
$$\sum_{i=1}^{j} (\lambda_i - \lambda_{j+1}) c_i v_i = 0.$$
Since $v_1, \cdots, v_j$ are linearly independent and the $\lambda_i - \lambda_{j+1}$ are nonzero, this implies $c_i = 0$ for all $1 \le i \le j$, and hence $v_{j+1} = 0$: a contradiction, since we defined eigenvectors to be nonzero.
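A small NumPy sketch (again only illustrative) shows both the theorem and why the converse may fail, using the identity matrix as the counterexample:

```python
# Illustrative sketch of Thm 3 and the failure of its converse.
import numpy as np

# Distinct eigenvalues -> diagonalizable.
A = np.array([[2., 1.],
              [0., 3.]])                 # eigenvalues 2 and 3 (distinct)
vals, P = np.linalg.eig(A)
print(np.linalg.matrix_rank(P) == 2)     # eigenvectors are independent

# Converse may fail: the identity has a repeated eigenvalue 1
# but is already diagonal, hence diagonalizable.
I = np.eye(2)
print(np.unique(np.linalg.eigvals(I)))   # [1.] -- only one distinct eigenvalue
```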
We can also formulate diagonalizability in terms of the algebraic and geometric multiplicities of the eigenvalues. The following theorem is useful for proving that:
Thm 4. Relationship between algebraic and geometric multiplicity
Let $A$ be a square matrix. The geometric multiplicity of an eigenvalue of $A$ is less than or equal to its algebraic multiplicity.
Proof.
Suppose the geometric multiplicity of $\lambda_0$ is $k$, and let $\{v_1, \cdots, v_k\}$ be a basis for the eigenspace of $\lambda_0$.
Then, we can extend it to a basis $\{v_1, \cdots, v_k, \cdots, v_n\}$ of the whole vector space $V$. Let
$$P = [\,v_1 \mid \cdots \mid v_k \mid v_{k+1} \mid \cdots \mid v_n\,] = [\,B_1 \mid B_2\,].$$
Note that $P$ is invertible as it has full column rank. So
$$AP = [\,\lambda_0 v_1 \mid \cdots \mid \lambda_0 v_k \mid AB_2\,].$$
Now we will construct a matrix $C$, similar to $A$, in which $\lambda_0$ has algebraic multiplicity $k$ or more. Let
$$C = \begin{bmatrix} \lambda_0 I_k & X \\ 0 & Y \end{bmatrix}.$$
Then
$$PC = [\,\lambda_0 v_1 \mid \cdots \mid \lambda_0 v_k \mid PZ\,], \quad \text{where } Z = \begin{bmatrix} X \\ Y \end{bmatrix}.$$
Since $P$ is invertible, we can choose $Z = P^{-1}AB_2$ (which determines $X$ and $Y$), so that $AP = PC$, i.e., $C = P^{-1}AP$. Note that $A$ and $C$ then have the same characteristic polynomial, and $\det(\lambda I - C) = (\lambda - \lambda_0)^k \det(\lambda I_{n-k} - Y)$. Thus the algebraic multiplicity of $\lambda_0$ is greater than or equal to $k$.
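To see the inequality concretely, the following NumPy sketch (illustrative) computes both multiplicities for a small Jordan-type block, where the geometric multiplicity is strictly smaller than the algebraic one:

```python
# Illustrative sketch of Thm 4 on a defective matrix: for the block below,
# the eigenvalue 2 has algebraic multiplicity 2 but geometric multiplicity 1.
import numpy as np

A = np.array([[2., 1.],
              [0., 2.]])
lam = 2.0

# algebraic multiplicity: multiplicity of lam as a root of det(x I - A)
alg = np.sum(np.isclose(np.linalg.eigvals(A), lam))
# geometric multiplicity: nullity of (lam I - A)
geo = A.shape[0] - np.linalg.matrix_rank(lam * np.eye(2) - A)

print(alg, geo)                          # 2 1  -> geometric <= algebraic
```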
Finally, $A$ is diagonalizable if and only if the sum of the geometric multiplicities of its eigenvalues is $n$, and by the previous theorem, it turns out that $A$ is diagonalizable if and only if the geometric multiplicity of each eigenvalue of $A$ is the same as its algebraic multiplicity.
Thm 5.
(a) A is diagonalizable if and only if the sum of the geometric multiplicities of its eigenvalues is n.
(b) A is diagonalizable if and only if the geometric multiplicity of each eigenvalue of A is the same as its algebraic multiplicity.
Proof.
Since the sum of the algebraic multiplicities is $n$ and each geometric multiplicity is at most the algebraic one (Thm 4), (a) and (b) are equivalent, so it suffices to show (a) only.
Let $\lambda_1, \cdots, \lambda_k$ be the $k$ distinct eigenvalues of $A$, let $E_1, \cdots, E_k$ be the corresponding eigenspaces, and let $B_1, \cdots, B_k$ be any bases for these eigenspaces. Let $d_i = \dim(E_i)$ be the geometric multiplicity of $\lambda_i$.
Denote $B_i = \{v_{i1}, \cdots, v_{id_i}\}$ and set $B = \{v_{11}, \cdots, v_{1d_1}, \cdots, v_{k1}, \cdots, v_{kd_k}\}$. Since $\sum_{i=1}^{k} d_i = n$, and vectors drawn from bases of distinct eigenspaces are linearly independent (by the same argument as in Thm 3), $B$ is a set of $n$ linearly independent eigenvectors, which implies $A$ is diagonalizable by Thm 2.
Orthogonal Diagonalizability
So far, we have seen that linear independence of the eigenvectors ensures the diagonalizability of $A$, i.e., $A = PDP^{-1}$ for an invertible matrix $P$ whose columns are eigenvectors. Although we know the form of $P$, finding the inverse of a matrix is sometimes annoying. Fortunately, there is a special case of diagonalization, $A = P^{\top}DP$, that saves us all this trouble; such an $A$ is called orthogonally diagonalizable. We will follow a similar direction as before and see under what conditions $A$ can be decomposed this way.
Definition. Orthogonally similar
Two matrices $A$ and $B$ are orthogonally similar if there exists an orthogonal matrix $P$ such that $A = P^{\top}BP$.
Similarly to Thm 2, the orthogonal diagonalizability of $A$ can be ensured by its eigenvectors. I will omit the proof as it is essentially the same as that of Thm 2:
Thm 6. Orthogonal Diagonalizability
An n×n matrix A is orthogonally diagonalizable if and only if there exists an orthonormal set of n eigenvectors of A.
Thm 7. Diagonalization of real symmetric matrices
An n×n matrix A is orthogonally diagonalizable if and only if A is real symmetric.
Proof.
If $A$ is orthogonally diagonalizable, say $A = P^{\top}DP$ with $P$ orthogonal and $D$ diagonal, then $A^{\top} = P^{\top}D^{\top}P = A$, so $A$ is real symmetric. Conversely, suppose $A$ is real symmetric. Let $\lambda_0$ be an eigenvalue of $A$ and $k$ be the geometric multiplicity of $\lambda_0$. Let $B_0 = \{u_1, \cdots, u_k\}$ be an orthonormal basis for the eigenspace of $\lambda_0$.
Then, we can extend it to an orthonormal basis $B = \{u_1, \cdots, u_n\}$ of $\mathbb{R}^n$ by the Gram–Schmidt process. Let $P$ be the matrix whose columns are the $u_i$. Then
$$AP = P\begin{bmatrix} \lambda_0 I_k & X \\ 0 & Y \end{bmatrix} = PC.$$
Hence $C = P^{-1}AP = P^{\top}AP$, and $C^{\top} = P^{\top}A^{\top}P = P^{\top}AP = C$, which implies $C$ is also symmetric and therefore $X = 0$. Also, the characteristic polynomials of $A$ and $C$ are the same, namely $(\lambda - \lambda_0)^k \det(\lambda I_{n-k} - Y) = (\lambda - \lambda_0)^k p_Y(\lambda)$, as they are similar.
Finally, we claim that $p_Y(\lambda_0) \ne 0$, which shows that the algebraic multiplicity of $\lambda_0$ is exactly $k$.
Suppose not. Then there exists a nonzero $y \in \mathbb{R}^{n-k}$ such that $Yy = \lambda_0 y$. Let $z = \begin{bmatrix} 0 \\ y \end{bmatrix}$ be the corresponding vector in $\mathbb{R}^n$.
$$Cz = \begin{bmatrix} \lambda_0 I_k & 0 \\ 0 & Y \end{bmatrix}\begin{bmatrix} 0 \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ \lambda_0 y \end{bmatrix} = \lambda_0 z.$$
So $z$ is an eigenvector of $C$ for $\lambda_0$, and $Pz$ is an eigenvector of $A$. The standard basis vectors $e_1, \cdots, e_k$ are also eigenvectors of $C$ for $\lambda_0$. Thus $\{e_1, \cdots, e_k, z\}$ is linearly independent.
That means $\{Pe_1, \cdots, Pe_k, Pz\} = \{u_1, \cdots, u_k, Pz\}$ is a linearly independent set of eigenvectors of $A$ for $\lambda_0$, which contradicts the assumption that the geometric multiplicity of $\lambda_0$ is $k$.
Thus, for any eigenvalue of $A$, the algebraic multiplicity is exactly the same as the geometric multiplicity, so $A$ has $n$ linearly independent eigenvectors. Moreover, eigenvectors of a real symmetric matrix corresponding to distinct eigenvalues are orthogonal, so combining the orthonormal bases of the eigenspaces gives an orthonormal set of $n$ eigenvectors, and by Thm 6, $A$ is orthogonally diagonalizable.
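Numerically, this orthogonal diagonalization of a real symmetric matrix is what `np.linalg.eigh` computes; the sketch below (illustrative only) checks that the returned eigenvector matrix is orthogonal and diagonalizes $A$:

```python
# Illustrative sketch of Thm 7: for a real symmetric A, np.linalg.eigh returns
# real eigenvalues and an orthogonal matrix P of eigenvectors, so P^T A P = D.
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                        # a real symmetric matrix

eigvals, P = np.linalg.eigh(A)
D = np.diag(eigvals)

print(np.allclose(P.T @ P, np.eye(4)))   # P is orthogonal: P^T P = I
print(np.allclose(P.T @ A @ P, D))       # orthogonal diagonalization
print(np.allclose(A, P @ D @ P.T))
```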
Summary
Let $A$ be an $n \times n$ real symmetric matrix. Then, as we saw, there exists an orthogonal matrix $P$ such that $P^{\top}AP = D$, where $D$ is a diagonal matrix.
Let
$$P = [\,u_1 \mid \cdots \mid u_n\,], \qquad D = \operatorname{diag}(\lambda_1, \cdots, \lambda_n).$$
Then we can represent $A$ as a linear combination of rank-1 matrices:
$$A = PDP^{\top} = \lambda_1 u_1 u_1^{\top} + \cdots + \lambda_n u_n u_n^{\top},$$
and this is called the spectral decomposition of $A$, or the eigenvalue decomposition of $A$.
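A short NumPy sketch (illustrative only) reconstructs $A$ from the rank-1 terms $\lambda_i u_i u_i^{\top}$:

```python
# Illustrative sketch of the spectral decomposition: A = sum_i lambda_i u_i u_i^T,
# rebuilt from the output of np.linalg.eigh.
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3))
A = (M + M.T) / 2                        # real symmetric

eigvals, U = np.linalg.eigh(A)           # columns of U are orthonormal eigenvectors
A_rebuilt = sum(lam * np.outer(u, u) for lam, u in zip(eigvals, U.T))

print(np.allclose(A, A_rebuilt))         # True: sum of rank-1 terms recovers A
```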