4.9. Eigen Values#

In this subsection, we discuss the eigenvalues and eigenvectors of square matrices. Much of the discussion in this section will be equally applicable to real as well as complex square matrices. We will use the complex notation mostly and make specific remarks for real matrices wherever needed. Of specific interest is the eigenvalue decomposition of real symmetric matrices.

4.9.1. Eigen Values#

Definition 4.112 (Eigen value)

A scalar λ is an eigen value of an n×n matrix A=[aij] if there exists a nonzero vector x such that

$$Ax = \lambda x. \tag{4.9}$$

A nonzero vector x which satisfies this equation is called an eigen vector of A for the eigen value λ.

An eigen value is also known as a characteristic value, proper value or a latent value.

We note that (4.9) can be written as

$$Ax = \lambda I_n x \iff (A - \lambda I_n) x = 0.$$

Thus λ is an eigen value of A if and only if the matrix A − λI is singular.

Definition 4.113 (Spectrum of eigen values)

The set comprising the eigen values of a matrix A is known as its spectrum.

Remark 4.13 (Uniqueness of eigen value for an eigen vector)

For each eigen vector x for a matrix A, the corresponding eigen value λ is unique.

Proof. We show this by contradiction.

  1. Assume that for x there are two eigen values λ1 and λ2.

  2. Then

    $$Ax = \lambda_1 x = \lambda_2 x \implies (\lambda_1 - \lambda_2) x = 0.$$
  3. This can happen only when either x=0 or λ1=λ2.

  4. Since x is an eigen vector, it cannot be 0.

  5. Thus λ1=λ2.

Remark 4.14

If x is an eigen vector for A, then the corresponding eigen value is given by

$$\lambda = \frac{x^H A x}{x^H x}.$$

Proof. We proceed as follows:

$$Ax = \lambda x \implies x^H A x = \lambda x^H x \implies \lambda = \frac{x^H A x}{x^H x},$$

since x is nonzero.

Remark 4.15

An eigen vector x of A for the eigen value λ belongs to the nullspace of A − λI; i.e.,

$$x \in N(A - \lambda I).$$

In other words x is a nontrivial solution to the homogeneous system of linear equations given by

$$(A - \lambda I) z = 0.$$

We can collect the eigen vectors of a matrix into another matrix. This can be a very useful tool. We start with a simple idea.

Lemma 4.50 (Matrix of eigen vectors)

Let A be an n×n matrix. Let u1, u2, …, ur be r nonzero vectors from F^n. Let us construct an n×r matrix

$$U = \begin{bmatrix} u_1 & u_2 & \dots & u_r \end{bmatrix}.$$

Then all the r vectors are eigen vectors of A if and only if there exists a diagonal matrix D = diag(d1, …, dr) such that

AU=UD.

Proof. Expanding the equation, we can write

$$\begin{bmatrix} A u_1 & A u_2 & \dots & A u_r \end{bmatrix} = \begin{bmatrix} d_1 u_1 & d_2 u_2 & \dots & d_r u_r \end{bmatrix}.$$

Clearly we want

$$A u_i = d_i u_i$$

where ui are nonzero. This is possible only when di is an eigen value of A and ui is an eigen vector for di.

Converse: Assume that ui are eigen vectors. Choose di to be corresponding eigen values. Then the equation holds.

This idea will be extended further in building the eigen value decomposition of matrices.

Observation 4.5

Let A ∈ R^{n×n}. Let λ be a nonzero real eigen value of A and let x ∈ R^n be an eigen vector of A for λ. Then Ax and x are collinear.

In other words, the angle between Ax and x is either 0° when λ is positive or 180° when λ is negative. Let us look at the inner product:

$$\langle Ax, x \rangle = x^T A x = x^T \lambda x = \lambda \| x \|_2^2.$$

Meanwhile

$$\| Ax \|_2 = \| \lambda x \|_2 = |\lambda| \, \| x \|_2.$$

Thus

$$|\langle Ax, x \rangle| = \| Ax \|_2 \, \| x \|_2.$$

The angle θ between Ax and x is given by

$$\cos\theta = \frac{\langle Ax, x \rangle}{\| Ax \|_2 \, \| x \|_2} = \frac{\lambda \| x \|_2^2}{|\lambda| \, \| x \|_2^2} = \pm 1.$$
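The collinearity is easy to check numerically. Below is a small NumPy sketch (the matrix is an arbitrary illustrative choice, not from the text) that computes an eigen pair and verifies that the cosine of the angle between Ax and x is ±1.

```python
import numpy as np

# A real matrix with a real eigen value (an illustrative choice).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

w, V = np.linalg.eig(A)      # eigen values w, eigen vectors in the columns of V
lam, x = w[0], V[:, 0]

Ax = A @ x
# cos(theta) between Ax and x; it should be +1 (lam > 0) or -1 (lam < 0).
cos_theta = (Ax @ x) / (np.linalg.norm(Ax) * np.linalg.norm(x))
print(lam, cos_theta)        # here: 3.0 and 1.0
```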

4.9.2. Eigen Space#

Definition 4.114 (Eigen space)

Let λ be an eigen value for a square matrix A. Then its eigen space is the nullspace of A − λI; i.e., N(A − λI).

Remark 4.16 (Eigen vectors for an eigen value)

The set comprising all the eigen vectors of A for an eigen value λ is given by

$$N(A - \lambda I) \setminus \{ 0 \}$$

since 0 cannot be an eigen vector.

Definition 4.115 (Geometric multiplicity)

Let λ be an eigen value for a square matrix A. The dimension of its eigen space N(A − λI) is known as the geometric multiplicity of the eigen value λ.

We can see that

$$\dim(N(A - \lambda I)) = n - \operatorname{rank}(A - \lambda I). \tag{4.10}$$

4.9.3. Characteristic Polynomial#

Remark 4.17 (Singularity of A − λI)

A scalar λ can be an eigen value of a square matrix A if and only if

$$\det(A - \lambda I) = 0.$$

det(A − λI) is a polynomial in λ of degree n.

Observation 4.6 (Eigenvalues as zeros of a polynomial)

We can write the determinant as a polynomial:

$$\det(A - \lambda I) = p(\lambda) = \alpha_n \lambda^n + \alpha_{n-1} \lambda^{n-1} + \dots + \alpha_1 \lambda + \alpha_0$$

where αi depend on entries in A. In this sense, an eigenvalue of A is a root of the equation

p(λ)=0.

It is easy to show that αn = (−1)^n.

Definition 4.116 (Characteristic polynomial)

For any square matrix A, the polynomial given by

$$p(\lambda) = \det(A - \lambda I)$$

is known as its characteristic polynomial. The equation given by

p(λ)=0

is known as its characteristic equation. The eigen values of A are the roots of its characteristic polynomial or solutions of its characteristic equation.

Observation 4.7 (Factors of characteristic polynomial)

For real square matrices, if we restrict eigen values to real values, then the characteristic polynomial can be factored as

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)^{r_1} \cdots (\lambda - \lambda_k)^{r_k} q(\lambda). \tag{4.11}$$

The polynomial has k distinct real roots. For each root λi, ri is a positive integer indicating how many times the root appears. q(λ) is a polynomial that has no real roots. The following is true:

$$r_1 + \dots + r_k + \deg(q(\lambda)) = n.$$

Clearly k ≤ n.

For complex square matrices where eigen values can be complex (including real square matrices), the characteristic polynomial can be factored as

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)^{r_1} \cdots (\lambda - \lambda_k)^{r_k}. \tag{4.12}$$

The polynomial can be completely factorized into first degree polynomials. There are k distinct roots or eigen values. The following is true:

$$r_1 + \dots + r_k = n.$$

Thus, including the duplicates, there are exactly n eigen values for a complex square matrix.

It is quite possible that a real square matrix doesn’t have any real eigen values.
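For instance, a 90° rotation of the plane has characteristic polynomial λ² + 1, which has no real roots; a quick NumPy check (an illustrative sketch) confirms that its eigen values are purely imaginary.

```python
import numpy as np

# Rotation by 90 degrees: p(lambda) = lambda^2 + 1 has no real roots.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.linalg.eigvals(R))   # [0.+1.j  0.-1.j]
```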

Definition 4.117 (Algebraic multiplicity)

The number of times an eigen value appears in the factorization of the characteristic polynomial of a square matrix A is known as its algebraic multiplicity. In other words, ri is the algebraic multiplicity for λi in the factorization (4.11) and (4.12).

Theorem 4.119 (Geometric and algebraic multiplicities)

Let λ be an eigen value of a square matrix A. Then the geometric multiplicity of λ is less than or equal to its algebraic multiplicity.

Corollary 4.16 (Distinct eigen values and multiplicity)

If an n×n matrix A has n distinct eigen values, then each of them has a geometric (and algebraic) multiplicity of 1.

Proof. The algebraic multiplicity of an eigen value is greater than or equal to 1, and the sum of the algebraic multiplicities cannot exceed n. Since there are n distinct eigen values, each of them has an algebraic multiplicity of 1. Further, the geometric multiplicity of an eigen value is greater than or equal to 1 and less than or equal to its algebraic multiplicity.

Corollary 4.17

Let an n×n matrix A have k distinct eigen values λ1, λ2, …, λk with algebraic multiplicities r1, r2, …, rk and geometric multiplicities g1, g2, …, gk respectively. Then

$$\sum_{i=1}^k g_i \leq \sum_{i=1}^k r_i \leq n.$$

Moreover if

$$\sum_{i=1}^k g_i = \sum_{i=1}^k r_i$$

then

$$g_i = r_i \quad \forall \; i = 1, \dots, k.$$

Remark 4.18 (Spectrum from roots of characteristic polynomial)

In the factorization (4.12) of the characteristic polynomial p(λ), the set {λ1, …, λk} forms the spectrum of A.

Let us consider the sum of the ri, which gives the total count of the roots of p(λ):

$$m = \sum_{i=1}^k r_i.$$

With this, there are m not-necessarily-distinct roots of p(λ). Let us write p(λ) from (4.11) as

$$p(\lambda) = (-1)^n (\lambda - c_1)(\lambda - c_2) \cdots (\lambda - c_m) q(\lambda)$$

where c1, c2, …, cm are m scalars (not necessarily distinct) of which r1 scalars equal λ1, r2 equal λ2 and so on. For the complex case, we have q(λ) = 1.

We will refer to the set (allowing repetitions) {c1, c2, …, cm} as the eigen values of the matrix A where the ci are not necessarily distinct. In contrast, the spectrum of A refers to the set of distinct eigen values of A. The symbol c has been chosen based on the other name for eigen values (the characteristic values).

4.9.4. The Zero Eigen Value#

Lemma 4.51 (0 as an eigen value)

0 is an eigen value of a square matrix A if and only if A is singular.

Proof. Let 0 be an eigen value of A.

  1. Then there exists u0 such that

    $$Au = 0 \cdot u = 0.$$
  2. Thus u is a non-trivial solution of the homogeneous linear system Ax=0.

  3. Thus A is singular.

Converse:

  1. Assume that A is singular.

  2. Then there exists u0 s.t.

    $$Au = 0 = 0 \cdot u.$$
  3. Thus 0 is an eigen value of A.

Lemma 4.52 (Eigenspace of the zero eigen value)

If a square matrix A is singular, then N(A) is the eigen space for the eigen value λ=0.

Proof. This is straightforward from the definition of an eigen space (see Definition 4.114).

Remark 4.19

Clearly, the geometric multiplicity of λ=0 equals nullity(A) = n − rank(A).

4.9.5. Transpose#

Lemma 4.53 (Eigen values of the transpose)

Let A be a square matrix. Then A and A^T have the same eigen values.

Proof. The eigen values of A^T are given by

$$\det(A^T - \lambda I) = 0.$$

But

$$A^T - \lambda I = A^T - (\lambda I)^T = (A - \lambda I)^T.$$

Hence (using Lemma 4.33)

$$\det(A^T - \lambda I) = \det((A - \lambda I)^T) = \det(A - \lambda I).$$

Thus the characteristic polynomials of A and A^T are the same. Hence the eigen values are the same. In other words, the spectra of A and A^T are the same.
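A small NumPy sketch (random matrix, illustrative only) confirming that A and A^T share the same spectrum:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Sort both spectra and compare.
ev_A  = np.sort_complex(np.linalg.eigvals(A))
ev_AT = np.sort_complex(np.linalg.eigvals(A.T))
print(np.allclose(ev_A, ev_AT))   # True
```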

4.9.6. Powers and Inverses#

Lemma 4.54 (Power rule)

Let A be a square matrix and λ be an eigen value of A. Let p ∈ N. Then λ^p is an eigen value of A^p.

Proof. We prove this by mathematical induction.

  1. For p=1 the statement holds trivially since λ^1 is an eigen value of A^1.

  2. Assume that the statement holds for some value of p.

  3. Thus let λ^p be an eigen value of A^p and let u be a corresponding eigen vector.

  4. Now

    $$A^{p+1} u = A^p (A u) = A^p \lambda u = \lambda A^p u = \lambda \lambda^p u = \lambda^{p+1} u.$$
  5. Thus λ^{p+1} is an eigen value of A^{p+1} with the same eigen vector u.

  6. With the principle of mathematical induction, the proof is complete.

Lemma 4.55 (Eigenvalue of the inverse)

Let a square matrix A be non singular and let λ ≠ 0 be some eigen value of A. Then λ^{-1} is an eigen value of A^{-1}. Moreover, all eigen values of A^{-1} are obtained by taking the inverses of the eigen values of A; i.e., if μ ≠ 0 is an eigen value of A^{-1} then 1/μ is an eigen value of A also. Also, A and A^{-1} share the same set of eigen vectors.

Proof. Since A is invertible, all its eigen values are nonzero.

  1. Let u ≠ 0 be an eigen vector of A for the eigen value λ.

  2. Then

    $$Au = \lambda u \implies u = A^{-1} \lambda u \implies \frac{1}{\lambda} u = A^{-1} u.$$
  3. Thus u is also an eigen vector of A^{-1} for the eigen value 1/λ.

  4. Now let B = A^{-1}.

  5. Then B^{-1} = A.

  6. Thus if μ is an eigen value of B then 1/μ is an eigen value of B^{-1} = A.

  7. Thus if A is invertible then the eigen values of A and A^{-1} are in one-to-one correspondence (being reciprocals).
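Both the power rule and the inverse rule are easy to verify numerically. The following NumPy sketch (the matrix is an arbitrary invertible example) compares the eigen values of A^3 and A^{-1} with λ^3 and 1/λ.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])       # eigen values 5 and 2
w = np.linalg.eigvals(A)

# Power rule: eigen values of A^3 are the cubes of the eigen values of A.
print(np.allclose(np.sort(np.linalg.eigvals(np.linalg.matrix_power(A, 3))),
                  np.sort(w ** 3)))     # True

# Inverse rule: eigen values of A^{-1} are the reciprocals of those of A.
print(np.allclose(np.sort(np.linalg.eigvals(np.linalg.inv(A))),
                  np.sort(1.0 / w)))    # True
```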

4.9.7. Invariant Subspaces#

Definition 4.118 (Invariant subspace)

Let A be a square n×n matrix and let W be a linear subspace of F^n. Then W is invariant relative to A if

$$A w \in W \quad \forall \; w \in W.$$

In other words, A(W) ⊆ W; i.e., for every vector w ∈ W, its image Aw is also in W. Thus the action of A on W doesn't take us outside of W.

We also say that W is A-invariant.

Eigen vectors are generators of invariant subspaces.

Lemma 4.56 (Spans of eigenvectors are invariant)

Let A be an n×n matrix. Let x1, x2, …, xr be r eigen vectors of A. Let us construct an n×r matrix

$$X = \begin{bmatrix} x_1 & x_2 & \dots & x_r \end{bmatrix}.$$

Then the column space of X; i.e., C(X) is invariant relative to A.

Proof. Let us assume that c1, c2, …, cr are the eigen values corresponding to x1, x2, …, xr (not necessarily distinct).

Let any vector x ∈ C(X) be given by

$$x = \sum_{i=1}^r \alpha_i x_i.$$

Then

$$A x = A \sum_{i=1}^r \alpha_i x_i = \sum_{i=1}^r \alpha_i A x_i = \sum_{i=1}^r \alpha_i c_i x_i.$$

Clearly Ax is also a linear combination of xi. Hence it belongs to C(X). Thus C(X) is invariant relative to A or C(X) is A-invariant.

4.9.8. Triangular Matrices#

Lemma 4.57 (Eigenvalues of triangular matrices)

Let A be an n×n upper or lower triangular matrix. Then its eigen values are the entries on its main diagonal.

Proof. We are given that A is triangular.

  1. Then A − λI is also triangular with its diagonal entries being (aii − λ).

  2. Using Lemma 4.38, we have

    $$p(\lambda) = \det(A - \lambda I) = \prod_{i=1}^n (a_{ii} - \lambda).$$
  3. Clearly the roots of the characteristic polynomial are the aii.

Several small results follow from this lemma.

Corollary 4.18 (Eigenvalues of triangular matrices)

Let A=[aij] be an n×n triangular matrix.

  • The characteristic polynomial of A is

    $$p(\lambda) = (-1)^n \prod_{i=1}^n (\lambda - a_{ii}).$$
  • A scalar λ is an eigen value of A iff it is one of the diagonal entries of A.

  • The algebraic multiplicity of an eigen value λ is equal to the number of times it appears on the main diagonal of A.

  • The spectrum of A is given by the distinct entries on the main diagonal of A.
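As a quick numerical illustration (a NumPy sketch with an arbitrary triangular matrix), the eigen values returned by a general-purpose solver coincide with the diagonal entries:

```python
import numpy as np

# An upper triangular matrix: its eigen values are its diagonal entries.
T = np.array([[3.0, 5.0, -2.0],
              [0.0, 3.0,  7.0],
              [0.0, 0.0, -1.0]])

print(np.sort(np.linalg.eigvals(T)))   # approximately [-1.  3.  3.]
print(np.sort(np.diag(T)))             # [-1.  3.  3.]
```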

A diagonal matrix is naturally both an upper triangular matrix as well as a lower triangular matrix. Similar results hold for the eigen values of a diagonal matrix also.

Lemma 4.58 (Eigenvalues of diagonal matrices)

Let A=[aij] be an n×n diagonal matrix.

  • Its eigen values are the entries on its main diagonal.

  • The characteristic polynomial of A is

    $$p(\lambda) = (-1)^n \prod_{i=1}^n (\lambda - a_{ii}).$$
  • A scalar λ is an eigen value of A iff it is one of the diagonal entries of A.

  • The algebraic multiplicity of an eigen value λ is equal to the number of times it appears on the main diagonal of A.

  • The spectrum of A is given by the distinct entries on the main diagonal of A.

There is also a result for the geometric multiplicity of eigen values for a diagonal matrix.

Lemma 4.59 (Geometric multiplicity of eigenvalues for diagonal matrices)

Let A=[aij] be an n×n diagonal matrix. The geometric multiplicity of an eigen value λ is equal to the number of times it appears on the main diagonal of A.

Proof. The unit vectors ei are eigen vectors for A since

$$A e_i = a_{ii} e_i.$$
  1. They are linearly independent.

  2. Thus if a particular eigen value appears r times on the diagonal, then there are r linearly independent eigen vectors for that eigen value.

  3. Thus its geometric multiplicity is equal to the algebraic multiplicity.

4.9.9. Similar Matrices#

Some very useful results are available for similar matrices.

Lemma 4.60 (Characteristic polynomial of similar matrices)

The characteristic polynomials and spectra of similar matrices are the same.

Proof. Let B be similar to A.

  1. Then there exists an invertible matrix C such that

    $$B = C^{-1} A C.$$
  2. Now

    $$B - \lambda I = C^{-1} A C - \lambda I = C^{-1} A C - \lambda C^{-1} C = C^{-1} (A C - \lambda C) = C^{-1} (A - \lambda I) C.$$
  3. Thus B − λI is similar to A − λI.

  4. Hence, due to Lemma 4.40, their determinants are equal.

  5. In other words,

    $$\det(B - \lambda I) = \det(A - \lambda I).$$
  6. This means that the characteristic polynomials of A and B are the same.

  7. Since eigen values are nothing but the roots of the characteristic polynomial, they are the same too.

  8. This means that the spectrum (the set of distinct eigen values) is the same.

This result is very useful: if a matrix A can be shown to be similar to a diagonal or triangular matrix whose eigen values are easy to obtain, then determining the eigen values of A becomes straightforward.

Corollary 4.19 (Eigenvalues of similar matrices)

If A and B are similar to each other then

  • An eigen value has the same algebraic and geometric multiplicity for both A and B.

  • The (not necessarily distinct) eigen values of A and B are the same.

Although the eigen values are the same, the eigen vectors are different.

Lemma 4.61 (Eigenvectors of similar matrices)

Let A and B be similar with

$$B = C^{-1} A C$$

for some invertible matrix C. If u is an eigen vector of A for an eigen value λ, then C^{-1} u is an eigen vector of B for the same eigen value.

Proof. We are given that u is an eigen vector of A for an eigen value λ.

  1. Thus we have

    Au=λu.
  2. Thus

    $$B C^{-1} u = C^{-1} A C C^{-1} u = C^{-1} A u = C^{-1} \lambda u = \lambda C^{-1} u.$$
  3. Now u ≠ 0 and C^{-1} is nonsingular.

  4. Thus C^{-1} u ≠ 0.

  5. Thus C^{-1} u is an eigen vector of B.
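A NumPy sketch (random matrices, for illustration only) of both facts about similar matrices: they share the spectrum, and C^{-1}u is an eigen vector of B whenever u is an eigen vector of A.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))          # almost surely invertible
B = np.linalg.inv(C) @ A @ C             # B is similar to A

# Same spectrum.
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(B))))   # True

# Eigen vector mapping: u -> C^{-1} u.
w, U = np.linalg.eig(A)
lam, u = w[0], U[:, 0]
v = np.linalg.inv(C) @ u
print(np.allclose(B @ v, lam * v))       # True
```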

4.9.10. Linear Independence of Eigenvectors#

Theorem 4.120 (Linear independence of eigenvectors with distinct eigenvalues)

Let A be an n×n square matrix. Let x1, x2, …, xk be any k eigen vectors of A for distinct eigen values λ1, λ2, …, λk respectively. Then x1, x2, …, xk are linearly independent.

Proof. We first prove the simpler case with 2 eigen vectors x1 and x2 and corresponding eigen values λ1 and λ2 respectively.

  1. Let there be a linear relationship between x1 and x2 given by

    $$\alpha_1 x_1 + \alpha_2 x_2 = 0.$$
  2. Multiplying both sides with (A − λ1 I) we get

    $$\alpha_1 (A - \lambda_1 I) x_1 + \alpha_2 (A - \lambda_1 I) x_2 = 0 \implies \alpha_1 (\lambda_1 - \lambda_1) x_1 + \alpha_2 (\lambda_2 - \lambda_1) x_2 = 0 \implies \alpha_2 (\lambda_2 - \lambda_1) x_2 = 0.$$
  3. Since λ1 ≠ λ2 and x2 ≠ 0, we must have α2 = 0.

  4. Similarly, by multiplying with (A − λ2 I) on both sides, we can show that α1 = 0.

  5. Thus x1 and x2 are linearly independent.

  6. Now for the general case, consider a linear relationship between x1, x2, …, xk given by

    $$\alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_k x_k = 0.$$
  7. Multiplying by $\prod_{i=1, i \neq j}^k (A - \lambda_i I)$ and using the fact that λi ≠ λj if i ≠ j, we get αj = 0 for each j.

  8. Thus the only linear relationship is the trivial relationship.

  9. This completes the proof.

For eigen values with geometric multiplicity greater than 1, there are multiple linearly independent eigen vectors corresponding to the eigen value. In this context, the above theorem can be generalized further.

Theorem 4.121 (Linear independence of eigenvectors from different eigen spaces)

Let λ1, λ2, …, λk be k distinct eigen values of A. Let $\{x_1^j, x_2^j, \dots, x_{g_j}^j\}$ be any gj linearly independent eigen vectors from the eigen space of λj, where gj is the geometric multiplicity of λj. Then the combined set of eigen vectors given by

$$\{x_1^1, \dots, x_{g_1}^1, \dots, x_1^k, \dots, x_{g_k}^k\}$$

consisting of $\sum_{j=1}^k g_j$ eigen vectors is linearly independent.

This result puts an upper limit on the number of linearly independent eigen vectors of a square matrix.

Lemma 4.62 (Maximum number of linearly independent eigenvectors)

Let {λ1, …, λk} represent the spectrum of an n×n matrix A. Let g1, …, gk be the geometric multiplicities of λ1, …, λk respectively. Then the maximum number of linearly independent eigen vectors of A is

$$\sum_{i=1}^k g_i.$$

Moreover if

$$\sum_{i=1}^k g_i = n$$

then a set of n linearly independent eigen vectors of A can be found which forms a basis for F^n.

4.9.11. Diagonalization#

Diagonalization is one of the fundamental operations in linear algebra. This section discusses diagonalization of square matrices in depth.

Definition 4.119 (diagonalizable matrix)

An n×n matrix A is said to be diagonalizable if it is similar to a diagonal matrix. In other words, there exists an n×n non-singular matrix P such that D = P^{-1} A P is a diagonal matrix. If this happens then we say that P diagonalizes A or that A is diagonalized by P.

Remark 4.20

$$D = P^{-1} A P \iff P D = A P \iff P D P^{-1} = A.$$

We note that if we restrict to real matrices, then P and D should also be real. If A ∈ C^{n×n} (it may still be real) then P and D can be complex.

In other words, if A ∈ R^{n×n} is to be diagonalized over the reals, then both P and D must be real.

The next theorem is the culmination of a variety of results studied so far.

Theorem 4.122 (Properties of diagonalizable matrices)

Let A be a diagonalizable matrix with D = P^{-1} A P being its diagonalization. Let D = diag(d1, d2, …, dn). Then the following hold.

  • rank(A) = rank(D), which equals the number of nonzero entries on the main diagonal of D.

  • det(A) = d1 d2 ⋯ dn.

  • tr(A) = d1 + d2 + ⋯ + dn.

  • The characteristic polynomial of A is

    $$p(\lambda) = (-1)^n (\lambda - d_1)(\lambda - d_2) \cdots (\lambda - d_n).$$
  • The spectrum of A comprises the distinct scalars on the diagonal entries in D.

  • The (not necessarily distinct) eigenvalues of A are the diagonal elements of D.

  • The columns of P are (linearly independent) eigenvectors of A.

  • The algebraic and geometric multiplicities of an eigenvalue λ of A equal the number of diagonal elements of D that equal λ.

Proof. From Definition 4.119 we note that D and A are similar.

  1. Due to Lemma 4.40

    det(A)=det(D).
  2. Due to Lemma 4.39

    $$\det(D) = \prod_{i=1}^n d_i.$$
  3. Now due to Lemma 4.31

    $$\operatorname{tr}(A) = \operatorname{tr}(D) = \sum_{i=1}^n d_i.$$
  4. Further due to Lemma 4.60 the characteristic polynomial and spectrum of A and D are same.

  5. Due to Lemma 4.58 the eigen values of D are nothing but its diagonal entries.

  6. Hence they are also the eigen values of A.

  7. We recall that

    $$D = P^{-1} A P \iff A P = P D.$$
  8. Now writing

    $$P = \begin{bmatrix} p_1 & p_2 & \dots & p_n \end{bmatrix}$$

    we have

    $$AP = \begin{bmatrix} A p_1 & A p_2 & \dots & A p_n \end{bmatrix} = P D = \begin{bmatrix} d_1 p_1 & d_2 p_2 & \dots & d_n p_n \end{bmatrix}.$$
  9. Matching columns, we see that

    $$A p_i = d_i p_i \quad \forall \; i = 1, \dots, n.$$
  10. Thus pi are eigen vectors of A.

  11. Since P is invertible, hence its columns are linearly independent.

  12. Hence the columns of P form a set of n linearly independent eigen vectors of A.

  13. Since the characteristic polynomials of A and D are the same, the algebraic multiplicities of the eigen values are the same.

  14. From Lemma 4.61 we get that there is a one-to-one correspondence between the eigen vectors of A and D through the change of basis given by P.

  15. Thus the linear independence relationships between the eigen vectors remain the same.

  16. Hence the geometric multiplicities of individual eigenvalues are also the same.

  17. This completes the proof.

So far we have verified various results which are available if a matrix A is diagonalizable. We haven’t yet identified the conditions under which A is diagonalizable. We note that not every matrix is diagonalizable. The following theorem gives necessary and sufficient conditions under which a matrix is diagonalizable.

Theorem 4.123 (Diagonalizability: necessary and sufficient conditions)

An n×n matrix A is diagonalizable by an n×n nonsingular matrix P if and only if the columns of P are (linearly independent) eigenvectors of A.

Proof. We note that since P is nonsingular, its columns have to be linearly independent.

The necessary condition part was proven in Theorem 4.122. We now show that if P consists of n linearly independent eigen vectors of A then A is diagonalizable.

  1. Let the columns of P be p1, p2, …, pn and the corresponding (not necessarily distinct) eigen values be d1, d2, …, dn.

  2. Then

    $$A p_i = d_i p_i.$$
  3. Thus, by letting D = diag(d1, d2, …, dn), we have

    AP=PD.
  4. Since columns of P are linearly independent, hence P is invertible.

  5. This gives us

    $$D = P^{-1} A P.$$
  6. Thus A is similar to a diagonal matrix D.

  7. Hence A is diagonalizable.

  8. This validates the sufficient condition.

A corollary follows.

Corollary 4.20 (Diagonalizability and linear independence of eigen vectors)

An n×n matrix A is diagonalizable if and only if there exists a linearly independent set of n eigen vectors of A.

Now we know that geometric multiplicities of eigen values of A provide us information about the number of linearly independent eigenvectors of A.

Corollary 4.21 (Diagonalizability and geometric multiplicities)

Let A be an n×n matrix. Let λ1, λ2, …, λk be its k distinct eigen values (comprising its spectrum). Let gj be the geometric multiplicity of λj. Then A is diagonalizable if and only if

$$\sum_{j=1}^k g_j = n.$$
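The whole development can be summarized in a short NumPy sketch (an arbitrary matrix with distinct eigen values, hence diagonalizable): the columns of P returned by the eigen solver diagonalize A.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])        # distinct eigen values 5 and 2
w, P = np.linalg.eig(A)           # columns of P are eigen vectors of A
D = np.diag(w)

print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # True: D = P^{-1} A P
print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True: A = P D P^{-1}
```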

4.9.12. Real Symmetric Matrices#

This subsection is focused on real symmetric matrices.

Following is a fundamental property of real symmetric matrices.

Theorem 4.124 (Existence of eigenvalues)

Every real symmetric matrix has an eigen value.

The proof of this result is beyond the scope of this book.

Lemma 4.63 (Orthogonality of eigenvectors for distinct eigenvalues)

Let A be an n×n real symmetric matrix. Let λ1 and λ2 be any two distinct eigen values of A and let x1 and x2 be any two corresponding eigen vectors. Then x1 and x2 are orthogonal.

Proof. By definition we have Ax1=λ1x1 and Ax2=λ2x2. Thus

$$\begin{aligned}
& x_2^T A x_1 = \lambda_1 x_2^T x_1\\
\implies & x_1^T A^T x_2 = \lambda_1 x_1^T x_2\\
\implies & x_1^T A x_2 = \lambda_1 x_1^T x_2\\
\implies & x_1^T \lambda_2 x_2 = \lambda_1 x_1^T x_2\\
\implies & (\lambda_1 - \lambda_2) x_1^T x_2 = 0\\
\implies & x_1^T x_2 = 0.
\end{aligned}$$

Thus x1 and x2 are orthogonal. In between, we took the transpose of both sides and used the facts that A = A^T and λ1 − λ2 ≠ 0.

Definition 4.120 (Orthogonally diagonalizable matrix)

A real n×n matrix A is said to be orthogonally diagonalizable if there exists an orthogonal matrix U which can diagonalize A; i.e.,

$$D = U^T A U$$

is a real diagonal matrix.

Lemma 4.64

Every orthogonally diagonalizable matrix A is symmetric.

Proof. We have a diagonal matrix D such that

$$A = U D U^T.$$

Taking the transpose on both sides we get

$$A^T = U D^T U^T = U D U^T = A.$$

Thus A is symmetric.

Theorem 4.125

Every real symmetric matrix A is orthogonally diagonalizable.

We skip the proof of this theorem.
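In NumPy this is what numpy.linalg.eigh computes for a symmetric (or Hermitian) matrix; the sketch below (with an illustrative matrix) checks that the returned U is orthogonal and that U^T A U is diagonal.

```python
import numpy as np

# A real symmetric matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

w, U = np.linalg.eigh(A)   # eigh returns orthonormal eigen vectors in the columns of U

print(np.allclose(U.T @ U, np.eye(3)))        # True: U is orthogonal
print(np.allclose(U.T @ A @ U, np.diag(w)))   # True: U^T A U is diagonal
```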

4.9.13. Hermitian Matrices#

Following is a fundamental property of Hermitian matrices.

Theorem 4.126 (Existence of eigen values for Hermitian matrices)

Every Hermitian matrix has an eigen value.

The proof of this result is beyond the scope of this book.

Lemma 4.65 (Real eigen values for Hermitian matrices)

The eigenvalues of a Hermitian matrix are real.

Proof. Let A be a Hermitian matrix and let λ be an eigen value of A. Let u be a corresponding eigen vector. Then

$$\begin{aligned}
& A u = \lambda u\\
\implies & u^H A^H = \bar{\lambda} u^H\\
\implies & u^H A^H u = \bar{\lambda} u^H u\\
\implies & u^H A u = \bar{\lambda} u^H u\\
\implies & u^H \lambda u = \bar{\lambda} u^H u\\
\implies & \| u \|_2^2 (\lambda - \bar{\lambda}) = 0\\
\implies & \lambda = \bar{\lambda}.
\end{aligned}$$

Thus λ is real. We used the facts that A = A^H and u ≠ 0 ⟹ ‖u‖_2 ≠ 0.

Lemma 4.66 (Orthogonality of eigenvectors for distinct eigenvalues)

Let A be an n×n complex Hermitian matrix. Let λ1 and λ2 be any two distinct eigen values of A and let x1 and x2 be any two corresponding eigen vectors. Then x1 and x2 are orthogonal.

Proof. By definition we have Ax1=λ1x1 and Ax2=λ2x2. Thus

$$\begin{aligned}
& x_2^H A x_1 = \lambda_1 x_2^H x_1\\
\implies & x_1^H A^H x_2 = \lambda_1 x_1^H x_2\\
\implies & x_1^H A x_2 = \lambda_1 x_1^H x_2\\
\implies & x_1^H \lambda_2 x_2 = \lambda_1 x_1^H x_2\\
\implies & (\lambda_1 - \lambda_2) x_1^H x_2 = 0\\
\implies & x_1^H x_2 = 0.
\end{aligned}$$

Thus x1 and x2 are orthogonal. In between, we took the conjugate transpose of both sides and used the facts that A = A^H, λ1 is real (Lemma 4.65) and λ1 − λ2 ≠ 0.

Definition 4.121 (Unitary diagonalizable matrix)

A complex n×n matrix A is said to be unitary diagonalizable if there exists a unitary matrix U which can diagonalize A; i.e.,

$$D = U^H A U$$

is a complex diagonal matrix.

Lemma 4.67

Let A be a unitary diagonalizable matrix whose diagonalization D is real. Then A is Hermitian.

Proof. We have a real diagonal matrix D such that

$$A = U D U^H.$$

Taking the conjugate transpose on both sides we get

$$A^H = U D^H U^H = U D U^H = A.$$

Thus A is Hermitian. We used the fact that D^H = D since D is real.

Theorem 4.127

Every Hermitian matrix A is unitary diagonalizable.

We skip the proof of this theorem. The theorem means that if A is Hermitian then A = U Λ U^H where Λ is a real diagonal matrix.

Definition 4.122 (Eigen value decomposition of a Hermitian matrix)

Let A be an n×n Hermitian matrix. Let λ1, …, λn be its eigen values such that |λ1| ≥ |λ2| ≥ ⋯ ≥ |λn|. Let

$$\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n).$$

Let U be a unitary matrix consisting of orthonormal eigen vectors corresponding to λ1, …, λn. Then the eigen value decomposition of A is defined as

$$A = U \Lambda U^H.$$

If λi are distinct, then the decomposition is unique.

Remark 4.21

Let Λ be a diagonal matrix as in Definition 4.122. Consider some vector x ∈ C^n. Then

$$x^H \Lambda x = \sum_{i=1}^n \lambda_i |x_i|^2.$$

Now if λi ≥ 0 then

$$x^H \Lambda x \leq \lambda_1 \sum_{i=1}^n |x_i|^2 = \lambda_1 \| x \|_2^2.$$

Also

$$x^H \Lambda x \geq \lambda_n \sum_{i=1}^n |x_i|^2 = \lambda_n \| x \|_2^2.$$

Lemma 4.68

Let A be a Hermitian matrix with non-negative eigen values. Let λ1 be its largest and λn be its smallest eigen value. Then

$$\lambda_n \| x \|_2^2 \leq x^H A x \leq \lambda_1 \| x \|_2^2 \quad \forall \; x \in \mathbb{C}^n.$$

Proof. A has an eigen value decomposition given by

$$A = U \Lambda U^H.$$
  1. Let x ∈ C^n and let v = U^H x.

  2. Clearly ‖x‖_2 = ‖v‖_2 since U is unitary.

  3. Then

    $$x^H A x = x^H U \Lambda U^H x = v^H \Lambda v.$$
  4. From the previous remark we have

    $$\lambda_n \| v \|_2^2 \leq v^H \Lambda v \leq \lambda_1 \| v \|_2^2.$$
  5. Thus we get

    $$\lambda_n \| x \|_2^2 \leq x^H A x \leq \lambda_1 \| x \|_2^2.$$
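A quick NumPy check of these bounds (a real symmetric p.s.d. matrix built as B^T B is used for illustration; the lemma itself is stated for Hermitian matrices with non-negative eigen values):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = B.T @ B                      # symmetric with non-negative eigen values

w = np.linalg.eigvalsh(A)        # eigen values in ascending order
lam_min, lam_max = w[0], w[-1]

x = rng.standard_normal(4)
quad = x @ A @ x                 # the quadratic form x^T A x
norm_sq = x @ x                  # ||x||_2^2

print(lam_min * norm_sq <= quad <= lam_max * norm_sq)   # True
```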

4.9.14. Miscellaneous Properties#

This subsection lists some miscellaneous properties of eigen values of a square matrix.

Lemma 4.69

λ is an eigen value of A if and only if λ+k is an eigen value of A+kI. Moreover A and A+kI share the same eigen vectors.

Proof. We can see that

$$Ax = \lambda x \iff Ax + kx = \lambda x + kx \iff (A + kI) x = (\lambda + k) x.$$

Thus λ is an eigen value of A with an eigen vector x if and only if λ + k is an eigen value of A + kI with the same eigen vector x.

4.9.15. Semidefinite, Definite and Indefinite Matrices#

4.9.15.1. Positive Semidefinite Matrices#

Definition 4.123 (Positive semidefinite matrix)

A symmetric matrix X ∈ S^n is called positive semi-definite if v^T X v ≥ 0 for all v ∈ R^n.

The notation X ⪰ O means that v^T X v ≥ 0 for every v ∈ R^n; i.e., the matrix X is positive semi-definite.

We define the set of symmetric positive semidefinite matrices as

$$S^n_+ = \{ X \in S^n \,|\, X \succeq O \}.$$

“positive semidefinite” is often abbreviated as “p.s.d.”.

Theorem 4.128

Let A ∈ R^{m×n} be an arbitrary matrix. Then A^T A is a p.s.d. matrix in S^n and A A^T is a p.s.d. matrix in S^m.

Proof. We shall just prove this for A^T A.

  1. We note that

    $$(A^T A)^T = A^T A.$$

    Thus, A^T A is symmetric.

  2. A^T A ∈ R^{n×n}. Since it is symmetric, A^T A ∈ S^n.

  3. Let v ∈ R^n.

  4. Then,

    $$v^T A^T A v = (Av)^T (Av) = \| A v \|^2 \geq 0$$

    where ‖⋅‖ is the norm induced by the dot product on R^n.

  5. Thus, A^T A is p.s.d.
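A short NumPy sketch (random rectangular matrix, for illustration) confirming that A^T A is symmetric with non-negative eigen values:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))           # an arbitrary rectangular matrix
G = A.T @ A                               # 3x3 Gram matrix

print(np.allclose(G, G.T))                      # True: G is symmetric
print(np.all(np.linalg.eigvalsh(G) >= -1e-12))  # True: eigen values are non-negative
```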

Theorem 4.129 (Non-negativity of the diagonal elements of p.s.d. matrices)

Let A ∈ R^{n×n} be positive semidefinite. Then its diagonal elements are non-negative.

Proof. Let A ∈ S^n_+.

  1. Then, for every nonzero v ∈ R^n, v^T A v ≥ 0.

  2. In particular, this is true for the standard unit vectors.

  3. But e_i^T A e_i = A_{i,i}.

  4. Hence, A_{i,i} ≥ 0 must be true.

Definition 4.124 (Square root of a positive semidefinite matrix)

Let A ∈ S^n be a positive semidefinite matrix. The square root of A, denoted by A^{1/2}, is defined as follows.

Let A = U D U^T be the eigen value decomposition of A. Let d1, …, dn be the diagonal elements of D. Let E = diag(√d1, …, √dn). Then,

$$A^{1/2} \triangleq U E U^T.$$

The matrix U E U^T is known as the positive semidefinite square root.

The definition of E is justified since di ≥ 0 for a p.s.d. matrix.

We can see that

$$A^{1/2} A^{1/2} = U E U^T U E U^T = U E E U^T = U D U^T = A.$$
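The construction translates directly into code. The sketch below (with an illustrative p.s.d. matrix) builds the square root from the eigen value decomposition and verifies that it squares back to A.

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3))
A = B.T @ B                                  # a positive semidefinite matrix

d, U = np.linalg.eigh(A)                     # A = U diag(d) U^T with d >= 0
E = np.diag(np.sqrt(np.clip(d, 0.0, None)))  # clip guards against tiny negative round-off
A_half = U @ E @ U.T

print(np.allclose(A_half @ A_half, A))       # True
```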

4.9.15.2. Positive Definite Matrices#

Definition 4.125 (Positive definite matrix)

A symmetric matrix X ∈ S^n is called positive definite if v^T X v > 0 for all nonzero v ∈ R^n.

The notation X ≻ O means that v^T X v > 0 for every v ∈ R^n, v ≠ 0; i.e., the matrix X is positive definite.

We define the set of symmetric positive definite matrices as

$$S^n_{++} = \{ X \in S^n \,|\, X \succ O \}.$$

“positive definite” is often abbreviated as “p.d.”.

Theorem 4.130 (Positivity of the diagonal elements of p.d. matrices)

Let A ∈ R^{n×n} be positive definite. Then its diagonal elements are positive.

Proof. Let A ∈ S^n_{++}.

  1. Then, for every nonzero v ∈ R^n, v^T A v > 0.

  2. In particular, this is true for the standard unit vectors.

  3. But e_i^T A e_i = A_{i,i}.

  4. Hence, A_{i,i} > 0 must be true.

Theorem 4.131 (Principal minors criterion)

A matrix A ∈ S^n is positive definite if and only if all its leading principal minors are positive.

In other words, D_1(A) > 0, D_2(A) > 0, …, D_n(A) > 0, where D_i(A) denotes the determinant of the upper-left i×i submatrix (the i-th leading principal minor).

4.9.15.3. Negative Semidefinite Matrices#

Definition 4.126 (Negative semidefinite matrix)

A symmetric matrix X ∈ S^n is called negative semi-definite if v^T X v ≤ 0 for all v ∈ R^n.

The notation X ⪯ O means that v^T X v ≤ 0 for every v ∈ R^n; i.e., the matrix X is negative semi-definite.

We define the set of symmetric negative semidefinite matrices as

$$S^n_- = \{ X \in S^n \,|\, X \preceq O \}.$$

“negative semidefinite” is sometimes abbreviated as “n.s.d.”.

4.9.15.4. Negative Definite Matrices#

Definition 4.127 (Negative definite matrix)

A symmetric matrix X ∈ S^n is called negative definite if v^T X v < 0 for all nonzero v ∈ R^n.

The notation X ≺ O means that v^T X v < 0 for every v ∈ R^n, v ≠ 0; i.e., the matrix X is negative definite.

We define the set of symmetric negative definite matrices as

$$S^n_{--} = \{ X \in S^n \,|\, X \prec O \}.$$

“negative definite” is sometimes abbreviated as “n.d.”.

4.9.15.5. Indefinite Matrices#

Definition 4.128 (Indefinite matrix)

A symmetric matrix A ∈ S^n is called indefinite if there exist x, y ∈ R^n such that x^T A x < 0 and y^T A y > 0.

Indefinite matrices are neither positive semidefinite nor negative semidefinite.

Theorem 4.132 (Diagonal elements and indefiniteness)

Let A ∈ S^n. If A has both positive and negative diagonal elements, then A is indefinite.

Proof. Each diagonal element satisfies A_{i,i} = e_i^T A e_i for the standard unit vector e_i. If the diagonal contains both positive and negative elements, then there exist unit vectors e_i and e_j such that e_i^T A e_i < 0 and e_j^T A e_j > 0. Thus, A is indefinite.

4.9.15.6. Eigenvalue Decomposition#

Theorem 4.133 (Eigenvalue characterization theorem)

Let A ∈ S^n be an n×n symmetric matrix.

  1. A is positive definite if and only if all its eigenvalues are positive.

  2. A is positive semidefinite if and only if all its eigenvalues are nonnegative.

  3. A is negative definite if and only if all its eigenvalues are negative.

  4. A is negative semidefinite if and only if all its eigenvalues are nonpositive.

  5. A is indefinite if and only if at least one eigenvalue is positive and at least one eigenvalue is negative.

Proof. Let the eigenvalue decomposition of A be given by A = U D U^T.

We prove (1).

  1. Let v ∈ R^n.

  2. Then

    $$v^T A v = v^T U D U^T v = w^T D w = \sum_{i=1}^n d_i w_i^2$$

    where w = U^T v.

  3. Since U is nonsingular, v^T A v > 0 for every v ≠ 0 if and only if $\sum_{i=1}^n d_i w_i^2 > 0$ for every w ≠ 0.

  4. Plugging w = e_i, we see that d_i > 0 is a necessary condition.

  5. Also, if d_i > 0 for every i, then the sum is positive for every nonzero w since at least one w_i ≠ 0. Hence, it is a sufficient condition.

Similar arguments apply for other statements.
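The characterization suggests an obvious numerical test. Below is a small NumPy sketch (the helper classify and its tolerance are ad hoc choices, not part of the text) that classifies a symmetric matrix by the signs of its eigen values.

```python
import numpy as np

def classify(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigen values."""
    w = np.linalg.eigvalsh(A)
    if np.all(w > tol):
        return "positive definite"
    if np.all(w >= -tol):
        return "positive semidefinite"
    if np.all(w < -tol):
        return "negative definite"
    if np.all(w <= tol):
        return "negative semidefinite"
    return "indefinite"

print(classify(np.array([[2.0, 0.0], [0.0, 3.0]])))    # positive definite
print(classify(np.array([[1.0, 0.0], [0.0, -2.0]])))   # indefinite
```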

4.9.15.7. Trace and Determinant#

Corollary 4.22 (Trace and determinant of positive definite matrices)

Let A be a symmetric positive definite matrix. Then, tr(A) and det(A) are positive.

Proof. Trace is the sum of eigenvalues. Determinant is the product of eigenvalues. If A is symmetric positive definite, then its eigenvalues are positive. Hence, trace and determinant are positive.

Corollary 4.23 (Trace and determinant of positive semidefinite matrices)

Let A be a symmetric positive semidefinite matrix. Then, tr(A) and det(A) are nonnegative.

4.9.16. Diagonally Dominant Matrices#

Definition 4.129 (Diagonally dominant matrix)

Let A ∈ F^{n×n} with A = [aij].

  1. A is called diagonally dominant if

    $$|a_{ii}| \geq \sum_{j \neq i} |a_{ij}|$$

    for every i = 1, …, n. In other words, the absolute value of the diagonal entry in a row is greater than or equal to the sum of the absolute values of the non-diagonal entries in that row.

  2. A is called strictly diagonally dominant if

    $$|a_{ii}| > \sum_{j \neq i} |a_{ij}|$$

    for every i = 1, …, n. In other words, the absolute value of the diagonal element is larger than the sum of the absolute values of all the off-diagonal elements in that row.

Example 4.28 (Strictly diagonally dominant matrix)

Let us consider

$$A = \begin{bmatrix} 4 & 2 & 1 & 0\\ 4 & 7 & 2 & 0\\ 3 & 4 & 9 & 1\\ 2 & 1 & 3 & 15 \end{bmatrix}.$$

We can see that the strict diagonal dominance condition is satisfied for each row as follows:

$$\begin{aligned}
\text{row 1: } & |4| > |2| + |1| + |0| = 3\\
\text{row 2: } & |7| > |4| + |2| + |0| = 6\\
\text{row 3: } & |9| > |3| + |4| + |1| = 8\\
\text{row 4: } & |15| > |2| + |1| + |3| = 6.
\end{aligned}$$
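The row-by-row check is easy to automate. A small NumPy sketch (the helper function name is an arbitrary choice) applied to the matrix of Example 4.28:

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum of |a_ij| over j != i for every row i."""
    abs_A = np.abs(A)
    diag = np.diag(abs_A)
    off_diag_sums = abs_A.sum(axis=1) - diag
    return bool(np.all(diag > off_diag_sums))

A = np.array([[ 4.0,  2.0,  1.0,  0.0],
              [ 4.0,  7.0,  2.0,  0.0],
              [ 3.0,  4.0,  9.0,  1.0],
              [ 2.0,  1.0,  3.0, 15.0]])
print(is_strictly_diagonally_dominant(A))   # True
```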

Theorem 4.134 (Positive semidefiniteness of diagonally dominant matrices)

Let A ∈ S^n be a real symmetric matrix.

  1. If A is diagonally dominant with nonnegative diagonal entries, then A is positive semidefinite.

  2. If A is strictly diagonally dominant with positive diagonal entries, then A is positive definite.

Proof. Assume that A is diagonally dominant with nonnegative diagonal entries.

  1. For contradiction, assume that A is not positive semidefinite.

  2. Then, there is an eigen value λ < 0 and a corresponding eigen vector u.

  3. Consider the absolute values of the entries of u given by (|u1|, …, |un|).

  4. Let i ∈ {1, …, n} denote the index of the largest absolute value entry of u. Note that |ui| > 0 since u ≠ 0.

  5. We also have Au = λu.

  6. For the i-th row, we get the equality

    $$\begin{aligned}
    & \sum_{j=1}^n A_{ij} u_j = \lambda u_i\\
    \implies & \sum_{j \neq i} A_{ij} u_j = \lambda u_i - A_{ii} u_i\\
    \implies & \Big| \sum_{j \neq i} A_{ij} u_j \Big| = |\lambda - A_{ii}| \, |u_i|\\
    \implies & |\lambda - A_{ii}| \, |u_i| \leq \Big( \sum_{j \neq i} |A_{ij}| \Big) |u_i| \leq |A_{ii}| \, |u_i|.
    \end{aligned}$$
  7. Thus, dividing by |ui| > 0, we get |λ − Aii| = |Aii − λ| ≤ |Aii|.

  8. But Aii is nonnegative and λ is negative; hence this reduces to Aii − λ ≤ Aii, i.e., λ ≥ 0.

  9. We have arrived at a contradiction.

  10. Thus, A must be positive semidefinite.

Now, assume that A is strictly diagonally dominant with positive diagonal entries.

  1. By the first part, it is clear that A is p.s.d.

  2. We just need to show that all eigen values are positive; i.e., that there are no zero eigen values.

  3. For contradiction, assume that 0 is indeed an eigen value of A.

  4. Let u ≠ 0 be a corresponding eigen vector satisfying Au = 0.

  5. Let i ∈ {1, …, n} denote the index of the largest absolute value entry of u.

  6. Following the earlier argument

    $$|A_{ii}| \, |u_i| = \Big| \sum_{j \neq i} A_{ij} u_j \Big| \leq \Big( \sum_{j \neq i} |A_{ij}| \Big) |u_i| < |A_{ii}| \, |u_i|.$$
  7. This is impossible. Hence, A must be positive definite.

Strictly diagonally dominant matrices have a very special property. They are always non-singular. The following result is valid for both real and complex strictly diagonally dominant matrices.

Theorem 4.135

Strictly diagonally dominant matrices are non-singular.

Proof. For contradiction, suppose that A is strictly diagonally dominant and singular.

  1. Then there exists a vector u ∈ C^n with u ≠ 0 such that

    $$Au = 0.$$
  2. Let

    $$u = (u_1, u_2, \dots, u_n).$$
  3. We first show that the entries of u cannot all be equal in magnitude.

  4. For contradiction, let us assume that there exists c > 0 such that

    $$c = |u_1| = |u_2| = \dots = |u_n|.$$
  5. Since u ≠ 0, hence c ≠ 0.

  6. We can write uj = c exp(iθj) for every j = 1, …, n.

  7. Now for any row i in Au=0, we have

    $$\begin{aligned}
    & \sum_{j=1}^n a_{ij} u_j = 0\\
    \implies & \sum_{j=1}^n a_{ij} c \exp(i \theta_j) = 0\\
    \implies & \sum_{j=1}^n a_{ij} \exp(i \theta_j) = 0\\
    \implies & a_{ii} \exp(i \theta_i) = - \sum_{j \neq i} a_{ij} \exp(i \theta_j)\\
    \implies & |a_{ii}| = \Big| \sum_{j \neq i} a_{ij} \exp(i \theta_j) \Big|\\
    \implies & |a_{ii}| \leq \sum_{j \neq i} |a_{ij}|
    \end{aligned}$$

    by the triangle inequality.

  8. But this contradicts our assumption that A is strictly diagonally dominant.

  9. Thus, the entries of u are not all equal in magnitude.

  10. Let us now assume that the largest-magnitude entry of u lies at index i with |ui| = c.

  11. Without loss of generality, we can scale u down by c to get another vector in which all entries are less than or equal to 1 in magnitude while the i-th entry has a magnitude of 1.

  12. In other words, |ui| = 1 and |uj| ≤ 1 for all other entries.

  13. From Au=0 we get for the i-th row

    $$\begin{aligned}
    & \sum_{j=1}^n a_{ij} u_j = 0\\
    \implies & a_{ii} u_i = - \sum_{j \neq i} u_j a_{ij}\\
    \implies & |a_{ii} u_i| = \Big| \sum_{j \neq i} u_j a_{ij} \Big|\\
    \implies & |a_{ii}| \leq \sum_{j \neq i} |u_j a_{ij}| \leq \sum_{j \neq i} |a_{ij}|
    \end{aligned}$$

    which again contradicts our assumption that A is strictly diagonally dominant.

  14. Hence strictly diagonally dominant matrices are non-singular.

4.9.17. Gershgorin’s Theorem#

We are now ready to examine Gershgorin's theorem, which provides very useful bounds on the spectrum of a square matrix.

Theorem 4.136 (Gershgorin’s circle theorem)

Every eigen value λ of a square matrix A ∈ C^{n×n} satisfies

$$|\lambda - a_{ii}| \leq \sum_{j \neq i} |a_{ij}| \quad \text{for some } i \in \{1, 2, \dots, n\}. \tag{4.13}$$

Proof. The proof is a straightforward application of the non-singularity of strictly diagonally dominant matrices.

  1. We know that for an eigen value λ, det(λI − A) = 0.

  2. In other words, the matrix (λI − A) is singular.

  3. Hence it cannot be strictly diagonally dominant due to Theorem 4.135.

  4. Thus, looking at each row i of (λI − A), we can say that

    $$|\lambda - a_{ii}| > \sum_{j \neq i} |a_{ij}|$$

    cannot be true for all rows simultaneously.

  5. In other words, this condition must fail for at least one row.

  6. This means that there exists at least one row i for which

    $$|\lambda - a_{ii}| \leq \sum_{j \neq i} |a_{ij}|$$

    holds true.

What this theorem means is pretty simple.

  1. Consider a disc in the complex plane for the i-th row of A whose center is given by aii and whose radius is given by $r = \sum_{j \neq i} |a_{ij}|$; i.e., the sum of the magnitudes of all non-diagonal entries in the i-th row.

  2. There are n such discs corresponding to n rows in A.

  3. (4.13) means that every eigen value must lie within the union of these discs. It cannot lie outside.

This idea is crystallized in the following definition.

Definition 4.130 (Gershgorin’s disc)

For the i-th row of the square matrix A we define the radius $r_i = \sum_{j \neq i} |a_{ij}|$ and the center ci = aii. Then the set given by

$$D_i = \{ z \in \mathbb{C} \;|\; |z - a_{ii}| \leq r_i \}$$

is called the i-th Gershgorin’s disc of A.

We note that the definition is equally valid for real as well as complex matrices. For real matrices, the centers of disks lie on the real line. For complex matrices, the centers may lie anywhere in the complex plane.

Clearly there is nothing extraordinary about the rows of A. We can as well consider the columns of A.

Theorem 4.137 (Gershgorin’s circle theorem for columns)

Every eigen value of a matrix A must lie in a Gershgorin disc corresponding to the columns of A, where the Gershgorin disc for the j-th column is given by

$$D_j = \{ z \in \mathbb{C} \;|\; |z - a_{jj}| \leq r_j \}$$

with

$$r_j = \sum_{i \neq j} |a_{ij}|.$$

Proof. We know that the eigen values of A are the same as the eigen values of A^T, and the columns of A are nothing but the rows of A^T. Hence the eigen values of A must satisfy the conditions in Theorem 4.136 w.r.t. the matrix A^T. This completes the proof.
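A short NumPy sketch (random matrix, illustrative only) that computes the row discs and checks that every eigen value falls inside at least one of them:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)   # off-diagonal row sums

# Every eigen value must lie in the union of the Gershgorin discs.
for lam in np.linalg.eigvals(A):
    in_some_disc = np.any(np.abs(lam - centers) <= radii + 1e-12)
    print(lam, bool(in_some_disc))                # each line ends with True
```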