

Eigenvalues and Eigenvectors

Page history last edited by sidjaggi 12 years, 8 months ago

Eigenvalues and eigenvectors[1]


0 Questions to make you think (quick version of this chapter)



1 Eigenvalues and eigenvectors:

 We saw in the previous chapter that there are an infinite number of bases for vector spaces. In this chapter, we'll see that with respect to particular linear transformations, there are "natural" bases.


• It is estimated that 3,000 students and staff in CUHK eat lunch every day, but there are only three restaurants on campus: Maxim's, Cafe de Coral and Fairwood. If people have lunch at one of the three restaurants, then they will have lunch at each of the other two with equal probability on the following day. For instance, if people have lunch at Fairwood today, half of them will eat at Maxim's and the other half will eat at Cafe de Coral tomorrow. If we start with all 3,000 customers at Maxim's on the first day, then after one day what do you expect to observe? How can you represent this mathematically (in matrix form)?
• How about after n days? Can you represent your answer mathematically? What can you observe if you try simulating this using Maxima? Does it matter what the starting configuration is (how many people eat at each restaurant on the first day)? How would you algebraically represent the "equilibrium state"?
• How about other scenarios, with more restaurants, and other "transition probabilities"?

• Abstraction: Suppose the direction (but not necessarily the magnitude) of a vector remains unchanged after it undergoes a linear transformation with matrix M. How would you write this algebraically? (Such a vector is called an eigenvector (from the German "eigen", meaning "own" or "inherent"), and the corresponding scaling factor is called the eigenvalue.)


• Another example: Suppose a matrix M is known to rotate the vectors in a vector space. What would the eigenvector correspond to? The corresponding eigenvalue?


• Can you think of interesting examples where the eigenvalue is not 1?


• How would you actually find the eigenvalues, and corresponding eigenvectors?


• How many eigenvalues must there be for a matrix? Do they all have to be real? Distinct? How about eigenvectors -- how many eigenvectors are there per eigenvalue?


• Writing a transformation in its eigenbasis ("rotation matrix", followed by a diagonal matrix, followed by the inverse rotation matrix) has some advantages. Can you think of some? Can you think of the geometric interpretation of writing a matrix in such a format?
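The restaurant question above is easy to explore by direct simulation. Below is a minimal sketch in Python/NumPy (the worksheet itself suggests Maxima); the transition matrix P and the starting vector are simply one encoding of the scenario described above.

```python
import numpy as np

# Columns/rows: Maxim's, Cafe de Coral, Fairwood. Column j holds the
# probabilities of where restaurant j's diners eat the following day:
# each of the OTHER two restaurants with probability 1/2.
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

x = np.array([3000.0, 0.0, 0.0])  # day 1: all 3,000 diners at Maxim's

for _ in range(30):               # the distribution after n days is P^n x
    x = P @ x

print(x)  # approaches the equilibrium (1000, 1000, 1000)
```

Changing the starting vector leaves the limit unchanged, which is exactly the "equilibrium state" the questions above ask you to characterize algebraically.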



Definitions: Given a square n by n matrix A, an eigenvalue λ and a corresponding eigenvector x ≠ 0 are defined by the equation

Ax = λx. (1)

Here x is an eigenvector, and λ is the corresponding eigenvalue. Essentially, this equation examines the question -- which vector(s) are unchanged (in direction) when they undergo the linear transformation A? We shall see applications of eigenvectors and eigenvalues below.



Finding eigenvalues: To derive the set of eigenvalues of a matrix, we note that (1) can be rewritten as

(A − λI)x = 0, (2)

where I is the identity matrix of appropriate dimensions. But (2) happens if and only if x is in the null-space of A − λI. Since x ≠ 0, this means that

det(A − λI) = 0. (3)

The equation (3) is a polynomial equation of degree n in λ, called the characteristic equation of A.

Hence λ is an eigenvalue of A if and only if (3) is satisfied.

Note 1: Since (3) is a polynomial in λ of degree n (the dimension of the matrix A), the matrix has at most n distinct eigenvalues (though some may be repeated roots of (3)).


Note 2: The sum of the eigenvalues of a matrix (counted with multiplicity) equals its trace, and the product of the eigenvalues of a matrix equals its determinant.



Example 1: Consider the matrix A = [[−5, 2], [2, −2]]. Then det(A − λI) = (−5 − λ)(−2 − λ) − 4 = λ² + 7λ + 6.

Hence the characteristic equation is λ² + 7λ + 6 = 0, and λ₁ = −1, λ₂ = −6 are the two eigenvalues of A. Note that these values satisfy Note 2 above: their sum is −7 = trace(A), and their product is 6 = det(A).
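These values are easy to check numerically. A quick sketch in Python/NumPy (not part of the original worksheet), taking A to be the matrix [[−5, 2], [2, −2]] of this example:

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

eig = np.sort(np.linalg.eigvals(A))  # roots of det(A - lambda*I) = 0
print(eig)                           # approximately [-6, -1]

# Note 2: the eigenvalues sum to the trace and multiply to the determinant.
print(np.trace(A), np.linalg.det(A))
```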


Exercise 1: Prove Note 2.


Note 3: A real matrix may have complex eigenvalues, since the roots of a real polynomial may well be complex. (Find such a matrix!)


Note 4: Since the determinant of the transpose of a matrix equals the determinant of the matrix, det(Aᵀ − λI) = det((A − λI)ᵀ) = det(A − λI); hence the eigenvalues of the transpose of a matrix equal the eigenvalues of the matrix itself.



Finding eigenvectors: Given that the eigenvalues of a matrix have been determined as above, we now focus on finding the set of corresponding eigenvectors. Given an eigenvalue λ, the corresponding eigenvector(s) x satisfy the equation (2), and are thus in the null-space of the matrix A − λI, which can be computed as in previous worksheets.


Example 2: For the matrix A from Example 1, we find the eigenvectors corresponding to the eigenvalues -1 and -6 respectively.

For λ₁ = −1, we note that (2) becomes −4x₁ + 2x₂ = 0 and 2x₁ − x₂ = 0. Examining these two equations, we notice that they are both equivalent to the equation x₂ = 2x₁. Thus the eigenvector (up to constant scaling) corresponding to the eigenvalue −1 is x = [1, 2]ᵀ.

Similarly, corresponding to λ₂ = −6, the equation (2) becomes x₁ + 2x₂ = 0 and 2x₁ + 4x₂ = 0. Both equations corresponding to this matrix equation are equivalent to x₁ = −2x₂, and hence the eigenvector (up to constant scaling) corresponding to the eigenvalue −6 is x = [2, −1]ᵀ.
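A direct numerical check that these vectors behave as claimed -- a sketch assuming, as above, A = [[−5, 2], [2, −2]] with candidate eigenvectors [1, 2]ᵀ and [2, −1]ᵀ:

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

v1 = np.array([1.0, 2.0])   # claimed eigenvector for lambda = -1
v2 = np.array([2.0, -1.0])  # claimed eigenvector for lambda = -6

print(A @ v1)  # equals -1 * v1
print(A @ v2)  # equals -6 * v2
```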




Multiplicity of eigenvalues: The characteristic equation for some matrices might have repeated roots. The number of times an eigenvalue is repeated is called its algebraic multiplicity. For each eigenvalue, the number of linearly independent eigenvectors is called its geometric multiplicity. A fact, stated without proof: the geometric multiplicity of each eigenvalue is always at least one and at most its algebraic multiplicity, but may indeed sometimes be strictly less than it.


See the following for examples.


Example 3: Let A = [[−2, 2, −3], [2, 1, −6], [−1, −2, 0]] and B = [[0, 1], [0, 0]].

(i) For A, the characteristic equation is −λ³ − λ² + 21λ + 45 = 0 (verify!). The roots of this are 5, −3, and −3 (verify!). Hence the algebraic multiplicity of the eigenvalue 5 is 1, and the algebraic multiplicity of the eigenvalue −3 is 2.

Next, it can be verified (do so!) that corresponding to the eigenvalue 5, the corresponding eigenvector (up to scaling) equals [1, 2, −1]ᵀ.

Also, it can be verified (do so!) that both [−2, 1, 0]ᵀ and [3, 0, 1]ᵀ are eigenvectors corresponding to the eigenvalue −3, and that they are linearly independent.

Hence the algebraic multiplicity of each eigenvalue of A equals its geometric multiplicity.


(ii) For B, it can be verified that the characteristic equation equals λ² = 0, so the eigenvalue 0 has algebraic multiplicity 2. But it can be verified that the only linearly independent eigenvector (up to scaling) corresponding to this eigenvalue is [1, 0]ᵀ, so its geometric multiplicity is only 1.
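Geometric multiplicity is the dimension of the null space of A − λI, i.e. n − rank(A − λI), so it can be computed mechanically. A sketch with the matrices of Example 3 written out explicitly (taken here to be the standard Kreyszig examples; treat the entries as an assumption if your worksheet's matrices differ):

```python
import numpy as np

A = np.array([[-2.0, 2.0, -3.0],
              [2.0, 1.0, -6.0],
              [-1.0, -2.0, 0.0]])
B = np.array([[0.0, 1.0],
              [0.0, 0.0]])

def geometric_multiplicity(M, lam):
    # dim null(M - lam*I) = n - rank(M - lam*I)
    n = M.shape[0]
    return n - np.linalg.matrix_rank(M - lam * np.eye(n))

print(geometric_multiplicity(A, 5.0))   # 1
print(geometric_multiplicity(A, -3.0))  # 2: equals the algebraic multiplicity
print(geometric_multiplicity(B, 0.0))   # 1: algebraic multiplicity is 2
```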



Eigenbasis: If the algebraic multiplicity of each eigenvalue of a matrix equals its geometric multiplicity, then the matrix is said to have an eigenbasis. That is, one can write the matrix A as a diagonal matrix in the basis consisting of its eigenvectors. Hence, if D is a diagonal matrix whose entries are the eigenvalues of A, and U is a matrix whose columns are the eigenvectors corresponding to the respective eigenvalues in D, then we can diagonalize A as

A = U D U⁻¹. (4)



Example 4: Using the results from the first two examples of this worksheet, we have A = [[−5, 2], [2, −2]] = U D U⁻¹, with U = [[1, 2], [2, −1]], D = diag(−1, −6), and U⁻¹ = (1/5)[[1, 2], [2, −1]]. (Verify!)
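Equation (4) can be verified numerically for this example. A sketch assuming, as above, A = [[−5, 2], [2, −2]], U = [[1, 2], [2, −1]], and D = diag(−1, −6):

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])
U = np.array([[1.0, 2.0],
              [2.0, -1.0]])   # columns: eigenvectors for -1 and -6
D = np.diag([-1.0, -6.0])

# Equation (4): A = U D U^{-1}
print(U @ D @ np.linalg.inv(U))  # reproduces A
```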


Exercise 2: Use the results from prior examples to diagonalize the matrix A of Example 3.


Exercise 3: Prove equation (4).




Applications: Eigenvectors are often the "right" choice of basis for a linear transformation. Consider the following examples.


Example 5: Suppose a weather forecaster uses the following (very simple) model to predict the weather. Let A denote the probability transition matrix of the weather, in the following sense: the first column represents sunny weather today, the second column cloudy weather today, and the third column rainy weather today; similarly, the first row represents sunny weather tomorrow, the second row cloudy weather tomorrow, and the third row rainy weather tomorrow. The entry in row i and column j then represents the probability that the weather tomorrow is of type i, given that the weather today is of type j. For instance, the entries along the diagonal represent the probabilities that the weather is unchanged, the entry in the second row and first column, 0.2, represents the probability that the weather tomorrow will be cloudy given that the weather today is sunny, etc.


If the weather truly followed such a model, one question would be to estimate the steady-state distribution of the weather. That is, we would ask whether, far in the future, the probabilities of sunny, cloudy, and rainy weather converge to some fixed values.


Suppose the vector representing the current weather equals x. (It is a unit vector, with a 1 in the first, second, or third place depending on whether it is currently sunny, cloudy, or rainy.) Then the probability distribution of the weather n days in the future is given by Aⁿx. But by (4), Aⁿ = (U D U⁻¹)ⁿ = U Dⁿ U⁻¹. Hence Aⁿx = U Dⁿ U⁻¹x.

But it can be verified that the eigenvalues of A are 1, λ₂, and λ₃ with |λ₂|, |λ₃| < 1, so D = diag(1, λ₂, λ₃). For large n, Dⁿ therefore converges to diag(1, 0, 0). Hence, for large n, Aⁿx converges to the eigenvector of A corresponding to the eigenvalue 1 (scaled so its entries sum to one), independent of x! Hence, in the long run, one expects the probability of sunny weather to be 1/6, cloudy weather to be 3/6, and rainy weather to be 2/6.


(This exponentiation trick of matrices via diagonalization is very useful in general!)
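For instance, with the 2×2 matrix diagonalized in Example 4 (entries assumed as in that example), computing Aⁿ only requires raising the two diagonal entries to the n-th power:

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])
U = np.array([[1.0, 2.0],
              [2.0, -1.0]])
d = np.array([-1.0, -6.0])   # diagonal of D

n = 8
An = U @ np.diag(d ** n) @ np.linalg.inv(U)   # A^n = U D^n U^{-1}

print(np.allclose(An, np.linalg.matrix_power(A, n)))  # True
```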


Another way to understand this is as follows. Let y denote the steady-state probability distribution. In that case, we expect y to satisfy the matrix equation Ay = y. Hence we expect 1 to be an eigenvalue of the matrix (and indeed both y and Ay are probability vectors whose entries sum to one), and the corresponding eigenvector is exactly the steady-state distribution we are interested in. One can compute (do so!) the corresponding eigenvector as y = [1/6, 3/6, 2/6]ᵀ.
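In code, the steady state is the eigenvector for the eigenvalue 1, rescaled to sum to one. The worksheet's exact weather matrix is not reproduced here, so the sketch below uses a hypothetical column-stochastic matrix P; substitute the matrix from your notes.

```python
import numpy as np

# Hypothetical 3x3 transition matrix (an assumption for illustration);
# each column sums to 1.
P = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])

w, V = np.linalg.eig(P)
k = np.argmin(np.abs(w - 1.0))  # index of the eigenvalue 1
y = np.real(V[:, k])
y = y / y.sum()                 # normalize into a probability vector

print(y)        # the steady-state distribution
print(P @ y)    # equals y, i.e. Ay = y
```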



Example 6: Consider the two-mass two-spring system in the figure below.

Let the two masses be m1 = m2 = 1, and the two spring constants be k2 = 3 (upper spring) and k1 = 2 (lower spring). Let y1 and y2 denote the actual displacements (in the downward direction) of the upper and lower masses respectively. Initially, the lower mass is slowly pulled down by 1 metre and gently released. In this case, note two situations that are different from the "usual" single-spring setup.

  • For one, the amount by which the lower spring changes its length is actually y2-y1, and hence the net force on the lower mass equals -2(y2-y1).
  • Secondly, both springs are acting on the upper mass. The force due to the upper spring equals -3y1, and the force due to the lower spring equals 2(y2-y1), for a net force of -5y1+ 2y2.

(Note carefully the signs of the force terms.)

Hence the system of ODEs governing the motion of the bodies is

y1'' = −5y1 + 2y2
y2'' = 2y1 − 2y2

These can be written as y'' = Ay, where y = [y1, y2]ᵀ, and A = [[−5, 2], [2, −2]] = U D U⁻¹ by Example 4 above.


This means that the system of ODEs can be written as

y'' = U D U⁻¹ y.

Since the matrix U is invertible, multiplying both sides of the equation with its inverse gives us

U⁻¹ y'' = D U⁻¹ y. (5)

Now let us define z = U⁻¹ y. Hence z'' = U⁻¹ y''.

Substituting these two equations into (5) above gives us

z'' = Dz. In other words, the system of ODEs has been diagonalized, or decoupled, in the new variables z1 and z2. Now, by the "usual" techniques, we can solve the two equations z1'' = −z1 and z2'' = −6z2 to obtain z1 = A cos t + B sin t and z2 = C cos(√6 t) + D sin(√6 t). Noting that y = Uz gives us that

y1 = A cos t + B sin t + 2C cos(√6 t) + 2D sin(√6 t)
y2 = 2A cos t + 2B sin t − C cos(√6 t) − D sin(√6 t)

To solve this to get the particular solution, we note that we need four initial conditions to resolve the values of the four parameters A, B, C and D. Three of these initial conditions should be readily apparent, since it is stated that "the lower mass is slowly pulled down one metre and gently released." Hence y2(0) = 1 (initial displacement) and y1'(0) = y2'(0) = 0 (initial velocities, since the release is gentle). To obtain the fourth initial condition, we note that there is an extra constraint on the upper mass -- while the lower mass is held pulled down by 1 metre, the two spring forces on the upper mass must balance. Hence k2y1(0) = k1(y2(0) − y1(0)), i.e. 3y1(0) = 2(1 − y1(0)), and hence y1(0) = 2/5. Finally, the resolution of A, B, C, and D from these initial conditions is left as an exercise for the gentle reader...
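The four initial conditions form a 4×4 linear system in (A, B, C, D), so the gentle reader can at least check their answer numerically. A sketch, assuming the general solution derived above, y1 = A cos t + B sin t + 2C cos(√6 t) + 2D sin(√6 t) and y2 = 2A cos t + 2B sin t − C cos(√6 t) − D sin(√6 t):

```python
import numpy as np

s6 = np.sqrt(6.0)
# Rows encode y1(0) = 2/5, y2(0) = 1, y1'(0) = 0, y2'(0) = 0
# for the unknown coefficients (A, B, C, D).
M = np.array([[1.0, 0.0, 2.0, 0.0],
              [2.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, 2.0 * s6],
              [0.0, 2.0, 0.0, -s6]])
b = np.array([0.4, 1.0, 0.0, 0.0])

A_, B_, C_, D_ = np.linalg.solve(M, b)
print(A_, B_, C_, D_)  # mathematically, (A, B, C, D) = (12/25, 0, -1/25, 0)
```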


Rotations: Eigenvalues and eigenvectors also help us compute rotations.



http://xkcd.com/184/ [2]


A rotation is defined as a length and angle-preserving linear transformation. That is, a matrix A represents a rotation if for any two vectors u and v, the dot-product between u and v is the same as the dot-product between Au and Av. Then

Theorem: AᵀA = AAᵀ = I.

Proof: The dot-products u·v = uᵀv, and (Au)·(Av) = (Au)ᵀ(Av) = uᵀAᵀAv. Hence uᵀAᵀAv = uᵀv for all vectors u and v.

In particular, consider vectors u and v such that u is the unit vector with a 1 in the i-th location and 0 elsewhere, and v is the unit vector with a 1 in the j-th location and 0 elsewhere.

Then uᵀAᵀAv is exactly the entry in the i-th row and the j-th column of AᵀA (verify!).

But uᵀv is then 1 if and only if i = j, otherwise it equals 0 (verify!).

Hence AᵀA must have 1's along the diagonal and zeroes elsewhere, and is thus the identity matrix. (Since A is square and AᵀA = I, Aᵀ is the inverse of A, and hence AAᵀ = I as well.)



Example 7: Consider the matrix A = [[0.36, 0.48, −0.8], [−0.8, 0.6, 0], [0.48, 0.64, 0.6]].


It can be verified (do so!) that AᵀA = I. Hence this A is a rotation matrix.


Now, we are interested in finding the axis of rotation -- i.e., the vector that DOES NOT rotate under this linear transform.


We do the following calculations with MAXIMA:



MAXIMA code:

A:matrix([0.36,0.48,-0.8],[-0.8,0.6,0],[0.48,0.64,0.6]);
eigenvectors(A);

Output: [[[-(24*%i-7)/25,(24*%i+7)/25,1],[1,1,1]],[[[1,-(3*%i-1)/4,(3*%i+1)/4]],[[1,(3*%i+1)/4,-(3*%i-1)/4]],[[1,-2,-2]]]]


The first 3 numbers -(24i-7)/25, (24i+7)/25, and 1, are respectively the eigenvalues of the matrix.

The next 3 numbers, [1,1,1], are their respective algebraic multiplicities (each eigenvalue occurs once).

Finally, the next set of 3 vectors are respectively the corresponding eigenvectors.

[1,-(3i-1)/4,(3i+1)/4],   [1,(3i+1)/4,-(3i-1)/4], and [1,-2,-2].


Note that 1 is an eigenvalue, and its corresponding eigenvector equals v = [1,-2,-2]. This means that any vector in the direction of v is unchanged under the action of A. Hence the axis of rotation must be [1,-2,-2]. (Note that the other two eigenvectors contain complex entries, and hence are "not relevant".[3])
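The same computation can be reproduced outside Maxima. A Python/NumPy sketch, assuming the full matrix of Example 7 is as below (its first row matches the Maxima snippet above):

```python
import numpy as np

A = np.array([[0.36, 0.48, -0.8],
              [-0.8, 0.6, 0.0],
              [0.48, 0.64, 0.6]])

print(np.allclose(A.T @ A, np.eye(3)))  # True: A is a rotation

w, V = np.linalg.eig(A)
k = np.argmin(np.abs(w - 1.0))   # pick the eigenvalue 1
axis = np.real(V[:, k])
axis = axis / axis[0]            # rescale so the first entry is 1

print(axis)  # the axis of rotation: proportional to [1, -2, -2]
```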




  1. Examples from Kreyszig
  2. Image courtesy XKCD http://xkcd.com/184/
  3. Actually, the other two eigenvectors are ALSO axes of rotation, but only in the six-dimensional complex space corresponding to 3 real and 3 imaginary coordinates. If you don't understand, don't worry about it :) But if you have questions, ASK!
