**Eigenvalues and eigenvectors**

**0 Questions to make you think (quick version of this chapter)**


**1 Eigenvalues and eigenvectors:**

We saw in the previous chapter that a vector space has infinitely many bases. In this chapter, we'll see that with respect to particular linear transformations, there are "natural" bases.

• It is estimated that 3,000 students and staff in CUHK eat lunch every day, but there are only three restaurants on campus: Maxim's, Cafe de Coral, and Fairwood. If people have lunch at one of the three restaurants, then the following day they will have lunch at either of the other two with equal probability. For instance, if people have lunch at Fairwood today, half of them will eat at Maxim's and the other half at Cafe de Coral tomorrow. If we start with all 3,000 customers at Maxim's on the first day, then after one day what do you expect to observe? How can you represent it mathematically (in matrix form)?

• How about after n days? Can you represent your answer mathematically? What can you observe if you try simulating this using Maxima? Does it matter what the starting configuration is (how many people eat at each restaurant on the first day)? How would you algebraically represent the "equilibrium state"?

• How about other scenarios, with more restaurants, and other "transition probabilities"?

• Abstraction: Suppose the direction (but not necessarily the magnitude) of a vector remains unchanged after it undergoes a linear transformation with matrix M. How would you write this algebraically? (Such a vector is called an *eigenvector* ("eigen" derives from the German word for "inherent"), and the corresponding change in magnitude is called the *eigenvalue*.)

• Another example: Suppose a matrix M is known to rotate the vectors in a vector space. What would the eigenvector correspond to? The corresponding eigenvalue?

• Can you think of interesting examples where the eigenvalue is not 1?

• How would you actually find the eigenvalues, and corresponding eigenvectors?

• How many eigenvalues must there be for a matrix? Do they all have to be real? Distinct? How about eigenvectors -- how many eigenvectors are there per eigenvalue?

• Writing a transformation in its eigenbasis ("rotation matrix", followed by a diagonal matrix, followed by the inverse rotation matrix) has some advantages. Can you think of some? Can you think of the geometric interpretation of writing a matrix in such a format?
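The restaurant questions above can be explored numerically. The worksheet suggests Maxima; below is a minimal Python sketch instead (variable names are my own). The transition matrix follows directly from the problem statement: diners switch to each of the other two restaurants with probability 1/2.

```python
# Column j of P holds the probabilities of moving FROM restaurant j:
# a diner switches to each of the other two restaurants with probability 1/2.
P = [[0.0, 0.5, 0.5],   # to Maxim's
     [0.5, 0.0, 0.5],   # to Cafe de Coral
     [0.5, 0.5, 0.0]]   # to Fairwood

def step(P, x):
    """One day's transition: x_new[i] = sum_j P[i][j] * x[j]."""
    return [sum(P[i][j] * x[j] for j in range(3)) for i in range(3)]

x = [3000.0, 0.0, 0.0]  # day 1: everyone eats at Maxim's
for _ in range(60):
    x = step(P, x)

# The counts approach the equilibrium (1000, 1000, 1000),
# regardless of the starting configuration.
print([round(v, 2) for v in x])
```

Trying a few different starting vectors (with entries summing to 3,000) shows the same limit, which previews the "equilibrium state" question above.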

**Definitions:** Given a square n by n matrix A, an eigenvalue and eigenvector of A satisfy

A**x** = λ**x**, **x** ≠ **0**. (1)

Here **x** is an *eigenvector*, and λ is the corresponding *eigenvalue*. Essentially, this equation asks: which vector(s) are unchanged (in direction) when they undergo the linear transformation A? We shall see applications of eigenvectors and eigenvalues below.

**Finding eigenvalues:** To derive the set of eigenvalues of a matrix, we note that (1) can be rewritten as A**x** - λI**x** = **0**, where I is the identity matrix of appropriate dimensions. This implies that

(A - λI)**x** = **0**. (2)

But (2) happens if and only if **x** is in the null-space of A - λI. Since **x** ≠ **0**, this means that

det(A - λI) = 0. (3)

The equation (3) is a polynomial equation of degree n in λ, called the *characteristic equation* of A.

Hence λ is an eigenvalue of A if and only if (3) is satisfied.

**Note 1**: Since (3) is a polynomial in λ of degree at most n (the dimension of the matrix A), the matrix has at most n eigenvalues (though some may be repeated roots of (3)).

**Note 2**: By the properties of matrices, the sum of the eigenvalues of a matrix equals its trace, and the product of the eigenvalues of a matrix equals its determinant.

**Example 1**: Consider the matrix A = [[-5, 2], [2, -2]]. Then det(A - λI) = (-5 - λ)(-2 - λ) - 4 = λ² + 7λ + 6.

Hence the characteristic equation is λ² + 7λ + 6 = (λ + 1)(λ + 6) = 0, and -1 and -6 are the two eigenvalues of A. Note that these values satisfy Note 2 above: their sum is -7 = tr A, and their product is 6 = det A.
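This computation is easy to check by machine. The worksheet otherwise uses Maxima; here is a minimal Python sketch instead, assuming the Example 1 matrix is [[-5, 2], [2, -2]] (the matrix reused by the spring system of Example 6). For a 2-by-2 matrix the characteristic equation is λ² - (tr A)λ + det A = 0, so the eigenvalues come from the quadratic formula:

```python
import math

# Eigenvalues of a 2x2 matrix from its characteristic polynomial
# lam^2 - trace(A)*lam + det(A) = 0 (this sketch assumes real roots).
A = [[-5.0, 2.0],
     [2.0, -2.0]]

tr = A[0][0] + A[1][1]                      # trace = sum of eigenvalues
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]     # determinant = product of eigenvalues

disc = math.sqrt(tr*tr - 4*det)
eigenvalues = sorted([(tr - disc) / 2, (tr + disc) / 2])
print(eigenvalues)  # [-6.0, -1.0]
```

Note that the printed values confirm Note 2: they sum to the trace (-7) and multiply to the determinant (6).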

**Exercise 1**: Prove Note 2.

**Note 3**: A real matrix may have complex eigenvalues, since the roots of a real polynomial may well be complex. (Find such a matrix!)

**Note 4**: Since the determinant of the transpose of a matrix equals the determinant of the matrix itself, the eigenvalues of A^{T} equal the eigenvalues of A.

**Finding eigenvectors:** Given that the eigenvalues of a matrix have been determined as above, we now focus on finding the set of corresponding eigenvectors. Given an eigenvalue λ, the corresponding eigenvector(s) **x** satisfy equation (2), and are thus in the null-space of the matrix A - λI, as in previous worksheets.

**Example 2**: For the matrix A from Example 1, we find the eigenvectors corresponding to the eigenvalues -1 and -6 respectively.

For λ = -1, we note that (2) becomes [[-4, 2], [2, -1]]**x** = **0**. Examining these two equations, -4x_{1} + 2x_{2} = 0 and 2x_{1} - x_{2} = 0, we notice that they are both equivalent to the equation x_{2} = 2x_{1}. Thus the eigenvector (up to constant scaling) corresponding to the eigenvalue -1 is (1, 2)^{T}.

Similarly, corresponding to λ = -6 the equation (2) becomes [[1, 2], [2, 4]]**x** = **0**. Both equations corresponding to this matrix equation are equivalent to x_{1} = -2x_{2}, and hence the eigenvector (up to constant scaling) corresponding to the eigenvalue -6 is (2, -1)^{T}.
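An eigenpair claim is cheap to verify directly: just compare A**x** with λ**x**. A Python sketch (in place of Maxima), assuming A = [[-5, 2], [2, -2]] with the eigenpairs computed above:

```python
# Direct check of eigenpairs: an eigenvector x with eigenvalue lam
# must satisfy A x = lam x.
A = [[-5.0, 2.0],
     [2.0, -2.0]]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

for lam, x in [(-1.0, [1.0, 2.0]), (-6.0, [2.0, -1.0])]:
    assert matvec(A, x) == [lam * v for v in x]
print("both eigenpairs check out")
```

Any nonzero scalar multiple of each vector passes the same check, which is the "up to constant scaling" caveat in the text.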

**Multiplicity of eigenvalues**: The characteristic equation of some matrices may have repeated roots. The number of times an eigenvalue is repeated as a root is called its *algebraic multiplicity*. For each eigenvalue, the number of linearly independent eigenvectors is called its *geometric multiplicity*. A fact, stated without proof: the geometric multiplicity of each eigenvalue is always at least one, and at most its algebraic multiplicity, but it may indeed sometimes be strictly less.

See the following for examples.

**Example 3**: Let .

(i) For A, the characteristic equation is (λ - 5)(λ + 3)² = 0 (verify!). The roots of this are 5, -3, and -3 (verify!). Hence the algebraic multiplicity of the eigenvalue 5 is 1, and the algebraic multiplicity of the eigenvalue -3 is 2.

Next, it can be verified (do so!) that corresponding to the eigenvalue 5, the corresponding eigenvector (up to scaling) equals .

Also, it can be verified (do so!) that both and are eigenvectors corresponding to the eigenvalue -3, and that they are linearly independent.

Hence the algebraic multiplicity of each eigenvalue of A equals its geometric multiplicity.

(ii) For B, it can be verified that the characteristic equation has a repeated root. But it can also be verified that only one linearly independent eigenvector (up to scaling) corresponds to this repeated eigenvalue, so its geometric multiplicity is strictly less than its algebraic multiplicity.

**Eigenbasis**: If the algebraic multiplicity of each eigenvalue of A equals its geometric multiplicity, then A is said to have an *eigenbasis*. That is, one can write the matrix A as a diagonal matrix in the basis comprising its eigenvectors. Hence, if Λ is a diagonal matrix comprising the eigenvalues of A, and U is a matrix whose columns are the eigenvectors corresponding to the respective eigenvalues in Λ, then we can *diagonalize* A as

A = UΛU^{-1}. (4)

**Example 4**: Using the results from the first two examples of this worksheet, we have [[-5, 2], [2, -2]] = UΛU^{-1}, where U = [[1, 2], [2, -1]] and Λ = [[-1, 0], [0, -6]]. (Verify!)
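The factorization can be multiplied out by machine. A Python sketch (in place of Maxima), assuming U = [[1, 2], [2, -1]] with eigenvector columns and Λ = diag(-1, -6) from the examples above:

```python
# Rebuild A as U * Lam * U^{-1} and compare with [[-5, 2], [2, -2]].
U = [[1.0, 2.0], [2.0, -1.0]]
Lam = [[-1.0, 0.0], [0.0, -6.0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

detU = U[0][0]*U[1][1] - U[0][1]*U[1][0]     # = -5, so U is invertible
Uinv = [[ U[1][1]/detU, -U[0][1]/detU],      # 2x2 inverse via the adjugate
        [-U[1][0]/detU,  U[0][0]/detU]]

A_rebuilt = matmul(matmul(U, Lam), Uinv)
print([[round(v, 10) for v in row] for row in A_rebuilt])  # [[-5.0, 2.0], [2.0, -2.0]]
```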

**Exercise 2**: Use the results from prior examples to diagonalize .

**Exercise 3**: Prove equation (4).

**Applications**: Eigenvectors are often the "right" choice of basis for a linear transformation. Consider the following examples.

**Example 5**: Suppose a weather forecaster uses the following (very simple) model to predict the weather. Let A denote the *probability transition matrix* of the weather in the following sense -- the first column represents sunny weather today, the second column represents cloudy weather today, and the third column represents rainy weather today. Similarly, the first row represents sunny weather tomorrow, the second row represents cloudy weather tomorrow, and the third row represents rainy weather tomorrow. The entries of the matrix then represent the probability that the weather tomorrow is sunny, cloudy, or rainy, given the weather today. For instance, the entries along the diagonal represent the probabilities that the weather is unchanged, the entry 0.2 in the second row and first column represents the probability that the weather tomorrow will be cloudy given that the weather today is sunny, etc.

If the weather truly followed such a model, one question would be to estimate the *steady-state distribution* of the weather. That is, we would ask whether, far in the future, the probabilities of sunny, cloudy, and rainy weather converge to some fixed values.

Suppose the vector representing the current weather equals **x**. (It is a unit vector, with a 1 in the first, second, or third place depending on whether it is currently sunny, cloudy, or rainy.) Then, the probability distribution of the weather *n* days in the future is given by A^{n}**x**. But by (4), A^{n} = UΛ^{n}U^{-1}. Hence A^{n}**x** = UΛ^{n}U^{-1}**x**.

But it can be verified that 1 is an eigenvalue of A, and that the other two eigenvalues have magnitude strictly less than 1. Hence, for large *n*, Λ^{n} converges to the matrix with a single 1 in the diagonal entry corresponding to the eigenvalue 1 and zeroes elsewhere. Hence, for large n, A^{n}**x** converges to (1/6, 3/6, 2/6)^{T}, *independent of* **x**! Hence, in the long run, one expects the probability of sunny weather to be 1/6, cloudy weather to be 3/6, and rainy weather to be 2/6.

(*This exponentiation trick of matrices via diagonalization is very useful in general!*)
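A small Python sketch of the exponentiation trick, assuming the eigendecomposition A = [[-5, 2], [2, -2]] = UΛU^{-1} with U = [[1, 2], [2, -1]] and Λ = diag(-1, -6): powering A reduces to powering the two diagonal entries.

```python
# A^n = U * Lam^n * U^{-1}: only the diagonal entries of Lam get powered.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[-5.0, 2.0], [2.0, -2.0]]
U = [[1.0, 2.0], [2.0, -1.0]]
Uinv = [[0.2, 0.4], [0.4, -0.2]]              # inverse of U (det U = -5)

n = 2
Lam_n = [[(-1.0)**n, 0.0], [0.0, (-6.0)**n]]  # Lam^n, computed entrywise
via_diag = matmul(matmul(U, Lam_n), Uinv)

direct = matmul(A, A)                         # A^2 the slow way, for comparison
print(direct)  # [[29.0, -14.0], [-14.0, 8.0]]
```

For large n this saves work: the diagonal is powered in O(n) scalar multiplications (or O(1) with `**`), instead of n full matrix products.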

Another way to understand this is as follows. Let **y** denote the steady-state probability distribution. In that case, we expect **y** to satisfy the matrix equation A**y** = **y**. Hence we expect 1 to be an eigenvalue of the matrix (since both **y** and A**y** are probability vectors and must sum to one), and the corresponding eigenvector would correspond to the steady-state distribution we are interested in. One can compute (do so!) the corresponding eigenvector, scaled so that its entries sum to one, as (1/6, 3/6, 2/6)^{T}.

**Example 6**: Consider the two-mass two-spring system in the figure below.

Let the two masses be m_{1} = m_{2} = 1, and the two spring constants be k_{2} = 3 (upper spring) and k_{1} = 2 (lower spring). Let y_{1} and y_{2} denote the actual displacements (in the downward direction) of the upper and lower masses respectively. Initially, the bottom mass is slowly pulled down by 1 metre and gently released. In this case, note two situations that are different from the "usual" single-spring setup.

- For one, the amount by which the lower spring changes its length is actually y_{2} - y_{1}, and hence the net force on the lower mass equals -2(y_{2} - y_{1}).
- Secondly, both springs act on the upper mass. The force due to the upper spring equals -3y_{1}, and the force due to the lower spring equals 2(y_{2} - y_{1}), for a net force of -5y_{1} + 2y_{2}.

(Note carefully the signs of the force terms.)

Hence the system of ODEs governing the motion of the bodies is

y_{1}'' = -5y_{1} + 2y_{2},
y_{2}'' = 2y_{1} - 2y_{2}.

These can be written as **y**'' = A**y**, where **y** = (y_{1}, y_{2})^{T} and A = [[-5, 2], [2, -2]] = UΛU^{-1} by Example 4 above.

This means that the system of ODEs can be written as

**y**'' = UΛU^{-1}**y**.

Since the matrix U is invertible, multiplying both sides of the equation with its inverse gives us

U^{-1}**y**'' = ΛU^{-1}**y**. (5)

Now let us define **z** = U^{-1}**y**. Hence **z**'' = U^{-1}**y**''.

Substituting these two equations into (5) above gives us

**z**'' = Λ**z**.

In other words, the system of ODEs has been *diagonalized*, or *decoupled*, in the new variables z_{1} and z_{2}. Now, by the "usual" techniques, we can solve the two equations z_{1}'' = -z_{1} and z_{2}'' = -6z_{2} to obtain z_{1} = A cos t + B sin t and z_{2} = C cos(√6 t) + D sin(√6 t). Noting that **y** = U**z** gives us that

y_{1} = A cos t + B sin t + 2C cos(√6 t) + 2D sin(√6 t),
y_{2} = 2A cos t + 2B sin t - C cos(√6 t) - D sin(√6 t).

To solve this to get the particular solution, we note that we need four initial conditions to resolve the values of the four parameters A, B, C and D. Three of these initial conditions should be readily apparent, since it is stated "the lower mass is slowly pulled down one metre and gently released." Hence, y_{2}(0)=1 (initial displacement) and y_{1}'(0)=y_{2}'(0)=0 (initial velocity). To obtain the fourth initial condition, we note that we have an extra constraint on the upper mass -- when the lower mass is pulled down by 1 metre, the lower spring also pulls down the upper mass. The two forces on the upper mass should balance, hence k_{2}y_{1}(0)= k_{1}(y_{2}(0)-y_{1}(0)), and hence 3y_{1}(0) = 2(1-y_{1}(0)), and hence y_{1}(0)=2/5. Finally, the resolution of A, B, C, and D from these initial conditions is left as an exercise for the gentle reader...
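The resolution of the constants can be sketched in Python (assuming the general solution y_{1} = A cos t + B sin t + 2C cos(√6 t) + 2D sin(√6 t), y_{2} = 2A cos t + 2B sin t - C cos(√6 t) - D sin(√6 t) derived above):

```python
import math

# Resolve A, B, C, D from the four initial conditions:
#   y1(0) = 2/5, y2(0) = 1, y1'(0) = y2'(0) = 0.
# The velocity conditions give B + 2*sqrt(6)*D = 0 and 2B - sqrt(6)*D = 0,
# which force B = D = 0.  The displacement conditions leave
#   A + 2C = 2/5  and  2A - C = 1,  so  C = 2A - 1  and  5A = 12/5.
A = 2.4 / 5        # = 0.48
C = 2 * A - 1      # = -0.04
B = D = 0.0

s6 = math.sqrt(6)
def y1(t):
    return A*math.cos(t) + B*math.sin(t) + 2*C*math.cos(s6*t) + 2*D*math.sin(s6*t)
def y2(t):
    return 2*A*math.cos(t) + 2*B*math.sin(t) - C*math.cos(s6*t) - D*math.sin(s6*t)

# Spot-check the displacement initial conditions.
print(round(y1(0), 10), round(y2(0), 10))  # 0.4 1.0
```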

**Rotations**: Eigenvalues and eigenvectors also help us compute rotations.

http://xkcd.com/184/

A rotation is defined as a *length- and angle-preserving linear transformation*. That is, a matrix A represents a rotation if for any two vectors u and v, the dot-product between u and v is the same as the dot-product between Au and Av. Then:

*Theorem*: A^{T}A=AA^{T}=I.

*Proof*: The dot-products are u.v = u^{T}v and (Au).(Av) = (Au)^{T}(Av) = u^{T}A^{T}Av. Hence u^{T}A^{T}Av = u^{T}v for all vectors u and v.

In particular, consider vectors u and v such that u is the unit vector with a 1 in the i-th location and 0 elsewhere, and v is the unit vector with a 1 in the j-th location and 0 elsewhere.

Then u^{T}A^{T}Av is exactly the entry in the i-th row and the j-th column of A^{T}A (verify!).

But u^{T}v is then 1 if and only if i=j, otherwise it equals 0 (verify!).

Hence A^{T}A must have 1's along the diagonal, and zeroes elsewhere, and is thus the identity matrix.

**Example 7**: Consider the matrix A = [[0.36, 0.48, -0.8], [-0.8, 0.6, 0], [0.48, 0.64, 0.6]].

It can be verified (do so!) that A^{T}A=I. Hence this A is a rotation matrix.

Now, we are interested in finding the *axis of rotation* -- i.e., the vector that DOES NOT rotate under this linear transform.

We do the following calculations with MAXIMA:

*MAXIMA code:*

*A: matrix([0.36, 0.48, -0.8], [-0.8, 0.6, 0], [0.48, 0.64, 0.60]);*

*eigenvectors(A);*

Output: *[[[-(24*%i-7)/25,(24*%i+7)/25,1],[1,1,1]],[[[1,-(3*%i-1)/4,(3*%i+1)/4]],[[1,(3*%i+1)/4,-(3*%i-1)/4]],[[1,-2,-2]]]]*

The first 3 numbers -(24i-7)/25, (24i+7)/25, and 1, are respectively the eigenvalues of the matrix.

The next 3 numbers, [1,1,1], are their algebraic multiplicities (each eigenvalue occurs exactly once).

Finally, the next set of 3 vectors are respectively the corresponding eigenvectors.

[1,-(3i-1)/4,(3i+1)/4], [1,(3i+1)/4,-(3i-1)/4], and [1,-2,-2].

Note that 1 is an eigenvalue, and its corresponding eigenvector equals v = [1,-2,-2]. This means that any vector in the direction of v is unchanged under the action of A. Hence the axis of rotation must be [1,-2,-2]. (Note that the other two eigenvectors contain complex entries, and hence are not relevant here.)
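Both claims are easy to confirm numerically. A Python sketch (in place of Maxima), using the matrix from the MAXIMA session above:

```python
# Check that A is a rotation (A^T A = I) and that v = (1, -2, -2)
# is fixed by A, i.e. spans the axis of rotation.
A = [[0.36, 0.48, -0.8],
     [-0.8, 0.6, 0.0],
     [0.48, 0.64, 0.6]]
v = [1.0, -2.0, -2.0]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]

# Entry (i, j) of A^T A is the dot-product of columns i and j of A.
AtA = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(3)]
       for i in range(3)]
assert all(abs(AtA[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(3) for j in range(3))

Av = matvec(A, v)
print([round(w, 10) for w in Av])  # [1.0, -2.0, -2.0]
```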

**End**

