In this section we will see how we can better understand a linear transformation by breaking it down into simpler pieces.
Let $T$ be a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^n$. Suppose that $\mathcal{B}$ is a basis of $\mathbb{R}^n$, that $V$ is the span of some of the vectors in $\mathcal{B}$, and that $W$ is the span of the remaining vectors in $\mathcal{B}$. Then any vector $\mathbf{x}$ in $\mathbb{R}^n$ can be written as the sum of a vector $\mathbf{v}$ in $V$ and a vector $\mathbf{w}$ in $W$. Since $T(\mathbf{x}) = T(\mathbf{v}) + T(\mathbf{w})$, we can see how $T$ behaves on all of $\mathbb{R}^n$ if we know how it behaves on $V$ and on $W$. This decomposition is particularly helpful if $V$ and $W$ are chosen so that $T$ behaves in a simple way on $V$ and on $W$.
Given such a decomposition of $\mathbb{R}^n$ into the vector spaces $V$ and $W$, we can apply the same idea to split $V$ and $W$ into lower-dimensional vector spaces and repeat until no more splits are possible. The most optimistic outcome of this procedure would be that we get all the way down to one-dimensional subspaces and that $T$ acts on each of these subspaces by simply scaling each vector in that subspace by some factor. In other words, we would like to find vectors $\mathbf{v}$ for which $T(\mathbf{v})$ is a scalar multiple of $\mathbf{v}$. This leads us to the following definition.
An eigenvector of an $n \times n$ matrix $A$ is a nonzero vector $\mathbf{v}$ with the property that $A\mathbf{v} = \lambda\mathbf{v}$ for some $\lambda \in \mathbb{R}$ (in other words, $A$ maps $\mathbf{v}$ to a vector which is either zero or parallel to $\mathbf{v}$). We call $\lambda$ an eigenvalue of $A$, and we call the eigenvector together with its eigenvalue an eigenpair.
For example, every nonzero vector is an eigenvector (with eigenvalue $1$) of the identity matrix.
Find a $2 \times 2$ matrix with eigenpairs $\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}, 2\right)$ and $\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}, 3\right)$. Sketch the images of some gridlines under multiplication by this matrix to show how it scales space along the lines through its eigenvectors. Verbally describe your results below.
Solution. Writing out the equations implied by the given eigenpair relations, we see that the first implies that the first column of the matrix is $\begin{bmatrix} 2 \\ 0 \end{bmatrix}$, and the second (together with the first) implies that the second column of the matrix is $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$. Therefore, the matrix is $\begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}$.
The following gridline images show how the transformation distorts space. Equally spaced points which are separated in the east-west direction get spread out by a factor of 2, while the diagonal line through the origin gets stretched out by a factor of 3. Since the images of the vertical gridlines run parallel to the second column $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$, this introduces a bottom-left-to-top-right tilt for the images of the vertical gridlines.
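As a quick numerical check (a sketch, not part of the original exercise), we can confirm the two eigenpair relations and recover the eigenvalues with `numpy.linalg.eig`:

```python
import numpy as np

A = np.array([[2, 1],
              [0, 3]])

# Check the eigenpair relations A v = lambda v directly:
print(A @ np.array([1, 0]))  # [2 0], i.e. 2 * [1, 0]
print(A @ np.array([1, 1]))  # [3 3], i.e. 3 * [1, 1]

# numpy.linalg.eig recovers the eigenvalues 2 and 3:
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # [2. 3.]
```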
If $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are eigenvectors of $A$ with the same eigenvalue $\lambda$, and $\mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k$ for some weights $c_1, \ldots, c_k$ such that $c_i \neq 0$ for at least one $i$, then $\mathbf{v}$ is also an eigenvector of $A$ with eigenvalue $\lambda$, because

$$A\mathbf{v} = c_1 A\mathbf{v}_1 + \cdots + c_k A\mathbf{v}_k = \lambda(c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k) = \lambda\mathbf{v}.$$
Therefore, the set of all eigenvectors corresponding to a particular eigenvalue, together with the zero vector, forms a vector space, called the eigenspace of that eigenvalue.
Let $A$ be a matrix with eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$, both with eigenvalue $\lambda$. Find $A(\mathbf{v}_1 + \mathbf{v}_2)$.
Solution. Since $A\mathbf{v}_1 = \lambda\mathbf{v}_1$ and $A\mathbf{v}_2 = \lambda\mathbf{v}_2$, we find that $A(\mathbf{v}_1 + \mathbf{v}_2) = \lambda(\mathbf{v}_1 + \mathbf{v}_2)$.
Let $V$ be a subspace spanned by some of the eigenvectors of a matrix $A$. If $\mathbf{x} \in V$, which of the following are necessarily true? (1) $A\mathbf{x}$ is in $V$; (2) $A\mathbf{x}$ is orthogonal to $V$; (3) $A\mathbf{x}$ is a scalar multiple of $\mathbf{x}$.
Solution. Let $\mathbf{v}_1, \ldots, \mathbf{v}_k$ be the eigenvectors of $A$ that span $V$, and let $\lambda_1, \ldots, \lambda_k$ be the corresponding eigenvalues. Then $\mathbf{x}$ admits a representation $\mathbf{x} = c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k$. Since

$$A\mathbf{x} = c_1\lambda_1\mathbf{v}_1 + \cdots + c_k\lambda_k\mathbf{v}_k,$$

we see that $A\mathbf{x}$ is also in $V$. This means (2) is not true in general. Option (3) need not always hold either: for instance, it fails to hold if $c_1$ and $c_2$ are both non-zero and $\lambda_1$ and $\lambda_2$ are not equal. Therefore, the only statement that is necessarily true is (1).
Suppose $A$ is a matrix with a unit eigenvector $\mathbf{v}$ whose eigenvalue is $2$ and a unit eigenvector $\mathbf{w}$ whose eigenvalue is $\frac{1}{2}$. Let $\mathbf{x} = \mathbf{v} + \mathbf{w}$. Explain why $|A^n\mathbf{x}| \approx 2^n$ when $n$ is large.

Solution. Let $n$ be a positive integer. Then by the eigenvalue equation, we have

$$A^n\mathbf{x} = A^n\mathbf{v} + A^n\mathbf{w} = 2^n\mathbf{v} + 2^{-n}\mathbf{w}.$$

When $n$ is large, the first term is much larger than the second. Writing the squared norm as a dot product and distributing, we get

$$|A^n\mathbf{x}|^2 = (2^n\mathbf{v} + 2^{-n}\mathbf{w}) \cdot (2^n\mathbf{v} + 2^{-n}\mathbf{w}) = 4^n|\mathbf{v}|^2 + 2\,\mathbf{v}\cdot\mathbf{w} + 4^{-n}|\mathbf{w}|^2.$$

Since $|\mathbf{v}| = |\mathbf{w}| = 1$ and $|\mathbf{v}\cdot\mathbf{w}| \leq 1$, the first term is $4^n$ while the other two terms stay bounded, so we find that $|A^n\mathbf{x}|^2 \approx 4^n$ and therefore $|A^n\mathbf{x}| \approx 2^n$ when $n$ is large.
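Here is a minimal numerical sketch of this growth, using a hypothetical diagonal matrix whose unit eigenvectors $[1, 0]$ and $[0, 1]$ have eigenvalues $2$ and $\frac{1}{2}$ (any matrix with such an eigenpair structure behaves the same way):

```python
import numpy as np

A = np.diag([2.0, 0.5])      # eigenvalues 2 and 1/2
x = np.array([1.0, 1.0])     # x = v + w

# The ratio |A^n x| / 2^n should approach 1 as n grows:
for n in [1, 5, 10, 20]:
    Anx = np.linalg.matrix_power(A, n) @ x
    print(n, np.linalg.norm(Anx) / 2**n)
```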
If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors, then we can think of the one-dimensional subspaces spanned by each of these vectors as (not necessarily orthogonal) axes along which $A$ acts by scaling.
In matrix terms, we can define $V$ to be the matrix with the eigenvectors of $A$ as its columns. Then from the definition of an eigenpair, we have

$$AV = V\Lambda,$$
where $\Lambda$ is a matrix whose diagonal entries are the eigenvalues (in order corresponding to the columns of $V$) and whose other entries are zero. By the invertible matrix theorem, the assumption that the columns of $V$ are linearly independent implies that $V$ is invertible, so we find that $A = V\Lambda V^{-1}$, where $\Lambda$ is a diagonal matrix, and we say that $A$ is diagonalizable.
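As a short code sketch (reusing the matrix from the earlier exercise), `numpy.linalg.eig` returns the eigenvalues and the eigenvector matrix $V$, and we can verify the factorization $A = V\Lambda V^{-1}$ numerically:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

eigenvalues, V = np.linalg.eig(A)   # columns of V are eigenvectors of A
Lambda = np.diag(eigenvalues)

# A equals V @ Lambda @ V^{-1} up to floating point error:
print(np.allclose(A, V @ Lambda @ np.linalg.inv(V)))  # True
```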
Some matrices are not diagonalizable, because they correspond to geometric transformations that cannot be viewed as scaling along any set of axes. Use this geometric intuition to come up with a matrix which is not diagonalizable.
Solution. Rotation matrices in $\mathbb{R}^2$ (except for $0$-degree and $180$-degree rotations) are not diagonalizable. For example, the $90$-degree rotation matrix

$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$
does not send any nonzero vector to a multiple of itself.
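A quick check: asking NumPy for this matrix's eigenvalues returns a complex conjugate pair, which is another way of seeing that it has no real eigenpairs:

```python
import numpy as np

R = np.array([[0, -1],
              [1,  0]])   # 90-degree rotation

print(np.linalg.eigvals(R))  # [0.+1.j  0.-1.j]: no real eigenvalues
```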
Suppose that we have diagonalized $A$ as $A = V\Lambda V^{-1}$. Find $A^3$.
Let $B$ be another matrix, with eigenpairs $(\mathbf{v}_1, \lambda_1)$ and $(\mathbf{v}_2, \lambda_2)$. Let $\mathbf{x} = \mathbf{v}_1 + \mathbf{v}_2$. Find $B^3\mathbf{x}$.
Solution. We have

$$A^3 = (V\Lambda V^{-1})(V\Lambda V^{-1})(V\Lambda V^{-1}) = V\Lambda^3 V^{-1},$$

because each factor $V^{-1}V$ in the middle is the identity matrix. Similarly,

$$B^3\mathbf{x} = B^3(\mathbf{v}_1 + \mathbf{v}_2).$$

By linearity, $B^3(\mathbf{v}_1 + \mathbf{v}_2) = B^3\mathbf{v}_1 + B^3\mathbf{v}_2$. But $B^3\mathbf{v}_1 = \lambda_1^3\mathbf{v}_1$ and $B^3\mathbf{v}_2 = \lambda_2^3\mathbf{v}_2$, because $\mathbf{v}_1$ and $\mathbf{v}_2$ are eigenvectors of $B$. Therefore,

$$B^3\mathbf{x} = \lambda_1^3\mathbf{v}_1 + \lambda_2^3\mathbf{v}_2.$$
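In code, this observation lets us compute matrix powers by cubing only the diagonal entries of $\Lambda$. A sketch with the example matrix used earlier:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

eigenvalues, V = np.linalg.eig(A)
Lambda_cubed = np.diag(eigenvalues**3)   # cubing a diagonal matrix cubes its entries

A_cubed = V @ Lambda_cubed @ np.linalg.inv(V)
print(np.allclose(A_cubed, np.linalg.matrix_power(A, 3)))  # True
```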
Positive definite matrices
A positive definite matrix $A$ is a symmetric matrix whose eigenvalues are all positive. A positive semidefinite matrix $A$ is a symmetric matrix whose eigenvalues are all nonnegative. Equivalently, a symmetric matrix $A$ is positive semidefinite if $\mathbf{x}^\top A\mathbf{x} \geq 0$ for all $\mathbf{x}$.
Negative definite and negative semidefinite matrices are defined analogously.
Is the sum of two positive definite matrices necessarily positive definite?
Solution. If $A$ and $B$ are positive definite matrices, then $A + B$ is also positive definite because

$$\mathbf{x}^\top(A + B)\mathbf{x} = \mathbf{x}^\top A\mathbf{x} + \mathbf{x}^\top B\mathbf{x} > 0$$

for any nonzero vector $\mathbf{x}$.
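A small numerical sanity check, using two positive definite matrices chosen for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # eigenvalues 1 and 3, so positive definite
B = np.array([[3.0, 0.0],
              [0.0, 1.0]])   # eigenvalues 3 and 1, so positive definite

# eigvalsh handles symmetric matrices; all eigenvalues of A + B are positive:
print(np.linalg.eigvalsh(A + B))  # approximately [2.586, 5.414]
```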
The Gram matrix
If $A$ is an $m \times n$ matrix, then $A^\top A$ is its Gram matrix. The Gram matrix of $A$ is always positive semidefinite:
Let $B = A^\top A$ be a Gram matrix, and let $\mathbf{x}$ be a vector. Find a simpler expression for $\mathbf{x}^\top B\mathbf{x}$.
Using your answer above, explain why a Gram matrix is always positive semidefinite, but not necessarily positive definite.
Solution. The answer is $|A\mathbf{x}|^2$, because

$$\mathbf{x}^\top(A^\top A)\mathbf{x} = (A\mathbf{x})^\top(A\mathbf{x}) = |A\mathbf{x}|^2.$$
From this we see that the Gram matrix is positive semidefinite, because $|A\mathbf{x}|^2 \geq 0$ for all $\mathbf{x}$. Since it is possible to have $A\mathbf{x} = \mathbf{0}$ even if $\mathbf{x} \neq \mathbf{0}$ (for example, when $A$ has linearly dependent columns), we see that the Gram matrix is not necessarily positive definite.
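The following sketch (the example matrix is my own) illustrates both halves of this claim: the Gram matrix's eigenvalues are nonnegative, but linearly dependent columns produce a zero eigenvalue:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # second column is twice the first
gram = A.T @ A

print(np.linalg.eigvalsh(gram))     # [ 0. 25.]: nonnegative, but not all positive

x = np.array([2.0, -1.0])           # a nonzero vector with A x = 0
print(x @ gram @ x)                 # 0.0, so the Gram matrix is not positive definite
```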
Explain why the rank of $A$ is equal to the rank of $A^\top A$. (Hint: consider the null spaces of $A$ and $A^\top A$.)
Solution. If $A\mathbf{x} = \mathbf{0}$, then multiplying both sides on the left by $A^\top$ gives $A^\top A\mathbf{x} = \mathbf{0}$. Therefore, the null space of $A$ is a subset of the null space of $A^\top A$.
Conversely, if $A^\top A\mathbf{x} = \mathbf{0}$, then we can multiply this equation on the left by $\mathbf{x}^\top$ to get

$$\mathbf{x}^\top A^\top A\mathbf{x} = 0,$$

which in turn implies that $|A\mathbf{x}|^2 = 0$. A vector has zero norm only if it's the zero vector, so we conclude that $A\mathbf{x} = \mathbf{0}$. Therefore, the two null spaces are equal.
Since $A$ and $A^\top A$ have the same null space dimension and the same domain $\mathbb{R}^n$, they also have the same rank, by the rank-nullity theorem.
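A quick check of this fact with `numpy.linalg.matrix_rank`, on a rank-deficient example of my choosing:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])     # rank 1: second row is twice the first

print(np.linalg.matrix_rank(A))         # 1
print(np.linalg.matrix_rank(A.T @ A))   # 1, matching rank(A)
```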
The Spectral Theorem
The eigenspace decomposition of a diagonalizable matrix is even easier to understand if the eigenvectors happen to be orthogonal. It turns out that this happens exactly when the matrix is symmetric:
Theorem (Spectral Theorem)
If $A$ is an $n \times n$ symmetric matrix, then $A$ is orthogonally diagonalizable, meaning that $A$ has $n$ eigenvectors which are pairwise orthogonal.
Conversely, every orthogonally diagonalizable matrix is symmetric.
In other words, if $A$ is symmetric, then the one-dimensional subspaces along which $A$ is decomposed form a set of axes for $\mathbb{R}^n$ which are orthogonal. In matrix terms, we have

$$A = V\Lambda V^\top$$

for some orthogonal matrix $V$.
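In code, NumPy's routine for symmetric matrices, `numpy.linalg.eigh`, returns exactly such an orthogonal eigenvector matrix. A minimal sketch with an example symmetric matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric

eigenvalues, V = np.linalg.eigh(A)

print(np.allclose(V.T @ V, np.eye(2)))                  # True: V is orthogonal
print(np.allclose(A, V @ np.diag(eigenvalues) @ V.T))   # True: A = V Lambda V^T
```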
Although it might seem that the spectral theorem is of limited use, since so many matrices are not symmetric, we will see that we can associate any rectangular matrix with a symmetric square matrix, apply the spectral theorem to that matrix, and use the result to extract insight about the original matrix. This beautiful idea is called the singular value decomposition and is the subject of the next section.
Conditioning

Given an invertible matrix $A$, we are often interested in solving a system of the form $A\mathbf{x} = \mathbf{b}$. Our knowledge of $\mathbf{b}$ is seldom perfect, however, so it is important to consider what happens to the solution if we replace $\mathbf{b}$ with a slightly different vector $\widehat{\mathbf{b}}$.
It is possible that a small change in $\mathbf{b}$ leads to a substantial change in the vector $\mathbf{x} = A^{-1}\mathbf{b}$.
- Find an invertible $2 \times 2$ matrix $A$ all of whose entries are between $-2$ and $2$, and a vector $\mathbf{b}$ with entries between $-2$ and $2$, and another vector $\widehat{\mathbf{b}}$ whose components are nearly equal to those of $\mathbf{b}$, for which $A^{-1}\mathbf{b}$ and $A^{-1}\widehat{\mathbf{b}}$ are not very close.
To be concrete, let's say "nearly equal" means "having ratio between 0.99 and 1.01", and let's say that "not very close" means "having a difference whose norm is greater than the norm of either". Find the eigenvalues of your matrix $A$.
Solution. One simple way to do this is to make $\mathbf{b}$ and $\widehat{\mathbf{b}}$ the columns of the matrix, so that $A^{-1}\mathbf{b}$ and $A^{-1}\widehat{\mathbf{b}}$ are the standard basis vectors. For example, with $A = \begin{bmatrix} 1 & 1 \\ 1 & 1.01 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, and $\widehat{\mathbf{b}} = \begin{bmatrix} 1 \\ 1.01 \end{bmatrix}$:

```python
from numpy import array
from numpy.linalg import solve

solve(array([[1, 1], [1, 1.01]]), [1, 1])     # returns [1, 0]
solve(array([[1, 1], [1, 1.01]]), [1, 1.01])  # returns [0, 1]
```

The solutions $[1, 0]$ and $[0, 1]$ are not very close, even though $\mathbf{b}$ and $\widehat{\mathbf{b}}$ are nearly equal.
The eigenvalues of the matrix $\begin{bmatrix} 1 & 1 \\ 1 & 1.01 \end{bmatrix}$ are approximately $0.005$ and $2.005$. In particular, the ratio of the eigenvalues is very large. You will find that the ratio of eigenvalues for your matrix is also large, because a symmetric matrix with a modest maximum eigenvalue ratio is well-conditioned, meaning that small changes in $\mathbf{b}$ do not lead to large changes in $A^{-1}\mathbf{b}$.