I’ve been reading through Goodfellow’s Deep Learning Book, which starts with a review of the applied mathematics and machine learning basics necessary to understand deep learning. While reading through the linear algebra chapter, I realized that I had become a bit rusty working with eigenvalues and matrix decomposition, so I’ve decided to explain them here! It’s been a great exercise to review these topics myself and I hope that you find them helpful as well.
Eigendecomposition
We are often concerned with breaking mathematical objects down into smaller pieces in order to gain a better understanding of their characteristics. A classic example of this is decomposing an integer into its prime factors. For example,

$$12 = 2 \times 2 \times 3.$$

From this representation we can immediately read off useful properties, such as the fact that 12 is divisible by 3 but not by 5.
An eigenvector of a square matrix $A$ is a nonzero vector $v$ such that multiplying it by $A$ changes only its scale:

$$Av = \lambda v.$$

The scalar $\lambda$ is known as the eigenvalue corresponding to this eigenvector.
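To make the definition concrete, here is a minimal NumPy sketch (the matrix $A$ below is just an arbitrary example chosen for illustration) that checks $Av = \lambda v$ for an eigenpair computed by np.linalg.eig:

```python
import numpy as np

# An arbitrary example matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose
# columns are the corresponding eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

lam = eigenvalues[0]    # first eigenvalue
v = eigenvectors[:, 0]  # its eigenvector (first column)

# Multiplying v by A only rescales it: A @ v equals lam * v.
assert np.allclose(A @ v, lam * v)
```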
Finding eigenvalues
With a bit of algebraic manipulation, we can see that

$$Av - \lambda v = 0 \implies (A - \lambda I)v = 0.$$

This form of the equation is useful to us because it can only have a nonzero solution $v$ if the matrix $A - \lambda I$ is non-invertible, and it is known that if a matrix is non-invertible, then its determinant must equal zero. Hence, if we solve the equation

$$\det(A - \lambda I) = 0$$

(this is called the characteristic equation) for $\lambda$, we obtain the eigenvalues of $A$.
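As a quick sketch of this procedure in NumPy (the 2×2 matrix is again an arbitrary example), np.poly returns the coefficients of a square matrix's characteristic polynomial, and np.roots solves for its roots, recovering the same eigenvalues that np.linalg.eigvals computes directly:

```python
import numpy as np

# Arbitrary example matrix, for illustration only.
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# Coefficients of the characteristic polynomial det(lambda*I - A),
# highest degree first; here: lambda^2 - 7*lambda + 10.
coeffs = np.poly(A)

# The roots of the characteristic polynomial are the eigenvalues.
eigenvalues = np.roots(coeffs)
print(np.sort(eigenvalues))           # [2. 5.]
print(np.sort(np.linalg.eigvals(A)))  # [2. 5.]
```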
Deriving the eigendecomposition matrix
We can decompose the matrix $A$ into the product

$$A = V \operatorname{diag}(\lambda) V^{-1}.$$

Now we will show how we can derive this equation. Suppose that we have an $n \times n$ matrix $A$ with $n$ linearly independent eigenvectors $v^{(1)}, \dots, v^{(n)}$ and corresponding eigenvalues $\lambda_1, \dots, \lambda_n$.

We now define the matrix $V$ whose columns are the eigenvectors, $V = [v^{(1)}, \dots, v^{(n)}]$, and the vector $\lambda = [\lambda_1, \dots, \lambda_n]^\top$ collecting the eigenvalues.

With all of these pieces, we can see how to decompose the matrix. Applying $A$ to each column of $V$ gives $Av^{(i)} = \lambda_i v^{(i)}$, so

$$AV = V \operatorname{diag}(\lambda).$$

Because the eigenvectors are linearly independent, $V$ is invertible, and right-multiplying both sides by $V^{-1}$ yields

$$A = V \operatorname{diag}(\lambda) V^{-1}.$$
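To see the derivation numerically, here is a short NumPy sketch (reusing an arbitrary example matrix): the columns of $V$ are the eigenvectors returned by np.linalg.eig, and $V \operatorname{diag}(\lambda) V^{-1}$ reproduces $A$:

```python
import numpy as np

# Arbitrary example matrix with linearly independent eigenvectors.
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

eigenvalues, V = np.linalg.eig(A)  # columns of V are eigenvectors

# A V = V diag(lambda): each column of V is rescaled by its eigenvalue.
assert np.allclose(A @ V, V @ np.diag(eigenvalues))

# Right-multiplying by V^{-1} reconstructs A.
A_reconstructed = V @ np.diag(eigenvalues) @ np.linalg.inv(V)
assert np.allclose(A, A_reconstructed)
```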
This decomposition allows us to analyze certain properties of the matrix. For example, we can conclude that the matrix is singular if and only if any of the eigenvalues are zero. The benefits of this kind of decomposition, however, are limited. The glaring issue is that the eigendecomposition is only defined for square matrices (and even then it only exists when the eigenvectors are linearly independent). For this reason, in practice, we usually resort to singular value decomposition instead, which is defined for all real matrices and gives us the same kind of information about the matrix.
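Both observations are easy to check numerically. In the sketch below (the matrices are again arbitrary examples), a matrix with a zero eigenvalue has zero determinant, while np.linalg.svd handles a non-square matrix for which no eigendecomposition exists:

```python
import numpy as np

# Singular example: the second row is twice the first, so one
# eigenvalue is zero and the determinant vanishes.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.eigvals(S))  # contains 0 -> S is singular
print(np.linalg.det(S))      # 0.0

# SVD is defined for any real matrix, including non-square ones.
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])  # 2x3: no eigendecomposition exists
U, singular_values, Vt = np.linalg.svd(B)
print(singular_values)  # all nonzero -> B has full rank
```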
Resources
In the coming months, I hope to learn more about the applications of matrix decomposition in the context of Deep Learning. My understanding is that we often want to decompose the weight matrices associated with neural networks to analyze how a model is learning. This paper delineates the usefulness of SVD in practice. Additionally, Charles Martin writes on the analysis of weight-matrix eigenvalues in this accessible article. Finally, if you are looking for a different perspective on matrix decomposition, I recommend hadrienj's series of articles on linear algebra for Deep Learning, which follows along with Goodfellow's book.