Diagonalization

As matrices go, diagonal matrices are easy to work with. Sometimes this ease can be exploited with matrices that are not themselves diagonal.

Let A be a square, n×n matrix. If x is a scalar, then the corresponding eigenspace of A is the nullspace of the n×n matrix xI - A. Evidently the eigenspace is the trivial vector space unless det(xI - A) = 0.

The polynomial

det(xI - A)

(which is of degree n in the variable x) is the characteristic polynomial of A; its zeros are the eigenvalues of A. An eigenvector is a nonzero member of an eigenspace.
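For a concrete instance, the 2×2 case can be worked out directly, since there det(xI - A) = x² - (trace of A)x + det(A). The following is a minimal sketch in plain Python; the matrix A and the helper names char_poly_2x2 and real_eigenvalues_2x2 are chosen here for illustration and are not part of the text.

```python
import math

def char_poly_2x2(A):
    """Return coefficients (1, b, c) such that det(xI - A) = x^2 + b*x + c."""
    (a11, a12), (a21, a22) = A
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    return (1, -trace, det)

def real_eigenvalues_2x2(A):
    """Real zeros of the characteristic polynomial, in increasing order."""
    _, b, c = char_poly_2x2(A)
    disc = b * b - 4 * c
    if disc < 0:
        return []          # no real eigenvalues
    r = math.sqrt(disc)
    return sorted({(-b - r) / 2, (-b + r) / 2})

# Illustrative matrix (an assumption of this example).
A = [[4, 1], [2, 3]]
coeffs = char_poly_2x2(A)        # x^2 - 7x + 10 = (x - 2)(x - 5)
eigs = real_eigenvalues_2x2(A)
print(coeffs, eigs)
```

Here the characteristic polynomial factors as (x - 2)(x - 5), so the eigenvalues of this particular A are 2 and 5.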

The square matrix A is diagonalizable if

AP = PD

for some invertible matrix P, where D is a diagonal matrix; in that case A = PDP^-1.

Theorem. Let A be an n×n matrix. The following are equivalent:

A has n linearly independent eigenvectors.
A is diagonalizable.

To prove this theorem, suppose P is the matrix

(p₁ p₂ ... p_n),

where each column p_i is in Rⁿ. Then

AP = (Ap₁ Ap₂ ... Ap_n)

If D is a diagonal matrix with diagonal entries x_i, and e_i denotes the i-th column of the identity matrix I, then

D = (x₁e₁ x₂e₂ ... x_ne_n) = (x₁e₁ x₂e₂ ... x_ne_n)^T,

and therefore

PD = (x₁p₁ x₂p₂ ... x_np_n).

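Therefore, AP = PD if and only if Ap_i = x_ip_i for each i from 1 to n inclusive. Moreover, P is invertible if and only if its columns p₁, p₂, ..., p_n are linearly independent; in that case each p_i is nonzero, and so is an eigenvector of A. Hence A is diagonalizable if and only if it has n linearly independent eigenvectors.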
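The column-by-column computation can be checked on a small example. The sketch below, in plain Python, uses an illustrative matrix A with eigenvectors (1, 1)^T and (1, -2)^T for the eigenvalues 5 and 2; the matrix and the helper matmul are assumptions of the example, not part of the text.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Illustrative data (assumed for this example).
A = [[4, 1], [2, 3]]
P = [[1, 1], [1, -2]]    # columns p1 = (1, 1)^T and p2 = (1, -2)^T
D = [[5, 0], [0, 2]]     # diagonal entries x1 = 5, x2 = 2

AP = matmul(A, P)        # columns are A p1 and A p2
PD = matmul(P, D)        # columns are x1 p1 and x2 p2

# P is invertible because its determinant is nonzero.
det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
print(AP, PD, det_P)
```

Both products come out as the matrix with rows (5, 2) and (5, -4), and det(P) = -3, so this A is diagonalizable with A = PDP^-1.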

Application to differential equations

As noted earlier, a linear polynomial is the result of applying to variables the operations of addition and scalar multiplication; these operations obey the same algebraic rules as they do in vector spaces. In a linear differential polynomial, the operation of differentiation may be applied as well: from a polynomial f, the operation produces a polynomial f' (read "eff-prime"), and it satisfies the following algebraic rules:

(af)' = af' when a is a scalar;
(f + g)' = f' + g'.

The differential equation y' = ay is equivalent to the homogeneous linear differential equation

y' - ay = 0 ;

from calculus its solution is known to be y = ce^(ax) (also written y = c exp(ax)). Here c is the value of y when x = 0; so we can write the solution thus:

y = y(0)e^(ax) .

Therefore the system of equations

y₁' - a₁y₁ = 0, y₂' - a₂y₂ = 0, ..., y_n' - a_ny_n = 0

has the solution

y₁ = y₁(0)e^(a₁x), y₂ = y₂(0)e^(a₂x), ..., y_n = y_n(0)e^(a_nx) .

We can write the last conclusion in matrix form:

The equation y' - Dy = 0 has the solution y = e^(xD)y(0), where

y is the vector (y₁ y₂ ... y_n)^T of variables;
D is the diagonal matrix whose diagonal entries are a₁, a₂, ..., a_n respectively;
e^(xD) is the diagonal matrix whose diagonal entries are e^(a₁x), e^(a₂x), ..., e^(a_nx) respectively.
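Because D is diagonal, the solution can be computed one coordinate at a time. The following sketch in plain Python evaluates y = e^(xD)y(0) and checks numerically, by a central difference, that y' - Dy is close to zero; the diagonal entries, initial values, and the helper name solve_diagonal are assumptions made for the example.

```python
import math

def solve_diagonal(a, y0, x):
    """Entries y_i(x) = y_i(0) * exp(a_i * x) of y = e^(xD) y(0)."""
    return [y0_i * math.exp(a_i * x) for a_i, y0_i in zip(a, y0)]

# Illustrative data (assumed for this example).
a = [2.0, -1.0, 0.5]     # diagonal entries of D
y0 = [1.0, 3.0, -2.0]    # initial values y(0)

x, h = 0.7, 1e-6
y = solve_diagonal(a, y0, x)

# Approximate y'(x) by a central difference and compare with D y(x).
y_plus = solve_diagonal(a, y0, x + h)
y_minus = solve_diagonal(a, y0, x - h)
dy = [(p - m) / (2 * h) for p, m in zip(y_plus, y_minus)]
residual = max(abs(d - a_i * y_i) for d, a_i, y_i in zip(dy, a, y))
print(residual)
```

The residual is not exactly zero, since the derivative is only approximated, but it should be very small.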

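More generally, we can form the differential system y' - Ay = 0, where A is an arbitrary n×n matrix. If this matrix is diagonalizable, say A = PDP^-1, then the substitution z = P^-1y turns the system into z' - Dz = 0, whose solution is z = e^(xD)z(0); since y = Pz, the original system has the solution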

y = Pe^(xD)P^-1y(0) .

If one is required to solve such a system, given numerical values for the entries of A, then one need not actually calculate the inverse of P unless the initial conditions y(0) are specified; even then, to find the vector P^-1y(0), one need only solve for c the equation

Pc = y(0) ,

and then the solution of the original system is y = Pe^(xD)c. (Even if the matrix A is not diagonalizable, the original system is still soluble, but by a more complicated procedure.)
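The procedure just described can be sketched for a 2×2 system in plain Python. The matrix A, its eigenvector matrix P, the eigenvalues, and the initial vector below are illustrative assumptions, as are the helper names solve_2x2 and solution; the equation Pc = y(0) is solved by Cramer's rule, which is adequate only for so small a system.

```python
import math

def solve_2x2(P, b):
    """Solve Pc = b for an invertible 2x2 matrix P by Cramer's rule."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    if det == 0:
        raise ValueError("P is not invertible")
    c1 = (b[0] * P[1][1] - P[0][1] * b[1]) / det
    c2 = (P[0][0] * b[1] - b[0] * P[1][0]) / det
    return [c1, c2]

def solution(P, eigenvalues, c, x):
    """Evaluate y(x) = P e^(xD) c, where D has the given diagonal entries."""
    scaled = [c_j * math.exp(lam * x) for lam, c_j in zip(eigenvalues, c)]
    return [sum(row[j] * scaled[j] for j in range(len(scaled))) for row in P]

# Illustrative data (assumed for this example): A P = P D with D = diag(5, 2).
A = [[4, 1], [2, 3]]
P = [[1, 1], [1, -2]]
eigenvalues = [5, 2]
y0 = [3.0, 0.0]

c = solve_2x2(P, y0)     # no inverse of P is ever formed

# Check y' = Ay at one point, approximating y' by a central difference.
x, h = 0.3, 1e-6
y = solution(P, eigenvalues, c, x)
y_plus = solution(P, eigenvalues, c, x + h)
y_minus = solution(P, eigenvalues, c, x - h)
dy = [(p - m) / (2 * h) for p, m in zip(y_plus, y_minus)]
Ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
residual = max(abs(d - v) for d, v in zip(dy, Ay))
print(c, residual)
```

For these data c = (2, 1)^T, so the solution is y = 2e^(5x)(1, 1)^T + e^(2x)(1, -2)^T, which does take the value y(0) = (3, 0)^T at x = 0.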
