Matrix algebra
Linear combination
[Etymological note: The plural of matrix is matrices. In Latin, the root of the word is matric-; one gets the singular by adding -s and changing cs to x; the plural by adding -es. In case you were wondering.]

Matrices of the same size can be multiplied by scalars and added. In other words, we can take linear combinations of matrices of the same size. For example, if A and B are the m×n matrices $(a_{ij})$ and $(b_{ij})$ respectively, and cx + dy is a linear polynomial, then cA + dB is the m×n matrix
$(ca_{ij} + db_{ij})$.
With regard to addition and multiplication by scalars, matrices behave like scalars themselves. For example, among m×n matrices, there is a zero-matrix, the matrix whose every entry is 0. Adding the zero-matrix has no effect, and for every matrix A, there is a matrix -A, namely (-1)A, such that A + (-A) is the zero-matrix.
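As a quick illustration (a minimal NumPy sketch, not part of the original text), linear combinations of same-sized matrices are computed entry by entry:

```python
import numpy as np

# Two 2x3 matrices of the same size, and scalars c and d.
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[0, 1, 0],
              [1, 0, 1]])
c, d = 2, -1

# The linear combination cA + dB has (i, j) entry c*a_ij + d*b_ij.
print(c * A + d * B)

# The zero-matrix is the additive identity, and (-1)A is the additive inverse of A.
Z = np.zeros_like(A)
assert np.array_equal(A + Z, A)
assert np.array_equal(A + (-1) * A, Z)
```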
Multiplication
Matrices of the appropriate sizes can be multiplied together. If A is an m×r matrix $(a_{ik})$, and B is an r×n matrix $(b_{kj})$, then AB is the m×n matrix
$(a_{i1}b_{1j} + a_{i2}b_{2j} + \dots + a_{ir}b_{rj})$.
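To make the definition concrete (a sketch in NumPy; the function name is ours, not from the text), the defining sum can be computed directly and checked against the library's product:

```python
import numpy as np

def matmul(A, B):
    """Multiply an m-by-r matrix A by an r-by-n matrix B using the defining sum:
    the (i, j) entry of AB is a_i1*b_1j + a_i2*b_2j + ... + a_ir*b_rj."""
    m, r = A.shape
    r2, n = B.shape
    assert r == r2, "inner sizes must agree"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(r))
    return C

A = np.array([[1, 2], [3, 4], [5, 6]])   # 3x2
B = np.array([[1, 0, -1], [2, 1, 0]])    # 2x3
assert np.array_equal(matmul(A, B), A @ B)  # agrees with NumPy's @ product
```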
Note, from this paradigm, the respects in which multiplication of matrices is not like multiplication of scalars:
- the product BA is not defined, unless m = n;
- if m = n, then BA is not the same size as AB, unless m = r = n;
- even if m = r = n, the products BA and AB may not be equal.
In other respects, matrix multiplication does behave like scalar multiplication (as shown in the sketch after this list):
- The way in which a product of three matrices is bracketed does not matter: if the product A(BC) is defined, then so is (AB)C, and the two resulting matrices are equal.
- Multiplication distributes over addition: if A(B + C) is defined, then so is AB + AC, and these two matrices are equal.
- For every n, there is an n×n matrix called the identity matrix and denoted $I_n$ or I; multiplication by this, when defined, has no effect: AI = A and IB = B. The matrix I is $(e_{ij})$, where $e_{ij}$ is 1 if i = j, and otherwise is 0.
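Here is a small numerical check of these points (a NumPy sketch; the particular matrices are illustrative):

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[1, 0], [1, 1]])
C = np.array([[2, 0], [0, 3]])
I = np.eye(2, dtype=int)  # the 2x2 identity matrix

print(np.array_equal(A @ B, B @ A))                # False: AB and BA differ
assert np.array_equal(A @ (B @ C), (A @ B) @ C)    # bracketing does not matter
assert np.array_equal(A @ (B + C), A @ B + A @ C)  # distributivity
assert np.array_equal(A @ I, A) and np.array_equal(I @ B, B)  # identity
```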
Note in particular that for a square (n×n) matrix A, if r is a non-negative integer, then the power
$A^r$
is well-defined. In particular, $A^0 = I$. So if p is a polynomial in one variable, say
$p(x) = a_0 + a_1x + a_2x^2 + \dots + a_nx^n$,
then p(A) is defined, and $p(A) = a_0I + a_1A + a_2A^2 + \dots + a_nA^n$.
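For instance (a sketch; the helper name and the sample polynomial are ours), p(A) can be evaluated by replacing each power of x with the corresponding power of A, the constant term becoming $a_0I$:

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """Evaluate p(A) = a_0*I + a_1*A + ... + a_n*A^n for a square matrix A,
    where coeffs = [a_0, a_1, ..., a_n]."""
    n = A.shape[0]
    result = np.zeros((n, n))
    power = np.eye(n)        # A^0 = I
    for a in coeffs:
        result += a * power
        power = power @ A    # next power of A
    return result

A = np.array([[2.0, 1.0], [0.0, 3.0]])
# With p(x) = 1 - 2x + x^2 = (x - 1)^2, we get p(A) = (A - I)^2.
expected = (A - np.eye(2)) @ (A - np.eye(2))
assert np.allclose(poly_of_matrix([1, -2, 1], A), expected)
```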
Transposition
If A is an m×n matrix $(a_{ij})_{ij}$, then the transpose of A is the n×m matrix
$(a_{ij})_{ji}$;
it is denoted $A^T$. The transpose of a product is the product of the transposes, in reverse order:
$(AB)^T = B^TA^T$.
[The transpose of A might also be denoted $(a_{ji})_{ij}$; a disadvantage of this notation is that here, i ranges between 1 and n, whereas in the notation for A itself, i ranges between 1 and m.]
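A quick numerical check of the reversal rule (a NumPy sketch, not part of the original):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])    # 2x3
B = np.array([[1, 0], [0, 1], [1, 1]])  # 3x2

# .T swaps rows and columns; the transposes multiply in reverse order.
assert np.array_equal((A @ B).T, B.T @ A.T)
```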
Linear systems reconsidered
A matrix with a single row or column is called a row- or a column-matrix respectively; we shall also call it a vector. A column-vector is the transpose of a row-vector, and to write it that way is typographically convenient. So,
$(x_1\ x_2\ \dots\ x_n)^T$
is a column-vector. Denote this vector by x. If also b is the column-vector $(b_1\ b_2\ \dots\ b_m)^T$, and A is the m×n matrix $(a_{ij})$, then the general linear system given earlier can be written thus:
Ax = b.
Note the following way to interpret the matrix-product Ax. We can write A as
$(a_1\ a_2\ \dots\ a_n)$,
where each $a_j$ is the column-vector $(a_{1j}\ a_{2j}\ \dots\ a_{mj})^T$; then the product Ax is just
$x_1a_1 + x_2a_2 + \dots + x_na_n$.
In other words, Ax is a linear combination of the columns of A. We may say: the system Ax = b has a solution if and only if b is a linear combination of the columns of A.
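To see this interpretation concretely (a NumPy sketch; the particular matrix is illustrative):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # 3x2, with columns a_1 and a_2
x = np.array([2.0, -1.0])    # the column-vector x = (x_1 x_2)^T

# Ax is the linear combination x_1*a_1 + x_2*a_2 of the columns of A.
combination = x[0] * A[:, 0] + x[1] * A[:, 1]
assert np.allclose(A @ x, combination)
```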
Inversion
With square, n×n matrices A, multiplication on either side by the identity-matrix I has no effect:
IA = AI = A.
If there is a matrix B such that
AB = BA = I,
then B is the (multiplicative) inverse of A, denoted $A^{-1}$. As the notation suggests, inverses are unique: A has at most one inverse. It might not have an inverse at all. A matrix with an inverse is invertible. If A and B are invertible, then so is their product, and the inverse of the product is the product of the inverses, in reverse order:
$(AB)^{-1} = B^{-1}A^{-1}$.
There is a formula for the inverse of a matrix, but it is impractical, except for 2×2 matrices. We shall develop a technique for finding inverses.
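In the 2×2 case the formula is manageable; here is a sketch of it (the standard ad - bc formula; the function name is ours):

```python
import numpy as np

def inverse_2x2(A):
    """Invert a 2x2 matrix [[a, b], [c, d]] by the standard formula
    A^{-1} = (1/(ad - bc)) [[d, -b], [-c, a]], defined when ad - bc != 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[2.0, 1.0], [5.0, 3.0]])  # ad - bc = 1
assert np.allclose(A @ inverse_2x2(A), np.eye(2))
```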
Suppose A is invertible. Then the linear system Ax = b has a unique solution, namely $A^{-1}b$. Therefore, elementary row-operations reduce A to a matrix with a pivot in every column, and hence in every row, since A is square; so A reduces to the identity, I.
The result of applying an elementary row-operation to I is an elementary matrix, say E. Then the result of applying the same operation to A is the product EA. Thus, if A is invertible, then there is a sequence of elementary matrices $E_i$ such that
$E_r \cdots E_2E_1A = I$.
Note that elementary matrices are invertible, so the equation gives
$A = E_1^{-1}E_2^{-1} \cdots E_r^{-1}$;
from this equation follows
$A^{-1} = E_r \cdots E_2E_1$.
Now we have our technique: if we want to know whether a square matrix A is invertible, we set I next to A, forming an n×2n matrix (A I). Row-reducing this matrix multiplies both blocks on the left by $E_r \cdots E_2E_1$; so A is invertible if and only if the matrix reduces to one of the form (I B), and in that case B is $A^{-1}$.
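The technique translates directly into code (a minimal Gauss-Jordan sketch with partial pivoting; the function name and tolerance are ours): row-reduce (A I) and read $A^{-1}$ off the right half:

```python
import numpy as np

def invert(A, tol=1e-12):
    """Invert a square matrix by row-reducing the augmented matrix (A I) to (I A^{-1})."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])  # the n x 2n matrix (A I)
    for col in range(n):
        # Pick as pivot the largest entry in this column (for numerical stability).
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if abs(M[pivot, col]) < tol:
            raise ValueError("matrix is not invertible")
        M[[col, pivot]] = M[[pivot, col]]  # swap the pivot row into place
        M[col] /= M[col, col]              # scale so the pivot entry is 1
        for row in range(n):               # clear the rest of the column
            if row != col:
                M[row] -= M[row, col] * M[col]
    return M[:, n:]                        # the right half is now A^{-1}

A = np.array([[2.0, 1.0], [5.0, 3.0]])
assert np.allclose(invert(A), np.linalg.inv(A))
assert np.allclose(A @ invert(A), np.eye(2))
```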
We have sketched an argument that a square matrix A is invertible if and only if the linear system Ax = b has a unique solution. A consequence is that A is invertible provided BA = I for some matrix B; that B is then $A^{-1}$.
The transpose of an invertible matrix is invertible, and the inverse of the transpose is the transpose of the inverse:
$(A^T)^{-1} = (A^{-1})^T$.
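Again a one-line numerical check (NumPy sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0], [5.0, 3.0]])
# The inverse of the transpose equals the transpose of the inverse.
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)
```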
Special matrices
Let A be a square matrix, say A = $(a_{ij})$. Then A is:
- upper-triangular, if $a_{ij} = 0$ when j < i;
- lower-triangular, if $A^T$ is upper-triangular;
- diagonal, if it is both upper- and lower-triangular;
- symmetric, if $A = A^T$.
A linear combination of upper-triangular matrices is upper-triangular. So is a product of upper-triangular matrices.
A linear combination of symmetric matrices is symmetric. If A and B are two symmetric matrices, then their product AB is symmetric if and only if A and B commute (that is, AB = BA).
For any matrix A (not just a square matrix), the product $A^TA$ is symmetric: indeed, $(A^TA)^T = A^T(A^T)^T = A^TA$.
Suppose D is a diagonal matrix, with diagonal entries $d_1, d_2, \dots, d_n$. Suppose A is an m×n matrix $(a_1\ a_2\ \dots\ a_n)$. Then
$AD = (d_1a_1\ d_2a_2\ \dots\ d_na_n)$.
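These facts are easy to test numerically (a NumPy sketch; the matrices are illustrative):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])    # 3x2, columns a_1 and a_2
D = np.diag([10.0, -1.0])     # diagonal matrix with entries d_1 = 10, d_2 = -1

# Multiplying on the right by D scales the j-th column of A by d_j.
AD = A @ D
assert np.allclose(AD[:, 0], 10.0 * A[:, 0])
assert np.allclose(AD[:, 1], -1.0 * A[:, 1])

# For any A, the product A^T A is symmetric.
S = A.T @ A
assert np.allclose(S, S.T)
```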
(Next part: Determinants.)