Matrix algebra
Linear combination
[Etymological note: The plural of matrix is matrices. In Latin, the root of the word is matric-; one gets the singular by adding -s and changing cs to x; the plural by adding -es. In case you were wondering.]
Matrices of the same size can be multiplied by scalars and added. In other words, we can take linear combinations of matrices of the same size. For example, if $A$ and $B$ are the $m \times n$ matrices $(a_{ij})$ and $(b_{ij})$ respectively, and $cx + dy$ is a linear polynomial, then $cA + dB$ is the $m \times n$ matrix
$$(c a_{ij} + d b_{ij}).$$
With regard to addition and multiplication by scalars, matrices behave like scalars themselves. For example, among $m \times n$ matrices, there is a zero-matrix, the matrix whose every entry is $0$. Adding the zero-matrix has no effect, and for every matrix $A$, there is a matrix $-A$, namely $(-1)A$, such that $A + (-A)$ is the zero-matrix.
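As a small illustration of linear combinations, here is a minimal Python sketch. It assumes a matrix is stored as a list of rows; the function name `linear_combination` is chosen here for illustration and does not come from the text.

```python
def linear_combination(c, A, d, B):
    """Return cA + dB for two matrices of the same size (lists of rows)."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A and B must have the same size")
    # Entry (i, j) of the result is c*a_ij + d*b_ij.
    return [[c * a + d * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2, 3], [4, 5, 6]]
B = [[0, 1, 0], [1, 0, 1]]

C = linear_combination(2, A, -3, B)     # 2A - 3B
zero = linear_combination(1, A, -1, A)  # A + (-1)A, the zero-matrix
```

Here `C` is `[[2, 1, 6], [5, 10, 9]]`, and `zero` has every entry equal to 0.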
Multiplication
Matrices of the appropriate sizes can be multiplied together. If $A$ is an $m \times r$ matrix $(a_{ik})$, and $B$ is an $r \times n$ matrix $(b_{kj})$, then $AB$ is the $m \times n$ matrix
$$(a_{i1}b_{1j} + a_{i2}b_{2j} + \dots + a_{ir}b_{rj}).$$
Note, from this definition, the respects in which multiplication of matrices is not like multiplication of scalars:
- the product $BA$ is not defined, unless $m = n$;
- if $m = n$, then $BA$ is not the same size as $AB$, unless $m = r = n$;
- even if $m = r = n$, the products $BA$ and $AB$ may not be equal.
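The definition of the product, and the ways in which the order of the factors matters, can be tried out with a short Python sketch. It assumes matrices are lists of rows; the helper name `matmul` is an illustrative choice.

```python
def matmul(A, B):
    """Product of an m-by-r matrix A and an r-by-n matrix B (lists of rows)."""
    m, r, n = len(A), len(B), len(B[0])
    if any(len(row) != r for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    # Entry (i, j) is a_i1*b_1j + a_i2*b_2j + ... + a_ir*b_rj.
    return [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(n)]
            for i in range(m)]

# Square case: both products are defined and have the same size, yet differ.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)   # [[2, 1], [4, 3]]
BA = matmul(B, A)   # [[3, 4], [1, 2]]

# Case m = n = 1, r = 2: both products are defined but have different sizes.
R = [[1, 2]]        # 1-by-2
K = [[3], [4]]      # 2-by-1
RK = matmul(R, K)   # 1-by-1
KR = matmul(K, R)   # 2-by-2
```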
In short, the order of matrices in a product matters. On the other hand, multiplication of matrices is like multiplication of scalars in other respects:
- The grouping of factors in a product does not matter (multiplication is associative): if the product $A(BC)$ is defined, then so is $(AB)C$, and the two resulting matrices are equal.
- Multiplication distributes over addition: if $A(B + C)$ is defined, then so is $AB + AC$, and these two matrices are equal.
- For every $n$, there is an $n \times n$ matrix called the identity matrix and denoted by $I_n$ or $I$; multiplication by this, when defined, has no effect: $AI = A$ and $IB = B$. The matrix $I$ is $(e_{ij})$, where $e_{ij}$ is $1$ if $i = j$, and otherwise is $0$.
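The three properties just listed can be spot-checked on particular matrices. The sketch below is only a numerical check on one example, not a proof; the helper names `matmul`, `add` and `identity` are assumptions made for illustration.

```python
def matmul(A, B):
    """Matrix product for lists of rows; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    """The n-by-n matrix (e_ij) with e_ij = 1 if i = j and 0 otherwise."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4], [5, 6]]   # 3-by-2
B = [[1, 0, 2], [0, 1, 3]]     # 2-by-3
C = [[1, 1], [0, 2], [1, 0]]   # 3-by-2
D = [[2, 1, 0], [0, 1, 1]]     # 2-by-3

associative = matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)
distributive = matmul(A, add(B, D)) == add(matmul(A, B), matmul(A, D))
identity_right = matmul(A, identity(2)) == A   # A I = A
identity_left = matmul(identity(3), A) == A    # I A = A
```

Note that two different identity matrices appear: $A$ is 3-by-2, so it is multiplied by $I_2$ on the right and by $I_3$ on the left.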
Note in particular that for a square ($n \times n$) matrix $A$, if $r$ is a non-negative integer, then the power
$$A^r$$
is well-defined. In particular, $A^0 = I$. So if $p$ is a polynomial in one variable, say
$$p(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n,$$
then $p(A)$ is defined, and $p(A) = a_0 I + a_1 A + a_2 A^2 + \dots + a_n A^n$.
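To make the evaluation of a polynomial at a matrix concrete, here is a minimal Python sketch. The function names and the convention of passing the coefficients as a list `[a0, a1, ..., an]` are assumptions made for this example.

```python
def matmul(A, B):
    """Matrix product for lists of rows; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_power(A, r):
    """A to the power r for a square matrix A and an integer r >= 0."""
    result = identity(len(A))          # the power with exponent 0 is I
    for _ in range(r):
        result = matmul(result, A)
    return result

def poly_at_matrix(coeffs, A):
    """Given coeffs = [a0, a1, ..., an], return a0*I + a1*A + ... + an*A^n."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    power = identity(n)                # current power of A, starting at I
    for a in coeffs:
        total = [[t + a * p for t, p in zip(tr, pr)]
                 for tr, pr in zip(total, power)]
        power = matmul(power, A)
    return total

A = [[1, 1], [0, 1]]
p_of_A = poly_at_matrix([2, -3, 1], A)   # p(x) = 2 - 3x + x^2
```

For this $A$ the result is `[[0, -1], [0, 0]]`, which agrees with the factored form $(A - I)(A - 2I)$.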
Transposition
If $A$ is an $m \times n$ matrix $(a_{ij})_{ij}$, then the transpose of $A$ is the $n \times m$ matrix
$$(a_{ij})_{ji};$$
it is denoted by $A^T$. The transpose of a product is the product of the transposes, in reverse order:
$$(AB)^T = B^T A^T.$$
[The transpose of $A$ might also be denoted by $(a_{ji})_{ij}$; a disadvantage of this notation is that here, $i$ ranges between $1$ and $n$, whereas in the notation for $A$ itself, $i$ ranges between $1$ and $m$.]
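The rule for the transpose of a product can be checked on an example. This is a sketch with matrices stored as lists of rows; `transpose` and `matmul` are illustrative helper names.

```python
def transpose(A):
    """Rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product for lists of rows; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3], [4, 5, 6]]      # 2-by-3
B = [[1, 0], [2, 1], [0, 3]]    # 3-by-2

At = transpose(A)               # 3-by-2: [[1, 4], [2, 5], [3, 6]]
lhs = transpose(matmul(A, B))                # transpose of the product AB
rhs = matmul(transpose(B), transpose(A))     # transposes multiplied in reverse order
```

The sizes also show why the order must reverse: the transpose of $A$ is 3-by-2 and the transpose of $B$ is 2-by-3, so multiplying them in the original order would give a 3-by-3 matrix, while the transpose of $AB$ is 2-by-2.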
Linear systems reconsidered
A matrix with a single row or column is called a row- or a column-matrix respectively; we shall also call it a vector. A column-vector is the transpose of a row-vector, and to write it that way is typographically convenient. So,
$$(x_1 \; x_2 \; \dots \; x_n)^T$$
is a column-vector. Denote this vector by $\mathbf{x}$. If also $\mathbf{b}$ is the column-vector $(b_1 \; b_2 \; \dots \; b_m)^T$, and $A$ is the $m \times n$ matrix $(a_{ij})$, then the general linear system given earlier can be written thus:
$$A\mathbf{x} = \mathbf{b}.$$
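Note the following way to interpret the matrix-product $A\mathbf{x}$. We can write $A$ as
$$(\mathbf{a}_1 \; \mathbf{a}_2 \; \dots \; \mathbf{a}_n),$$
where each $\mathbf{a}_j$ is the column-vector $(a_{1j} \; a_{2j} \; \dots \; a_{mj})^T$; then the product $A\mathbf{x}$ is just
$$x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \dots + x_n\mathbf{a}_n.$$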
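In other words, $A\mathbf{x}$ is a linear combination of the columns of $A$. We may say: the system $A\mathbf{x} = \mathbf{b}$ has a solution if and only if $\mathbf{b}$ is a linear combination of the columns of $A$.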
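The column interpretation of the product can be compared with the usual row-by-row computation. In this sketch a column-vector is stored simply as a flat list of its entries, which is a simplifying assumption; the function names are illustrative.

```python
def matvec(A, x):
    """Row-by-row product: entry i is a_i1*x_1 + ... + a_in*x_n."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def column_combination(A, x):
    """The same product, built up as x_1*a_1 + x_2*a_2 + ... + x_n*a_n."""
    m = len(A)
    result = [0] * m
    for j, xj in enumerate(x):
        column = [A[i][j] for i in range(m)]   # j-th column of A
        result = [r + xj * c for r, c in zip(result, column)]
    return result

A = [[1, 2, 0],
     [0, 1, 3]]
x = [2, -1, 1]

by_rows = matvec(A, x)
by_columns = column_combination(A, x)   # 2*(1,0) - 1*(2,1) + 1*(0,3)
```

Both computations give `[0, 2]`.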
Inversion
With square, $n \times n$ matrices $A$, multiplication on either side by the identity-matrix $I$ has no effect:
$$IA = AI = A.$$
If there is a matrix $B$ such that
$$AB = BA = I,$$
then $B$ is the (multiplicative) inverse of $A$, denoted by $A^{-1}$. As the notation suggests, inverses are unique: $A$ has at most one inverse. It might not have an inverse at all. A matrix with an inverse is invertible. The inverse of a product is the product of the inverses, in reverse order, if the product has an inverse:
$$(AB)^{-1} = B^{-1}A^{-1}.$$
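To see why the order reverses, one can check directly that the proposed inverse works. The following short computation assumes that $A$ and $B$ are both invertible and uses only associativity:

```latex
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I,
\qquad
(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I.
```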
There is a formula for the inverse of a matrix, but it is impractical, except for 2×2 matrices. We shall develop a technique for finding inverses.
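For reference, the 2×2 case of the formula can be sketched in Python. The quantity `ad - bc` used below is the determinant, which is not introduced until later, so this is offered only as an illustration; the function name `inverse_2x2` is an assumption, and the entries are assumed to be integers so that `Fraction` gives exact results.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]] with integer entries, or None if there is none."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None                      # no inverse exists
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[2, 1], [5, 3]]
A_inv = inverse_2x2(A)                   # equals [[3, -1], [-5, 2]]
no_inverse = inverse_2x2([[1, 2], [2, 4]])
```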
Suppose $A$ is invertible. Then the linear system $A\mathbf{x} = \mathbf{b}$ has a unique solution, namely $A^{-1}\mathbf{b}$. Therefore, the elementary row-operations reduce $A$ to a matrix with a pivot in every column, also in every row, since $A$ is square; hence $A$ reduces to the identity, $I$.
The result of applying an elementary row-operation to $I$ is an elementary matrix, say $E$. Then the result of applying the same operation to $A$ is the product $EA$. Thus, if $A$ is invertible, then there is a sequence of elementary matrices $E_i$ such that
$$E_r \cdots E_2 E_1 A = I.$$
Note that elementary matrices are invertible, so the equation gives
$$A = E_1^{-1} E_2^{-1} \cdots E_r^{-1};$$
from this equation follows
$$A^{-1} = E_r \cdots E_2 E_1.$$
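The role of elementary matrices can be followed on a 2×2 example. The sketch below builds three elementary matrices by applying row-operations to $I$, and then multiplies them together. The helper names are assumptions for illustration, rows are numbered from 0 in the code, and `Fraction` is used so that the arithmetic is exact.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Matrix product for lists of rows; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def scale_row(M, row, factor):
    """Row-operation: multiply one row by a non-zero factor (returns a copy)."""
    result = [r[:] for r in M]
    result[row] = [factor * v for v in M[row]]
    return result

def add_multiple(M, src, dst, factor):
    """Row-operation: add factor * (row src) to row dst (returns a copy)."""
    result = [r[:] for r in M]
    result[dst] = [d + factor * s for d, s in zip(M[dst], M[src])]
    return result

A = [[1, 2], [3, 4]]
I2 = identity(2)

E1 = add_multiple(I2, 0, 1, -3)           # second row minus 3 times the first
E2 = scale_row(I2, 1, Fraction(-1, 2))    # second row times -1/2
E3 = add_multiple(I2, 1, 0, -2)           # first row minus 2 times the second

# Applying an operation to A agrees with multiplying A by the elementary matrix.
same = matmul(E1, A) == add_multiple(A, 0, 1, -3)

# The product E3 E2 E1 reduces A to I, so it is the inverse of A.
A_inv = matmul(E3, matmul(E2, E1))
reduced = matmul(A_inv, A)
```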
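Now we have our technique: if we want to know whether a square matrix $A$ is invertible, we set $I$ next to $A$, forming an $n \times 2n$ matrix $(A \; I)$, and apply elementary row-operations. Then $A$ is invertible if and only if this matrix reduces to one of the form $(I \; B)$, in which case $B = A^{-1}$.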
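The technique can be sketched in Python as follows. This is one possible way to organize the row reduction, not necessarily the order of operations one would choose by hand; the function name `inverse` is an assumption, and exact rational arithmetic via `Fraction` is used to avoid rounding.

```python
from fractions import Fraction

def inverse(A):
    """Row-reduce (A I); return the inverse of A, or None if A is not invertible."""
    n = len(A)
    # Build the n-by-2n matrix (A I) with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a row, at or below the diagonal, with a non-zero entry here.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                        # no pivot: A does not reduce to I
        M[col], M[pivot] = M[pivot], M[col]    # interchange two rows
        p = M[col][col]
        M[col] = [v / p for v in M[col]]       # scale so that the pivot is 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # Subtract a multiple of the pivot row to clear this entry.
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    # The left half is now I; the right half is the inverse.
    return [row[n:] for row in M]

inv_a = inverse([[1, 2], [3, 4]])
inv_b = inverse([[2, 0, 0], [0, 0, 1], [0, 1, 0]])   # needs a row interchange
singular = inverse([[1, 2], [2, 4]])                 # second row is twice the first
```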
We have sketched out an argument that a square matrix $A$ is invertible if and only if every linear system $A\mathbf{x} = \mathbf{b}$ has a unique solution. A consequence is that $A$ is invertible provided $BA = I$ for some matrix $B$, which then is $A^{-1}$.
The transpose of an invertible matrix is invertible, and the inverse of the transpose is the transpose of the inverse:
$$(A^T)^{-1} = (A^{-1})^T.$$
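A short check of this identity, assuming $A$ is invertible and using the rule for the transpose of a product stated earlier:

```latex
A^T (A^{-1})^T = (A^{-1} A)^T = I^T = I,
\qquad
(A^{-1})^T A^T = (A A^{-1})^T = I^T = I.
```

So the transpose of the inverse is an inverse of the transpose, and since inverses are unique, it is the inverse.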
Special matrices
Let $A$ be a square matrix, say $A = (a_{ij})$. Then $A$ is:
- upper-triangular, if $a_{ij} = 0$ when $j < i$;
- lower-triangular, if $A^T$ is upper-triangular;
- diagonal, if it is both upper- and lower-triangular;
- symmetric, if $A = A^T$.
The diagonal entries of $A$ are the $a_{ij}$ such that $i = j$. A triangular matrix is invertible if and only if each of its diagonal entries is non-zero.
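These definitions translate almost word for word into code. The following Python sketch assumes square matrices stored as lists of rows; the predicate names are chosen here for illustration.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def is_upper_triangular(A):
    """Every entry below the diagonal (column index j < row index i) is 0."""
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))

def is_lower_triangular(A):
    return is_upper_triangular(transpose(A))

def is_diagonal(A):
    return is_upper_triangular(A) and is_lower_triangular(A)

def is_symmetric(A):
    return A == transpose(A)

def triangular_is_invertible(A):
    """For a triangular A: invertible exactly when no diagonal entry is 0."""
    return all(A[i][i] != 0 for i in range(len(A)))

U = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]   # upper-triangular, non-zero diagonal
D = [[2, 0], [0, 0]]                    # diagonal, with a zero on the diagonal
S = [[1, 7], [7, 3]]                    # symmetric, not triangular
```

Note that `triangular_is_invertible` is only meaningful when its argument is already known to be triangular.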
A linear combination of upper-triangular matrices is upper-triangular. So is a product of upper-triangular matrices.
A linear combination of symmetric matrices is symmetric. If A and B are two symmetric matrices, then their product AB is symmetric if and only if A and B commute (that is, AB = BA).
For any matrix $A$ (not just a square matrix), the product $A^T A$ is symmetric.
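The statements about symmetric matrices can be illustrated with small examples. This is a numerical spot-check only; the matrices and the helper names are chosen here for illustration.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product for lists of rows; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def is_symmetric(A):
    return A == transpose(A)

A = [[1, 2], [2, 3]]   # symmetric
B = [[0, 1], [1, 0]]   # symmetric, does not commute with A
C = [[2, 0], [0, 2]]   # symmetric, commutes with A

AB, BA = matmul(A, B), matmul(B, A)
AC, CA = matmul(A, C), matmul(C, A)

M = [[1, 2, 3], [4, 5, 6]]       # 2-by-3, not square
MtM = matmul(transpose(M), M)    # 3-by-3 product of the transpose with M
```

Here `AB` and `BA` differ and `AB` is not symmetric, while `AC` equals `CA` and is symmetric; `MtM` is symmetric even though `M` is not square.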
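Suppose $D$ is a diagonal matrix, with diagonal entries $d_1, d_2, \dots, d_n$. Suppose $A$ is an $m \times n$ matrix $(\mathbf{a}_1 \; \mathbf{a}_2 \; \dots \; \mathbf{a}_n)$. Then
$$AD = (d_1\mathbf{a}_1 \; d_2\mathbf{a}_2 \; \dots \; d_n\mathbf{a}_n).$$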
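In other words, multiplying on the right by a diagonal matrix scales the columns. A minimal Python sketch of this, with illustrative helper names, compares the full product with a direct scaling of the columns:

```python
def matmul(A, B):
    """Matrix product for lists of rows; assumes the sizes are compatible."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def diagonal(entries):
    """The diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def scale_columns(A, entries):
    """Multiply the j-th column of A by the j-th entry of the list."""
    return [[d * a for a, d in zip(row, entries)] for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]
d = [2, 0, -1]

AD = matmul(A, diagonal(d))
scaled = scale_columns(A, d)
```

Both `AD` and `scaled` are `[[2, 0, -3], [8, 0, -6]]`.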