Philip J. Erdelsky

March 23, 2009

Please e-mail comments, corrections and additions to the webmaster at pje@efgh.com.

1. Definition

Let A be an n ⨯ n matrix. Usually the entries in A are taken from a field, but many of the results also apply to matrices whose elements are taken from more general structures, such as commutative rings.

Let the element in the i-th row and the j-th column of A be represented in the usual way by ai,j. The determinant of A, which is usually written as det(A) or ∣A∣, is the sum of all n! products of the form

sp  ∏ ai,p(i) ,
where p is a permutation of {1,2,3,...,n} and sp is +1 if p is even and -1 if p is odd.

Each product contains exactly one element from each row and exactly one element from each column. The sum contains all such products.

For small matrices, we can compute the determinant directly:

n = 2:  a11 a22 - a21 a12
n = 3:  a11 a22 a33 - a11 a23 a32 + a12 a23 a31 - a12 a21 a33 + a13 a21 a32 - a13 a22 a31
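The definition can be turned directly into code. Here is a minimal Python sketch (the function names are our own) that sums all n! signed products:

```python
from itertools import permutations
from math import prod

def parity_sign(p):
    # +1 for an even permutation, -1 for an odd one,
    # determined by counting inversions.
    inversions = sum(1 for i in range(len(p))
                       for j in range(i + 1, len(p))
                       if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det_by_definition(a):
    # Sum over all permutations p of the signed products
    # s_p * a[0][p(0)] * a[1][p(1)] * ... * a[n-1][p(n-1)].
    n = len(a)
    return sum(parity_sign(p) * prod(a[i][p[i]] for i in range(n))
               for p in permutations(range(n)))
```

For example, det_by_definition([[1, 2], [3, 4]]) gives 1·4 - 2·3 = -2. Since the sum has n! terms, this is practical only for very small n.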

For every permutation p of {1,2,3,...,n},

1 + 2 + ... + n = p(1) + p(2) + ... + p(n)
because the right side contains the same addends, possibly in a different order. Hence if p(i) > i for some i, then p(j) < j for some j≠i. Hence if a product in the definition of det(A) contains an element above the main diagonal, it also contains an element below the main diagonal.

If a matrix is either upper triangular (ai,j=0 when i>j), lower triangular (ai,j=0 when i<j) or diagonal (ai,j=0 when i≠j), then the determinant reduces to a single term corresponding to the identity permutation, because every other term contains a zero factor:

det(A) = a1,1 a2,2 ... an,n when A is triangular or diagonal.

In particular, the determinant of the identity matrix is 1 and the determinant of the zero matrix is 0.

If a matrix contains a row of all zeros, or a column of all zeros, its determinant is zero, because every product in its definition must contain a zero factor.

2. Basic Properties

Theorem 2.1. The determinant of a matrix is equal to the determinant of its transpose.

Proof. The determinant of AT contains the following products:

sp  ∏ ap(i),i ,

Now let q be the inverse of the permutation p and use it to rearrange the factors:

sp  ∏ ai,q(i) ,

Since the parity of q is the same as the parity of p, sq = sp, so this becomes

sq  ∏ ai,q(i) ,

Hence the definition of det(AT) contains the same terms as the definition of det(A), possibly in a different order. █

Therefore, many results regarding the rows of a determinant also apply, mutatis mutandis, to its columns.

Theorem 2.2. The determinant of a matrix is a linear function of each of its rows (or columns).

Proof. Multiplication of a row by a constant factor introduces the same factor into every product in the definition of the determinant, so it also multiplies the determinant by the same factor.

Consider the first row. If A and B differ only in their first row, then the determinants of A and B can be written as sums of the following products, respectively:

sp  [a1,p(1) ∏ ai,p(i)]
sp  [b1,p(1) ∏ ai,p(i)]

Adding the two sums term by term, the general term becomes

sp  [(a1,p(1)+b1,p(1)) ∏ ai,p(i)]

This is the general term of the determinant of the matrix whose first row is the sum of the first rows of A and B, and whose remaining rows are the common remaining rows of A and B. Hence that determinant is det(A) + det(B).

The proof for other rows is similar. █

Theorem 2.3. If any two rows (or columns) of a matrix are interchanged, the sign of its determinant is changed.

Proof. This changes the parity of each permutation in the definition of the determinant, so it changes the sign of each term, and hence the sign of the determinant. █

Theorem 2.4. If any two rows (or columns) of a matrix are proportional, its determinant is zero.

Proof. This is obvious if the proportionality factor is zero. Otherwise, multiply one row by a constant factor so it becomes equal to the other; this multiplies the determinant by a nonzero factor. The resulting matrix has two equal rows, so its determinant is unchanged if the rows are interchanged, but by Theorem 2.3 interchanging them changes the sign of the determinant. Hence its determinant, and also the determinant of the original matrix, must be zero. █

3. Elementary Row and Column Operations

Recall that multiplying one row of a matrix by a constant and adding the result to another row is called an elementary row operation. The first row is left unchanged and the second one is replaced by the result. In the following example, the first row is multiplied by 2 and added to the second row:

        |  1  2  3  |     |  1  2  3  |
        |  4  1  0  |     |  6  5  6  |
        |  7  8  9  |     |  7  8  9  |

We notice that the two matrices have the same determinant:

1*1*9 - 1*0*8 + 2*0*7 - 2*4*9 + 3*4*8 - 3*1*7 = 9-0+0-72+96-21 = 12
1*5*9 - 1*6*8 + 2*6*7 - 2*6*9 + 3*6*8 - 3*5*7 = 45-48+84-108+144-105 = 12

An analogous operation on columns is called an elementary column operation.

Theorem 3.1. An elementary row (or column) operation on a matrix does not change its determinant.

Proof. By Theorem 2.2 the determinant is a linear function of the row changed by the operation. The changed row is the sum of the corresponding rows of two matrices: the original matrix, and a matrix obtained from it by replacing that row with a multiple of another row. The second matrix has two proportional rows, so by Theorem 2.4 its determinant is zero. Hence the determinant is unchanged. █

This argument may be easier to follow by examining the example given above:

           |  1  2  3  |         |  1  2  3  |        |  1  2  3  |
       det(|  6  5  6  |)  = det(|  4  1  0  |) + det(|  2  4  6  |),
           |  7  8  9  |         |  7  8  9  |        |  7  8  9  |

The last determinant is zero because the first two rows are proportional.

Theorem 3.2. The determinant of a singular matrix is zero.

Proof. If a matrix is singular, then one of its rows is a linear combination of the others. By applying repeated elementary row operations we can make this row zero without changing the determinant. The determinant of the result is zero, and so was the determinant of the original matrix. █

4. Computing Determinants by Elimination

Elementary row operations can be used to compute determinants. But first, we need a specific result.

Lemma 4.1. If every element of the last column of an n ⨯ n matrix A is zero except ann, then its determinant is the product of ann and the determinant of the (n-1) ⨯ (n-1) submatrix consisting of the first n-1 rows and columns of A.

Proof. If we delete all products in the definition of det(A) that contain any of the zero elements, the remaining products all contain ann, and the associated permutations all keep n fixed. Hence each associated permutation is also a permutation of {1,2,...,n-1}, and its parity as a permutation of {1,2,...,n-1} is the same as its parity as a permutation of {1,2,...,n}. We can factor out ann, and what remains is the definition of the determinant of the submatrix. █

The way to compute a determinant can be described recursively: use elementary row operations, and row interchanges where necessary (each of which reverses the sign, by Theorem 2.3), to make every element of the last column zero except the one in the lower right corner; then apply Lemma 4.1 and repeat the process on the (n-1) ⨯ (n-1) submatrix.
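The elimination procedure is easy to implement. A sketch in Python (the function name is our own), using exact rational arithmetic to avoid rounding; it reduces the matrix to triangular form and takes the product of the diagonal elements, reversing the sign for each row interchange:

```python
from fractions import Fraction

def det_by_elimination(a):
    # Reduce to upper triangular form using elementary row operations
    # (which leave the determinant unchanged, Theorem 3.1) and row
    # interchanges (each of which flips the sign, Theorem 2.3).
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    sign = 1
    for k in range(n):
        # Find a nonzero pivot in column k; if none exists, the
        # matrix is singular and the determinant is zero.
        pivot = next((r for r in range(k, n) if m[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            m[k], m[pivot] = m[pivot], m[k]
            sign = -sign
        # Clear column k below the pivot.
        for r in range(k + 1, n):
            factor = m[r][k] / m[k][k]
            for c in range(k, n):
                m[r][c] -= factor * m[k][c]
    # Determinant of a triangular matrix: product of the diagonal.
    result = Fraction(sign)
    for k in range(n):
        result *= m[k][k]
    return result
```

Unlike the permutation-sum definition, this takes on the order of n³ operations, so it is practical for large matrices.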

5. The Determinant of a Product

The following elegant result is rather hard to prove.

Theorem 5.1. If A and B are n ⨯ n matrices, then det(AB)=det(A)det(B).

Proof. For particular kinds of matrices, this result is easy to prove. We will show that det(AB)=det(A)det(B), first for special matrices, and then for general matrices.

If A is singular, then det(A)=0. Since AB is also singular, det(AB)=0, which establishes the result if A is singular.

Now let A be a diagonal matrix. Then AB is simply the matrix B with each row multiplied by the diagonal element of A in the same row. By Theorem 2.2, det(AB) is the product of det(B) and all the diagonal elements of A. Since det(A) is the product of the diagonal elements of A, this establishes the result when A is diagonal.

Now consider the elementary row operation which multiplies the r-th row of B by c and adds it to the s-th row. It is easy to show, directly from the definition of matrix multiplication, that the result is AB, where A is the matrix with 1's along its principal diagonal, asr = c, and zeros elsewhere. Moreover, det(A) = 1 because it is a triangular matrix with all diagonal elements equal to 1. By Theorem 3.1, det(AB) = det(B). Hence det(AB) = det(A)det(B), which establishes the result when A is a matrix of this kind, which we shall call an elementary row operation matrix.
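Such an elementary row operation matrix is easy to construct and check directly. A small Python sketch (the helper names are ours), using the example from Section 3:

```python
def elementary_matrix(n, s, r, c):
    # n x n identity matrix with the extra entry c in row s, column r
    # (s != r): multiplying B on the left by this matrix multiplies
    # row r of B by c and adds the result to row s.
    m = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    m[s][r] = c
    return m

def matmul(a, b):
    # Ordinary matrix multiplication for square matrices.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# The example from Section 3: multiply the first row by 2
# and add it to the second row.
b = [[1, 2, 3], [4, 1, 0], [7, 8, 9]]
e = elementary_matrix(3, 1, 0, 2)
```

Here matmul(e, b) reproduces the second matrix of the Section 3 example, and multiplying by elementary_matrix(3, 1, 0, -2) undoes the operation, illustrating the remark below about inverses.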

The inverse of an elementary row operation is an elementary row operation with the scalar -c.

Now suppose the matrix A is obtained by interchanging two rows of the identity matrix. Then det(A) = -1. Also, AB is the matrix B with the same two rows interchanged, so det(AB) = -det(B). This establishes the result when A is a matrix of this kind, which we shall call an interchange matrix.

An interchange matrix is its own inverse.

Now any nonsingular square matrix can be converted to a diagonal matrix by some sequence of elementary row operations and row interchanges as follows. This process is vacuous for 1 ⨯ 1 matrices; for larger matrices we use induction on the matrix size.

Since the matrix is nonsingular, the first column contains at least one nonzero element. Exchange two rows, if necessary, to bring a nonzero element to the upper left corner.

Subtract appropriate multiples of the first row from each other row to make all elements of the first column zero, except the element in the upper left corner.

The submatrix obtained by deleting the first row and the first column must be nonsingular: if its rows were linearly dependent, then the last n-1 rows of the whole matrix, whose first elements are now all zero, would be dependent in the same way.

Apply the same procedure to the submatrix to diagonalize it. This involves elementary row operations and row exchanges which do not affect the first row.

Finally, subtract appropriate multiples of every other row from the first row to make all elements of the first row zero, except the element in the upper left corner.

The resulting matrix is diagonal.

We are now ready to prove that det(AB) = det(A)det(B) for a general nonsingular matrix A.

There is some sequence of elementary row operations and row interchanges which makes A into a diagonal matrix. We can perform these operations in reverse to convert the diagonal matrix back to A. Let R1, R2, R3,..., Rm be the matrices of these reverse operations. Then

R1 R2 R3 ... Rm D = A,
where D is a diagonal matrix. Repeated application of the special-case results shows that
det(R1) det(R2) det(R3) ... det(Rm) det(D) = det(A).


Multiplying on the right by B and applying the same special-case results again,

R1 R2 R3 ... Rm D B = AB,
det(R1) det(R2) det(R3) ... det(Rm) det(D) det(B) = det(AB),
Hence det(A)det(B) = det(AB), which establishes the result when A is any nonsingular matrix. █

The result can be extended to any finite number of matrices; e.g., det(ABC) = det(A)det(B)det(C).
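For 2 ⨯ 2 matrices the product rule can be checked by direct computation. A small Python sketch (the names are ours):

```python
def det2(m):
    # Determinant of a 2 x 2 matrix: a11 a22 - a21 a12.
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    # Ordinary 2 x 2 matrix multiplication.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
# det2(a) = -2, det2(b) = -2, and det2(matmul2(a, b)) = 4,
# which equals det2(a) * det2(b).
```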

The result is also valid for matrices over any commutative ring, although the proof above depends on an elimination process, which is not always possible in a commutative ring. In this case, consider det(AB) - det(A)det(B) as a function of the 2n² independent variables that are the entries of A and B. It is a polynomial whose value is zero for all values of the independent variables in any field. For infinite fields, such as the field of real numbers, this is possible only if all coefficients are zero when like terms are combined. Hence the polynomial is also identically zero over any commutative ring.

Theorem 5.2. The determinant of a nonsingular matrix is nonzero, and det(A-1) = 1/det(A).

Proof. If A is nonsingular, then det(A-1)det(A) = det(A-1A) = det(I) = 1. █

Similar matrices have the same determinant, because det(Q-1AQ) = det(Q-1) det(A) det(Q) = (1/det(Q)) det(A) det(Q) = det(A). Since two distinct matrices are similar if, and only if, they are matrices of the same linear transformation with respect to different bases, it makes sense to talk of the determinant of a linear transformation, without regard to any particular basis or matrix.

Theorem 5.3. The determinant of the direct sum of two or more square matrices is the product of their determinants:

det(A1 ⊕ A2 ⊕ ... ⊕ Am) = det(A1) det(A2) ... det(Am).
(The matrices need not be the same size.)

Proof. It is fairly easy to show, from the definition, that the result holds for two matrices when one of them is the identity:

det(A ⊕ I) = det(I ⊕ A) = det(A).

The partition theorem can be used to multiply two partitioned matrices:

A ⊕ B = (A ⊕ IB) (IA ⊕ B),
where IA and IB are identity matrices of the same sizes as A and B, respectively.


Hence, by Theorem 5.1,

det(A ⊕ B) = det(A ⊕ IB) det(IA ⊕ B) = det(A) det(B).

The extension to three or more matrices is straightforward. █

6. Cofactors and Minors

In the definition of det(A), where A is an n ⨯ n matrix, let us factor out the elements of the i-th row:

det(A) = Ai1 ai1 + Ai2 ai2 + ... + Ain ain   (6.1.1)

The coefficient Aij is called the cofactor of aij.

Since each product in the definition of det(A) contains exactly one element from each row and column, the cofactor Aij contains no elements from the i-th row or the j-th column. In fact, it is the determinant of a matrix obtained from A by replacing aij by 1 and all other elements of the i-th row and j-th column by zeros.

Theorem 6.1. The cofactor Aij is the product of (-1)^(i+j) and the determinant of the matrix Mij obtained from A by deleting the i-th row and j-th column and pushing the remaining elements together to form an (n-1) ⨯ (n-1) matrix.

Proof. This is easiest to see when the element is a11. In this case, M11 consists of the last n-1 rows and the last n-1 columns, and it is not disturbed when a11 is replaced by 1 and other elements in the first row and column are replaced by zeros.

The determinant of the whole matrix is clearly the same as the determinant of M11, which proves the result for this case.

If the element aij lies elsewhere in the matrix, then we can bring it to the first row without changing the relative positions of the other rows by i-1 adjacent row exchanges. First we exchange the i-th and the (i-1)-th rows, then the (i-1)-th and (i-2)-th rows, and so on, until we exchange the second and first rows.

Similarly, we can perform j-1 adjacent column exchanges to bring the element to the upper left corner.

All these exchanges have multiplied the determinant by (-1)^(i+j-2) = (-1)^(i+j). By the case already proved, the resulting cofactor is the determinant of Mij, whose rows and columns are now contiguous. This proves the result in the general case. █

A matrix formed by deleting some rows and columns of A, such as Mi,j, is called a minor of A.

This theorem can be used recursively with (6.1.1) to compute determinants, but it is impractical for all but the smallest matrices because the number of operations grows factorially with n.
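The recursive computation looks like this in Python (a sketch; the function name is ours), expanding along the first row, i.e. (6.1.1) with i = 1:

```python
def det_by_cofactors(a):
    # Base case: the determinant of a 1 x 1 matrix is its single entry.
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for j in range(n):
        # M_1j: delete the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        # Cofactor A_1j = (-1)^(1+j) det(M_1j); with 0-based j
        # the sign is (-1)**j.
        total += (-1) ** j * a[0][j] * det_by_cofactors(minor)
    return total
```

As noted above, the running time grows factorially with n, so this is only suitable for small matrices.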

As with most results for determinants, there is a similar result for the j-th column:

det(A) = A1j a1j + A2j a2j + ... + Anj anj   (6.1.2)

Now consider an expression similar to (6.1.1) with elements from the i-th row and cofactors from the k-th row. If i≠k, this is equivalent to (6.1.1) for a matrix with two equal rows. By Theorem 2.4, its determinant is zero. Hence

Ak1 ai1 + Ak2 ai2 + ... + Akn ain = det(A) if i=k, and 0 otherwise.

Let C be the matrix of cofactors of A; i.e., cij = Aij. Then the relations above state that A CT = det(A) I. Hence, if A is nonsingular, (1/det(A)) CT is the inverse of A.
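The inverse-by-cofactors formula can be sketched in Python with exact rational arithmetic (the function names are ours):

```python
from fractions import Fraction

def det(a):
    # Determinant by cofactor expansion along the first row.
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def inverse_by_cofactors(a):
    # A^-1 = (1/det(A)) C^T, where C is the matrix of cofactors.
    n = len(a)
    d = Fraction(det(a))
    if d == 0:
        raise ValueError("matrix is singular")
    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]
        return (-1) ** (i + j) * (det(minor) if minor else 1)
    # Entry (i, j) of the inverse is the cofactor A_ji divided by det(A).
    return [[cofactor(j, i) / d for j in range(n)] for i in range(n)]
```

For example, the inverse of [[1, 2], [3, 4]] comes out as [[-2, 1], [3/2, -1/2]]. Like the recursive determinant itself, this is only practical for small matrices.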

If the elements of A are taken from a commutative ring, then A has an inverse if, and only if, det(A) has an inverse.

7. Cramer's Rule

Originally, determinants were applied, not to matrices as such, but to systems of n linear equations in n unknowns:

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
* * *
an1 x1 + an2 x2 + ... + ann xn = bn

This can be written in matrix form as Ax = b, where x and b are column vectors and A is the matrix of coefficients.

Clearly, if A is nonsingular, the system has a unique solution x = A-1b = (1/det(A))CTb, where C is the matrix of cofactors of A. Hence

xi = (b1 A1i + b2 A2i + ...+ bn Ani) / det(A).

The sum is the cofactor expansion of the determinant of a matrix obtained from A by replacing the i-th column with b. The result can be expressed rather elegantly by Cramer's Rule:

Theorem 7.1. If the determinant D of the matrix of coefficients of a system of n linear equations in n unknowns in a field is nonzero, then the system has a unique solution, and the value of the i-th unknown is the product of (1/D) and the determinant of the matrix obtained by replacing the coefficients of the i-th unknown by the constants.

Of course, the same result applies to a commutative ring, except that in this case the determinant of the coefficients must have an inverse.

Because of the large number of operations required, Cramer's rule is a practical way to solve linear equations only when the number of equations is small.
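For such small systems, Cramer's rule is only a few lines of code. A Python sketch (the names are ours), using exact rational arithmetic:

```python
from fractions import Fraction

def det(a):
    # Determinant by cofactor expansion along the first row.
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer_solve(a, b):
    # x_i = det(A with its i-th column replaced by b) / det(A).
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("determinant of coefficients is zero")
    solution = []
    for i in range(n):
        replaced = [row[:i] + [b[r]] + row[i + 1:]
                    for r, row in enumerate(a)]
        solution.append(Fraction(det(replaced), d))
    return solution
```

For the system 2x1 + x2 = 3, x1 + 3x2 = 5, this gives x1 = 4/5 and x2 = 7/5.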