Question_P1

I’ll go into more detail than is probably expected here. Will be a long post with a lot of lemmas.

(a)

Let A be a diagonal matrix, ie. A_{ij} = 0 where i \ne j. We want to compute \det A.

We have:

\displaystyle \det A = \sum_{\sigma \in S_n} \left(\operatorname {sgn} \sigma \prod_{i = 1}^n A_{i \sigma(i)}\right)

where the sum runs over the symmetric group S_n (ie. the group of permutations on n elements) and \operatorname {sgn} \sigma is the sign of \sigma. (https://en.wikipedia.org/wiki/Parity_of_a_permutation)

Note that if any permutation \sigma has \sigma(i) \ne i for some i, then the corresponding term in the above sum will zero. (since we’d then have A_{i \sigma(i)} = 0) So the only non-zero term is that corresponding to the permutation \sigma that has \sigma(i) = i for all i. That is, the identity permutation. This has sign 1 (since it can be written trivially as the product of zero transpositions), so:

\displaystyle \det A = \prod_{i = 1}^n A_{ii}

this is the product of the diagonal of A.

Since A is invertible iff \det A \ne 0, A is invertible iff none of the elements of its diagonal are zero. A trivial example is the zero matrix.

We’ll now compute the product of arbitrary diagonal matrices. Let A, B be n \times n diagonal matrices. Then:

\displaystyle (AB)_{ij} = \sum_{k = 1}^n A_{ik} B_{kj}

Note that A_{ik} is non-zero only if i = k, so:

\displaystyle (AB)_{ij} = A_{ii} B_{ij}

since all other terms in the sum are zero. But note that if i \ne j, B_{ij} = 0 so (AB)_{ij} = 0. So AB too is diagonal and:

\displaystyle (AB)_{ij} = \begin{cases}A_{ii} B_{ii} & i = j \\ 0 & i \ne j\end{cases}

(this also means BA = AB for any two diagonal matrices)

With this we can compute the inverse of an invertible diagonal matrix. Let A be an invertible diagonal matrix. That is, A_{ij} \ne 0 iff i = j. Then the matrix A^{-1} with:

\displaystyle (A^{-1})_{ij} = \begin{cases}\frac 1 {A_{ii}} & i = j \\ 0 & i \ne j\end{cases}

has AA^{-1} = A^{-1}A = I by the product formula we just derived. So, it turns out the inverse of an invertible diagonal matrix is diagonal.

(alternatively, we could have used elementary row operations - which we will for upper triangular matrices)

(b)

Let A be an n \times n upper triangular matrix. Then:

\displaystyle \det A = \sum_{\sigma \in S_n} \left(\operatorname {sgn} \sigma \prod_{i = 1}^n A_{i \sigma(i)}\right)

Note that if i > \sigma(i) for any i, the term corresponding to \sigma will be zero, so we must have i \le \sigma(i) for any i. Note that this is only possible if \sigma(i) = i for all i, since we must have \sigma(n) = n because n \le \sigma(n) \le n, and similarly n - 1 \le \sigma(n - 1) so \sigma(n - 1) = n - 1 (since \sigma must be bijective). Working back like this gives \sigma(i) = i for all i.

So again:

\displaystyle \det A = \prod_{i = 1}^n A_{ii}

So again A is invertible iff none of the elements in its diagonal are zero.

To find the inverse of an upper triangular matrix, we will build the matrix up from the identity using elementary row operations. Each of these can be expressed with elementary matrices, which all happen to be upper triangular with upper triangular inverses, meaning that the inverse of an upper triangular matrix is upper triangular. (since the product of upper triangular matrices is upper triangular, as seen in P2) It’ll get a bit messy giving exact calculations, but I’ll make sure the gist is given.

We will introduce the elementary matrices:

  • The matrix E_1 (i, j), the identity matrix with the i th and j th row swapped represents the elementary row operation of interchanging the i th and j th rows. We will not be using this.
  • The matrix E_2 (i, \lambda), the identity matrix with the (i, i) th element changed to be \lambda represents the elementary row operation of multiplying the i th row by the scalar \lambda. Given \lambda \ne 0, this is an invertible matrix with (E_2 (i, \lambda))^{-1} = E_2(i, 1/\lambda).
  • The matrix E_3 (i, j, \lambda), the identity matrix with the (j, i) th element changed to be \lambda represents the elementary row operation of adding \lambda times the row i to the row j. This has inverse E_3 (i, j, -\lambda).

We will use E_2 and E_3 to “build up” A from I. The n th row of A has only one non-zero element. We “correct” this from I by multiplying the n th row by A_{nn} using E_2 (n, A_{nn}). We then correct the (n - 1) th row by multiplying the n - 1 th row by a scalar and adding a multiple of the n th row. Similarly, we correct the (n - 2) th row by multiplying it by a scalar, then adding suitable multiples of the (n - 1) th and n th rows. Repeat this until the matrix matches A.

Note that each of the elementary matrices here we’re using are upper triangular. E_2 (i, \lambda) is a diagonal matrix so is clearly upper triangular, and has a diagonal (hence upper triangular) inverse. When we use E_3 (i, j, \lambda) we’re only ever adding on multiples of lower rows in the matrix, so it’s always the case that i > j. Since only the (j, i) th element of the matrix differs from the identity, this matrix is therefore upper triangular. And since (E_3 (i, j, \lambda))^{-1} = E_3 (i, j, -\lambda) its inverse is upper triangular.

With this all, we have established that we can write:

\displaystyle A = \prod_{i = 1}^k X_i

for some number k \left(=\frac {n(n + 1)} 2\right) of upper triangular matrices X_i and:

\displaystyle A^{-1} = \prod_{i = 1}^k X^{-1}_{k + 1 - i}

with each of the X^{-1}_{k + 1 - i} s being upper triangular. Since the product of upper triangular matrices is upper triangular, A^{-1} too is upper triangular.

[please let me know if this explanation isn’t clear enough, might be a bit ropey]

1 Like

Hahaa this is amazing!!! We’re going to build such an incredible resource!!!