3 Matrices

Handling linear equations and keeping track of the unknowns can be a pain. At a certain point one needs to simplify the notation. This is done by introducing matrices.

For example, the system of equations

$\begin{matrix} && &2y &+ &4z &= &-2\\ &3x &+ &2y &+ &7z &= &4 \end{matrix} \tag{3.1}$ can be represented by the rectangular array (matrix)

$\begin{pmatrix} 0 & 2 & 4 & -2\\ 3 & 2 & 7 & 4 \end{pmatrix} \tag{3.2}$ of numbers. Many of the operations we do to solve linear equations might as well be done on this array forgetting about the unknowns.

3.1 Matrices

3.1.1 Definitions

A rectangular array of numbers is called a matrix. A matrix with $m$ rows and $n$ columns is called an $m\times n$ ( $m$ by $n$ ) matrix. The notation for an $m\times n$ matrix $A$ is

$A = \begin{pmatrix} a_{11} & \cdots &a_{1j}& \cdots& a_{1 n} \\ \vdots & \ddots &\vdots & \ddots & \vdots\\ a_{i1} & \cdots &a_{ij}& \cdots& a_{i n} \\ \vdots & \ddots &\vdots & \ddots & \vdots\\ a_{m1} & \cdots &a_{mj}& \cdots& a_{m n} \end{pmatrix}, \tag{3.3}$ where $A_{ij} = a_{i j}$ denotes the number or entry in the $i$ -th row and $j$ -th column. If the matrix in (3.2) is denoted $A$ , then it has $2$ rows and $4$ columns with $A_{14} = -2$ .

Two matrices are equal if they have the same number of rows and columns and their entries are identical.

A very useful (and famous) open-source library in Python (with 1000+ contributors) for handling matrices is NumPy. Here is how the matrix in (3.2) is entered in NumPy.
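
A minimal sketch, assuming NumPy is installed and imported under its conventional alias `np`. The variable name `A` is chosen to match the text. Note that NumPy numbers rows and columns from $0$, so the entry $A_{14}$ is `A[0, 3]`.

```python
import numpy as np

# The 2 x 4 matrix from (3.2), entered row by row.
A = np.array([[0, 2, 4, -2],
              [3, 2, 7, 4]])

print(A.shape)   # (number of rows, number of columns)
print(A[0, 3])   # the entry A_14 (row 1, column 4 in the notation of the text)
```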

A matrix whose entries are all $0$ is called a zero matrix. It is denoted simply by $0$ , when it is clear from the context what its numbers of rows and columns are.
A matrix is called square (or quadratic) if it has an equal number of rows and columns. The first two matrices below are square, whereas the third is not.
$\begin{pmatrix} 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9\end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 1\end{pmatrix}.$
The diagonal of a matrix consists of the entries whose row and column indices are equal. Below is a $3\times 4$ matrix with the diagonal entries marked in red:
$\begin{pmatrix} \color{red}{1} & 3 & 0 & 1\\ 3 & \color{red}{2} & 1 & 5\\ 1 & 0 & \color{red}{3} & 6 \end{pmatrix}.$ A matrix is called a diagonal matrix if all its entries outside the diagonal are $0$ . Below is an example of a square diagonal matrix:
$\begin{pmatrix} 1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 3 \end{pmatrix}.$
A matrix is called a row vector if it has only one row. For example,
$\begin{pmatrix} 1 & 2 & 3 \end{pmatrix}$ is a row vector with three columns.
A matrix is called a column vector if it has only one column. For example,
$\begin{pmatrix} 1\\ 2 \\ 3 \end{pmatrix}$ is a column vector with three rows.
The rows in a matrix are called the row vectors of the matrix. The $i$ -th row in a matrix $A$ is denoted $A_i$ . The matrix $A$ in (3.2) contains the row vectors
$A_1 = \begin{pmatrix} 0 & 2 & 4 & -2 \end{pmatrix} \qquad\text{and}\qquad A_2 = \begin{pmatrix} 3 & 2 & 7 & 4 \end{pmatrix}.$
The columns in a matrix are called the column vectors of the matrix. The $j$ -th column in a matrix $A$ is denoted $A^j$ . The matrix $A$ in (3.2) contains the column vectors
$A^1 = \begin{pmatrix} 0 \\ 3 \end{pmatrix},\quad A^2 =\begin{pmatrix} 2 \\ 2 \end{pmatrix},\quad A^3 =\begin{pmatrix} 4 \\ 7 \end{pmatrix}\quad\text{and}\quad A^4 = \begin{pmatrix} -2 \\ 4 \end{pmatrix}.$
A row or column vector is referred to as a vector.
Even though we have used the notation $\mathbb{R}^n$ for the $n$ -fold Cartesian product of $\mathbb{R}$ , we will henceforth use $\mathbb{R}^n$ to denote the set of column vectors with $n$ rows (entries). This definition is almost identical to the previous one, except that the tuple is formatted as a column vector.
Illustrated by an example,
$\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \in \mathbb{R}^3\qquad\text{instead of}\qquad (1, 2, 3)\in \mathbb{R}^3.$
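
The row and column vectors of a matrix can be picked out in NumPy by slicing. The sketch below assumes NumPy is imported as `np`; the names `row_2`, `col_3` and `v` are illustrative only. Slices such as `1:2` are used so that the results stay two-dimensional, i.e., remain genuine $1\times 4$ and $2\times 1$ matrices.

```python
import numpy as np

A = np.array([[0, 2, 4, -2],
              [3, 2, 7, 4]])

row_2 = A[1:2, :]    # the row vector A_2 as a 1 x 4 matrix
col_3 = A[:, 2:3]    # the column vector A^3 as a 2 x 1 matrix

# A vector in R^3, stored as a column vector (a 3 x 1 matrix).
v = np.array([[1], [2], [3]])

print(row_2.shape, col_3.shape, v.shape)
```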

3.2 Linear maps

In the first chapter we encountered a miniature version of a neural network. Neural networks are generally incredibly complicated functions from $\mathbb{R}^n$ to $\mathbb{R}^m$ . The function $f:\mathbb{R}^2\rightarrow \mathbb{R}^2$ given by

$f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x^7 y + \cos(x y) e^{x^2 + y^2 -1}\\ 2 x y^2 - \sin(x + y) (x^3 + y^3) \end{pmatrix},$ even though it looks complicated, is simple in comparison.

You probably agree that the function $g:\mathbb{R}^2\rightarrow \mathbb{R}^2$ given by

$g\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 x + 3 y\\ 3 x - 2 y \end{pmatrix}$ is even simpler. This function (or map) is an example of a linear map. In general, a linear map $f: \mathbb{R}^n\rightarrow \mathbb{R}^m$ has the form

$f\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + \cdots + a_{1 n} x_n\\ \vdots \\ a_{m1} x_1 + \cdots + a_{m n} x_n \end{pmatrix},$ where $a_{11}, \dots, a_{mn}$ are $m n$ real numbers.

Using matrices we will use the notation

$\begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n\end{pmatrix} = \begin{pmatrix} a_{11} x_1 + \cdots + a_{1 n} x_n\\ \vdots \\ a_{m1} x_1 + \cdots + a_{m n} x_n \end{pmatrix}.$

In this way, we can write the map $f$ as

$f(v) = A v,$ where $A$ is the $m\times n$ matrix

$\begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$ and $v$ is the vector

$\begin{pmatrix} x_1 \\ \vdots \\ x_n\end{pmatrix}$ in $\mathbb{R}^n$ .

Basically a linear map is a system of linear equations without the right hand side (including $=$ ). In fact, we may write the system of linear equations in (3.1) as

$\begin{pmatrix} 0 & 2 & 4\\ 3 & 2 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 4 \end{pmatrix}.$
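
The notation $f(v) = A v$ carries over directly to NumPy, where the `@` operator multiplies a matrix by a vector. The sketch below uses the map $g$ from the beginning of this section; the names `G`, `g` and `v` are illustrative, and NumPy is assumed to be imported as `np`.

```python
import numpy as np

# Matrix of the linear map g(x, y) = (2x + 3y, 3x - 2y).
G = np.array([[2, 3],
              [3, -2]])

def g(v):
    """Apply the linear map g to a vector v in R^2."""
    return G @ v

v = np.array([1, 2])
w = g(v)      # (2*1 + 3*2, 3*1 - 2*2) = (8, -1)
print(w)
```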

Let $f: \mathbb{R}^2\rightarrow \mathbb{R}^2$ be the linear map given by the $2\times 2$ matrix

$\begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}.$ Does there exist $u\in \mathbb{R}^2$ , such that

$f(u) = \begin{pmatrix} 3 \\ 7 \end{pmatrix}?$ Quite generally, can we find $u\in \mathbb{R}^2$ , such that

$f(u) = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$ for arbitrary $b_1, b_2\in \mathbb{R}$ ?

Suppose you know that $f: \mathbb{R}^n\rightarrow \mathbb{R}^m$ is a linear map and that you have a black box giving you output $f(v)\in \mathbb{R}^m$ if you supply the input $v\in \mathbb{R}^n$ . How would you find the matrix defining $f$ ?

3.3 Matrix multiplication

Suppose we are given two linear maps $f: \mathbb{R}^2\rightarrow \mathbb{R}^2$ and $g:\mathbb{R}^2\rightarrow \mathbb{R}^2$ . Then it turns out that the composition $f\circ g: \mathbb{R}^2\rightarrow \mathbb{R}^2$ is also a linear map. A word of advice: the computations below look large and intimidating. They are not. It is important that you carry them out on your own. Do not look and copy or tell yourself that it looks okay. Do the computations yourself and ask me or fellow students if you get stuck.

Let us look at an example. Suppose that

$g\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}\qquad \text{and} \qquad f\begin{pmatrix} u\\ v \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 1 & -2 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}.$ Then

$\begin{aligned} (f\circ g)\begin{pmatrix} x\\ y \end{pmatrix} &= f\left(g\begin{pmatrix} x\\ y \end{pmatrix}\right) = f\left(\begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}\right) = \begin{pmatrix} 1 & 2 \\ 1 & -2 \end{pmatrix} \left( \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \right)\\ \\ &= \begin{pmatrix} 1 & 2 \\ 1 & -2 \end{pmatrix} \begin{pmatrix} 2 x + 3 y\\ -x - 2y \end{pmatrix} = \begin{pmatrix} - y \\ 4 x + 7 y \end{pmatrix} = \begin{pmatrix} 0 & -1\\ 4 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \end{aligned}$

In terms of the matrices of the linear maps, we write this as

$\begin{pmatrix} 1 & 2\\ 1 & -2 \end{pmatrix} \begin{pmatrix} 2 & 3\\ -1 & -2 \end{pmatrix} = \begin{pmatrix} 0 & -1\\ 4 & 7 \end{pmatrix} \tag{3.4}$
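
The identity (3.4) can be checked numerically. The sketch below, assuming NumPy is imported as `np`, compares applying $g$ and then $f$ with applying the product matrix once; the test vector `v` is an arbitrary choice.

```python
import numpy as np

F = np.array([[1, 2],
              [1, -2]])    # matrix of f
G = np.array([[2, 3],
              [-1, -2]])   # matrix of g

FG = F @ G                 # matrix of the composition, as in (3.4)

v = np.array([5, -3])      # an arbitrary test vector (x, y)
composed = F @ (G @ v)     # apply g first, then f
direct = FG @ v            # apply the product matrix once

print(FG)
print(composed, direct)
```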

There is nothing special about the numbers in this example. We might as well do the computation in general: suppose that

$g\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}\qquad \text{and} \qquad f\begin{pmatrix} u\\ v \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}.$ Then

$\begin{aligned} (f\circ g)\begin{pmatrix} x\\ y \end{pmatrix} &= f\left(g\begin{pmatrix} x\\ y \end{pmatrix}\right) = f\left(\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}\right) = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \left( \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \right)\\ \\ &= \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} x + b_{12} y\\ b_{21} x + b_{22} y \end{pmatrix} = \begin{pmatrix} a_{11} (b_{11} x + b_{12} y) + a_{12} (b_{21} x + b_{22} y)\\ a_{21} (b_{11} x + b_{12} y) + a_{22} (b_{21} x + b_{22} y) \end{pmatrix}\\ \\ &= \begin{pmatrix} (a_{11} b_{11} + a_{12} b_{21}) x + (a_{11}b_{12} + a_{12} b_{22}) y \\ (a_{21} b_{11} + a_{22} b_{21}) x + (a_{21} b_{12} + a_{22} b_{22}) y \end{pmatrix}\\ \\ &= \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11}b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \end{aligned}$

Again, in terms of the matrices of the linear maps, we write this as

$\begin{pmatrix} a_{11} & a_{12}\\ \color{blue}{a_{21}} & \color{red}{a_{22}} \end{pmatrix} \begin{pmatrix} \color{blue}{b_{11}} & b_{12}\\ \color{red}{b_{21}} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22}\\ \color{blue}{a_{21} b_{11}} + \color{red}{a_{22} b_{21}} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix} \tag{3.5}$

The equation above is the formula for matrix multiplication for two $2\times 2$ matrices, precisely as it was introduced by Cayley around $1857$ .

Upon closer inspection (and colored in (3.5) for $i=2$ and $j= 1$ ), you will see that the number in the $i$ -th row and $j$ -th column in the product matrix is the row-column multiplication between the $i$ -th row and the $j$ -th column in the two matrices:

The row-column multiplication between a row vector

$x = (x_1 x_2 \dots x_n)\qquad\text{and a column vector}\qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$ with the same number of entries is defined as

$x y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$

Let $A$ be an $\color{blue}{m}\times \color{brown}{n}$ matrix and $B$ an $\color{brown}{n}\times \color{red}{r}$ matrix. Then the matrix product $A B$ is defined as the $\color{blue}{m}\times\color{red}{r}$ matrix $C$ given by the row-column multiplication

$C_{ij} = A_i B^j = A_{i1} B_{1j} + A_{i2} B_{2j} + \cdots + A_{in} B_{nj}$ for $1\leq i \leq m$ and $1\leq j \leq r$ .

If $A$ is an $m\times n$ matrix and $B$ is an $r\times s$ matrix, then the matrix product $A B$ only makes sense if $n = r$ : the number of columns in $A$ must equal the number of rows in $B$ .
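
To make the definition concrete, here is a sketch of the row-column multiplication written out with explicit loops. The function name `matmul` and the example matrices are illustrative assumptions; in practice one would use NumPy's built-in product, which the last line uses for comparison.

```python
import numpy as np

def matmul(A, B):
    """Matrix product via row-column multiplication, C_ij = A_i B^j."""
    m, n = A.shape
    n_b, r = B.shape
    if n != n_b:
        raise ValueError("columns of A must equal rows of B")
    C = np.zeros((m, r))
    for i in range(m):
        for j in range(r):
            # row i of A times column j of B
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C

A = np.array([[1, 2, 3],
              [4, 5, 6]])       # a 2 x 3 matrix
B = np.array([[1, 0],
              [0, 1],
              [2, 2]])          # a 3 x 2 matrix
C = matmul(A, B)                # the 2 x 2 product

print(np.array_equal(C, A @ B))
```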

Suppose that

$A = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix}, \quad C = \begin{pmatrix} 1 & 1 & 1\end{pmatrix}, \quad\text{and}\quad D = \begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix}$ Which of the matrix products below make sense?

$B A$

$A B$

$C D$

$D C$

$C A$

$A D$

I have been told that my pronunciation of column in the video below is wrong. In the area of the US, where I got my PhD, people for some reason had this (Irish?) rare pronunciation.

Using matrix product notation, the system of linear equations in (3.1) can now be written as

$\begin{pmatrix} 0 & 2 & 4\\ 3 & 2 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 4 \end{pmatrix}$ Here we multiply a $2\times 3$ with a $3\times 1$ matrix. The row-column multiplication gives the $2\times 1$ matrix

$\begin{pmatrix} 2 y + 4 z\\ 3 x + 2 y + 7 z \end{pmatrix}.$ This matrix must equal the $2\times 1$ matrix on the right hand side for (3.1) to be true. This is in agreement with our convention for writing linear maps in section 3.2.

Suppose that

$A = \begin{pmatrix} 1 & 2 & 3\\ 0 & 1 & 2\\ 3 & x & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 1& 1 & 1\\ 2 & 2 & 2\\ 0 & 1 & 1 \end{pmatrix}\quad\text{and}\quad C = A B$ Which ones of the statements below are true?

$C_{12} = 9$

$C_{23} = 4$

If $C_{32} = 4$ , then $x = 0$ .

If $C_{31} = -1$ , then $x=-1$ .

3.3.1 Matrix multiplication in `numpy`

Matrix multiplication in `numpy` is performed by the function `dot`:
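
A minimal sketch, assuming NumPy is imported as `np`, using the two matrices from (3.4). For two-dimensional arrays the `@` operator gives the same product as `np.dot`.

```python
import numpy as np

A = np.array([[1, 2],
              [1, -2]])
B = np.array([[2, 3],
              [-1, -2]])

C = np.dot(A, B)   # the product from (3.4)
D = A @ B          # same product, written with the @ operator

print(C)
```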

3.3.2 The identity matrix

The identity matrix $I_n$ of order $n$ is the $n\times n$ diagonal matrix with $1$ in the diagonal. Below is the identity matrix of order $5$ .

$\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$

The identity matrix $I_n$ has the crucial property that

$I_n A = A I_n = A \tag{3.6}$ for all $n\times n$ matrices $A$ .
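
In NumPy the identity matrix is produced by `np.eye`. The sketch below checks (3.6) for one $3\times 3$ example (an arbitrary choice); it illustrates the property but is of course not a proof.

```python
import numpy as np

I3 = np.eye(3, dtype=int)          # the identity matrix of order 3
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

left = I3 @ A
right = A @ I3

print(np.array_equal(left, A), np.array_equal(right, A))
```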

Prove that the two identities in (3.6) are true for $n\times n$ matrices.

3.3.3 Examples of matrix multiplication

Matrix multiplication is omnipresent in mathematics. Below we give an example, which is a baby version of Google's famous page rank algorithm.

Suppose that $20$ % of the people living in the suburbs move to the big city and that $30$ % of the people living in the big city move to the suburbs per year.

Aiming for a model using probabilities, let us be a bit more precise.

If you live in the suburbs, the probability that you move to the big city is $0.2$ ,
If you live in the suburbs, the probability that you do not move is $0.8$ .
If you live in the big city the probability that you move to the suburbs is $0.3$ .
If you live in the big city the probability that you do not move is $0.7$ .

All of the above probabilities are per year and can be illustrated in the diagram below

Using this model, we want to predict how many people live in the big city and in the suburbs after a number of years, given the initial populations at time $t = 0$ (years): $x_0$ in the big city and $y_0$ in the suburbs.

How many people $x_1$ and $y_1$ live in the two places after the first year ( $t=1$ )?

The population of the big city will decrease by $30\%$ , but there are newcomers amounting to $20\%$ of the population in the suburbs. Therefore

$x_1 = 0.7 x_0 + 0.2 y_0.$ In the same way,

$y_1 = 0.3 x_0 + 0.8 y_0.$ Using matrix multiplication, these two equations can be written

$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} 0.7 & 0.2\\ 0.3 & 0.8 \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}.$ For $t=2$ years, we can repeat the procedure and the result becomes

$\begin{aligned} \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} &= \begin{pmatrix} 0.7 & 0.2\\ 0.3 & 0.8 \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} 0.7 & 0.2\\ 0.3 & 0.8 \end{pmatrix} \left(\begin{pmatrix} 0.7 & 0.2\\ 0.3 & 0.8 \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}\right)\\ &= \left( \begin{pmatrix} 0.7 & 0.2\\ 0.3 & 0.8 \end{pmatrix} \begin{pmatrix} 0.7 & 0.2\\ 0.3 & 0.8 \end{pmatrix}\right) \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = P^2 \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \end{aligned}\tag{3.7}$ where

$P=\begin{pmatrix} 0.7 & 0.2\\ 0.3 & 0.8 \end{pmatrix}. \tag{3.8}$ In general we have the formula

$\begin{pmatrix} x_n \\ y_n \end{pmatrix} = P^n \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \tag{3.9}$

giving the distribution of the populations for $t = n$ years. Let us experiment a little:

$\begin{aligned} P^2 &= \begin{pmatrix} 0.55 & 0.3\\ 0.45 & 0.7 \end{pmatrix}\\ P^3 = P P^2 &= \begin{pmatrix} 0.475 & 0.35\\ 0.525 & 0.65 \end{pmatrix}\\ P^4 = P P^3 &= \begin{pmatrix} 0.4375 & 0.375\\ 0.5625 & 0.625 \end{pmatrix}\\ &\vdots\\ P^{15} &= \begin{pmatrix} 0.400018 & 0.399988\\ 0.599982 & 0.600012 \end{pmatrix}\\ P^{16} &= \begin{pmatrix} 0.400009 & 0.399994\\ 0.599991 & 0.600006 \end{pmatrix} \end{aligned}$

It seems that the distribution stabilizes at around $40\%$ of the original total population living in the big city and $60\%$ living in the suburbs.
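
The powers of $P$ can be computed with `np.linalg.matrix_power`. In the sketch below NumPy is assumed to be imported as `np`, and the starting populations in `start` are hypothetical numbers chosen only for illustration.

```python
import numpy as np

P = np.array([[0.7, 0.2],
              [0.3, 0.8]])

P2 = np.linalg.matrix_power(P, 2)
P16 = np.linalg.matrix_power(P, 16)

# Hypothetical starting populations: 1000 in the big city, 1000 in the suburbs.
start = np.array([1000.0, 1000.0])
after_16_years = P16 @ start     # formula (3.9) with n = 16

print(P16)
print(after_16_years)
```

Since every column of $P$ sums to $1$, the total population is preserved from year to year; only its distribution changes.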

The matrix $P$ is an example of a stochastic $2\times 2$ matrix. In general, a square matrix is called a stochastic matrix if its entries are $\geq 0$ and the entries in each of its column vectors sum to $1$ .

A simple example of the page rank algorithm is given in Example 2.13. There you encountered the equations

$\begin{aligned} T_2 &= T_1 + \tfrac{1}{2} T_4\\ T_3 &= T_2\\ T_4 &= T_3\\ T_1 &= \tfrac{1}{2} T_4\\ T_1 + T_2 + T_3 + T_4 &= 1. \end{aligned}$

In terms of matrix multiplication the first four equations can be rewritten as

$\begin{pmatrix} 0 & 0 & 0 & \tfrac{1}{2} \\[5pt] 1 & 0 & 0 & \tfrac{1}{2}\\[5pt] 0 & 1 & 0 & 0 \\[5pt] 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} T_1 \\[5pt] T_2 \\[5pt] T_3 \\[5pt] T_4 \end{pmatrix} = \begin{pmatrix} T_1 \\[5pt] T_2 \\[5pt] T_3 \\[5pt] T_4 \end{pmatrix}.$ Putting

$P = \begin{pmatrix} 0 & 0 & 0 & \tfrac{1}{2} \\[5pt] 1 & 0 & 0 & \tfrac{1}{2}\\[5pt] 0 & 1 & 0 & 0 \\[5pt] 0 & 0 & 1 & 0 \end{pmatrix}$ we get a stochastic matrix and may again iterate and compute $P, P^2, P^3, \dots$ .

Is there a connection between the entries of $P^N$ , where $N$ is very big and the solutions to the linear equations?

At the end of Example 3.8 (above) a stochastic matrix is defined. Show that the matrix product of two $n\times n$ stochastic matrices is a stochastic matrix.

Below is an example, where matrix multiplication occurs in networks.

Suppose we have five cities connected with roads as shown below

This network has a so called $5\times 5$ incidence matrix, where city $i$ is associated with the $i$ -th row and $i$ -th column. A $1$ in the matrix in the $(i, j)$ entry means that there is a road from city $i$ to city $j$ , whereas a $0$ means that city $i$ and city $j$ are not connected by a road:

$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0\\ 1 & 0 & 1 & 1 & 0\\ 1 & 1 & 0 & 1 & 1\\ 0 & 1 & 1 & 0 & 1\\ 0 & 0 & 1 & 1 & 0 \end{pmatrix}.$ Here

$A^2 = \begin{pmatrix} 2 & 1 & 1 & 2 & 1 \\ 1 & 3 & 2 & 1 & 2 \\ 1 & 2 & 4 & 2 & 1 \\ 2 & 1 & 2 & 3 & 1 \\ 1 & 2 & 1 & 1 & 2 \end{pmatrix}\quad\text{and}\quad A^3 = \begin{pmatrix} 2 & 5 & 6 & 3 & 3 \\ 5 & 4 & 7 & 7 & 3 \\ 6 & 7 & 6 & 7 & 6 \\ 3 & 7 & 7 & 4 & 5 \\ 3 & 3 & 6 & 5 & 2 \end{pmatrix}.$ What is the interpretation of $A^2, A^3$ and $A^n$ in general? It turns out that the entry $(i, j)$ in the matrix $A^n$ exactly is the number of paths of length $n$ from city $i$ to city $j$ .

For example, there are $3$ paths from city $1$ to city $5$ of length $3$ corresponding to the paths $1245, 1345, 1235$ . The $2$ paths from city $1$ to city $1$ of length $3$ are $1231, 1321$ and the $5$ paths of length $3$ from city $1$ to city $2$ are $1342, 1242, 1312, 1212, 1232$ .
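
The path counts quoted above can be reproduced numerically. The sketch below assumes NumPy is imported as `np`; recall that the entry $(i, j)$ of the text corresponds to index `[i - 1, j - 1]` in NumPy.

```python
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1],
              [0, 1, 1, 0, 1],
              [0, 0, 1, 1, 0]])

A3 = np.linalg.matrix_power(A, 3)

paths_1_to_5 = A3[0, 4]   # paths of length 3 from city 1 to city 5
paths_1_to_1 = A3[0, 0]   # paths of length 3 from city 1 back to city 1
paths_1_to_2 = A3[0, 1]   # paths of length 3 from city 1 to city 2

print(paths_1_to_5, paths_1_to_1, paths_1_to_2)
```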

A deeper explanation

Suppose that we have a network with $m$ cities and incidence matrix $A$ .

The general proof of the observations made in the example above builds on the fact that a path of length $n$ from city $i$ to city $j$ has to end with a road from a neighboring city $k$ to $j$ . For each of these neighboring cities $k$ , we count the number of paths of length $n-1$ from city $i$ to city $k$ . If $A^{n-1}_{gh}$ is the number of paths of length $n-1$ from city $g$ to city $h$ , then matrix multiplication tells us that

$A^n_{i j} = A^{n-1}_{i 1} A_{1 j} + \cdots + A^{n-1}_{i m} A_{m j}$ This number is exactly the number of paths of length $n$ from city $i$ to city $j$ , since $A_{k j} = 1$ only when $k$ is a neighboring city to city $j$ (and $0$ otherwise).

3.4 Matrix arithmetic

Matrix multiplication is very different from ordinary multiplication of numbers: it is not commutative. Consider the matrices

$A= \begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}\qquad\text{and}\qquad B = \begin{pmatrix} 0 & 0\\ 1 & 0 \end{pmatrix}.$ Then

$A B = \begin{pmatrix} 1 & 0 \\ 0 & 0\end{pmatrix}\qquad \text{and} \qquad B A = \begin{pmatrix} 0 & 0 \\ 0 & 1\end{pmatrix}$ i.e., $A B \neq B A$ .
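
The same two products can be computed in NumPy (assumed imported as `np`) to see the failure of commutativity directly.

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

AB = A @ B
BA = B @ A

print(AB)
print(BA)
print(np.array_equal(AB, BA))   # False: the order of the factors matters
```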

Addition of matrices is like ordinary addition, except that you add the corresponding entries of the involved matrices.

3.4.1 Matrix addition

Addition of two matrices with the same number of rows and columns is defined below.

$\begin{pmatrix} a_{11} & \cdots & a_{1 n} \\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{m n} \end{pmatrix} + \begin{pmatrix} b_{11} & \cdots& b_{1 n} \\ \vdots & \ddots & \vdots\\ b_{m1} & \cdots & b_{m n} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & \cdots & a_{1 n} + b_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1}+b_{m1} & \cdots & a_{m n}+b_{mn} \end{pmatrix}.$

The zero matrix is the ( $m\times n$ ) matrix containing zero in all its entries. When its number of rows and columns are clear from the context it is simply denoted by $0$ . For $2\times 3$ matrices for example, we write

$0 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}.$

Give an example of a non-zero $2\times 2$ matrix $A$ , such that

$A^2 = 0.$

3.4.2 Multiplication of a number and a matrix

A matrix may be multiplied by a number $\lambda$ by multiplying each entry by the number:

$\lambda \begin{pmatrix} a_{11} & \cdots & a_{1 n} \\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{m n} \end{pmatrix} = \begin{pmatrix} \lambda a_{11} & \cdots & \lambda a_{1 n} \\ \vdots & \ddots & \vdots\\ \lambda a_{m1} & \cdots & \lambda a_{m n} \end{pmatrix}.$

Does there exist a number $\lambda$ , such that

$\lambda \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 4 & 6\\ 8 & 10 & 15 \end{pmatrix}?$

Let $A$ be a $2\times 2$ matrix, such that

$A B = B A,$ for every other $2\times 2$ matrix $B$ . Show that $A$ is a diagonal matrix of the form

$A = \begin{pmatrix} a & 0\\ 0 & a \end{pmatrix},$ where $a\in \mathbb{R}$ i.e., $A = a I_2$ .

3.4.3 The distributive law

Ordinary numbers $a, b, c$ satisfy $a (b + c) = a b + a c$ . This rule also holds for matrices and is called the distributive law (multiplication distributes over addition).

Let $B$ and $C$ be $m\times n$ matrices, $A$ an $r\times m$ matrix and $D$ an $n\times s$ matrix. Then

$A ( B + C) = A B + A C\qquad\text{and}\qquad (B + C) D = B D + C D.$

Let us start by looking at $A(B+C) = A B + A C$ . Here it suffices to do the proof, when $A$ is a row vector and $B, C$ column vectors, since

$(A (B+C))_{ij} = A_i (B+C)^j = A_i (B^j + C^j).$ For $(B + C) D = B D + C D$ , we may reduce to the case, where $B, C$ are row vectors and $D$ a column vector, since

$((B+C) D)_{ij} = (B+C)_i D^j = (B_i + C_i) D^j.$ Both of these cases follow using the distributive law for ordinary numbers.

Suppose that $A$ and $B$ are two $2\times 2$ matrices. Is it true that

$(A + B)^2 = A^2 + B^2 + 2 A B?$ What about

$(A + B) (A - B) = A^2 - B^2?$

3.4.4 The miraculous associative law

Strictly speaking, a product $A B C$ of three matrices $A, B$ and $C$ is not yet defined: we have only defined matrix multiplication for two matrices. There are two natural ways of evaluating $A B C$ :

$( A B ) C\qquad \text{and}\qquad A (B C).$

We can begin by multiplying $A$ by $B$ and then multiply $C$ from the right. However, we may just as well start by multiplying $B$ by $C$ and then multiply $A$ from the left.

It is in no way clear, that these two computations give the same result!

That this turns out to be true, is just one of many miracles in the universe (there is a rather cool mathematical explanation, though, addressed in an exercise below).

Let $A$ be an $m\times n$ matrix, $B$ an $n\times r$ matrix and $C$ an $r\times s$ matrix. Then

$(A B) C = A (B C).$

We must prove that

$((A B) C)_{ij} = (A (B C))_{ij}$ for $1\leq i \leq m$ and $1\leq j \leq s$ . The left hand side can be written

$\begin{aligned} (A B)_i C^j &= (A_i B^1, \dots, A_i B^r) C^j\\ &= (A_i B^1) C_{1j} + (A_i B^2) C_{2j} + \cdots + (A_i B^r) C_{rj}. \end{aligned}\tag{3.10}$ The right hand side is

$A_i (B C)^j = A_i \begin{pmatrix} B_1 C^j \\ \vdots \\ B_n C^j\end{pmatrix} = A_{i1} (B_1 C^j) + \cdots + A_{in} (B_n C^j). \tag{3.11}$

Writing the row-column multiplications in (3.10), we get

$\begin{aligned} &A_{i1} B_{11} C_{1j} + \cdots + A_{in} B_{n1} C_{1j} +\\ &A_{i1} B_{12} C_{2j} + \cdots + A_{in} B_{n2} C_{2j} +\\ &\vdots\\ &A_{i1} B_{1r} C_{rj} + \cdots + A_{in} B_{nr} C_{rj}. \end{aligned}\tag{3.12}$ Writing the row-column multiplications in (3.11), we get

$\begin{aligned} &A_{i1} B_{11} C_{1j} + \cdots + A_{i1} B_{1r} C_{rj} +\\ &A_{i2} B_{21} C_{1j} + \cdots + A_{i2} B_{2r} C_{rj} +\\ &\vdots\\ &A_{in} B_{n1} C_{1j} + \cdots + A_{in} B_{nr} C_{rj}. \end{aligned}\tag{3.13}$ The rows in the sum in (3.12) correspond to the columns in the sum (3.13). Therefore these sums are equal and $((A B) C)_{ij} = (A (B C))_{ij}$ .

The associative law $(A B) C = A (B C)$ is true, but in computing $A B C$ there can be a (big) difference in the number of multiplications in the two computations $A (B C)$ and $(A B) C$ i.e., efficiency is not associative for matrix multiplication. In the notation of Theorem 3.17, computing $(A B) C$ requires

$m n r + m r s = m r (n + s)$ multiplications, whereas computing $A (B C)$ requires

$n r s + m n s = n s (m + r)$ multiplications. If for example $m=10000, n = 10, r = 10000$ and $s = 10$ , then computing $(A B) C$ requires $2\cdot 10^9$ multiplications, whereas computing $A (B C)$ requires $2\cdot 10^6$ multiplications!
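
The two counts are easy to tabulate. The helper names `cost_left` and `cost_right` in the sketch below are illustrative; they simply evaluate the two formulas above.

```python
def cost_left(m, n, r, s):
    """Number of multiplications needed for (A B) C."""
    return m * n * r + m * r * s

def cost_right(m, n, r, s):
    """Number of multiplications needed for A (B C)."""
    return n * r * s + m * n * s

m, n, r, s = 10000, 10, 10000, 10
left = cost_left(m, n, r, s)
right = cost_right(m, n, r, s)

print(left, right, left // right)
```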

Verify the associative law for the three matrices

$A = \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{pmatrix}\qquad\text{and}\qquad C = \begin{pmatrix} 6 & 3\\ 5 & 2\\ 4 & 1 \end{pmatrix}$ by showing by explicit computation that

$(A B) C = A (B C).$

There is in fact a high-tech explanation of why the associative law for matrices holds, one that makes the calculations in the above proof superfluous and shows the raw power of abstract mathematics: suppose that $f: \mathbb{R}^n\rightarrow \mathbb{R}^m, g:\mathbb{R}^r\rightarrow \mathbb{R}^n$ and $h:\mathbb{R}^s\rightarrow \mathbb{R}^r$ are linear maps. Then $f\circ (g\circ h)$ and $(f\circ g)\circ h$ are both linear maps from $\mathbb{R}^s$ to $\mathbb{R}^m$ , such that

$(f\circ (g\circ h))(x) = ((f\circ g)\circ h)(x) = f( g ( h (x)))$ for every $x\in \mathbb{R}^s$ . How does this relate to the associative law for matrix multiplication?

3.5 The inverse matrix

You are allowed to divide by a number provided it is $\neq 0$ . Does it make sense to divide by matrices?

It does, but there are some matrices that correspond to the number $0$ that we are not allowed to divide by.

Let $A, B$ and $C$ be $n\times n$ matrices. Show that

$B A = I_n$ and

$A C = I_n$ implies that $B = C$ .

An $n\times n$ matrix $A$ is called invertible, if there exists an $n\times n$ matrix $B$ , such that

$A B = B A = I_n.$ In this case, $B$ is called the inverse matrix of $A$ and denoted $A^{-1}$ .

As a reality check, you should convince yourself that the $2\times 2$ matrix

$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ is not invertible. In fact, here the associative law from Theorem 3.17 is incredibly useful: if $A$ is an invertible matrix with inverse matrix $B$ and $A C = 0$ , then $C = 0$ :

$A C = 0 \implies B( A C) = B 0 = 0 \implies (B A) C = I_n C = C = 0.$

Show that a square matrix with a column or row consisting entirely of zeros cannot be invertible.

Suppose that

$A = \begin{pmatrix} a & b\\ c & d \end{pmatrix}$ with $D = a d - b c\neq 0$ . Prove that $A$ is invertible with

$A^{-1} = \frac{1}{D}\begin{pmatrix} d & -b\\ -c & a \end{pmatrix}.$

When is a square diagonal matrix invertible? Look first at the $2\times 2$ case:

$\begin{pmatrix} a & 0\\ 0 & d \end{pmatrix}.$

The inverse matrix can be computed in numpy:
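
A minimal sketch, assuming NumPy is imported as `np`. The function `np.linalg.inv` computes the inverse of a square matrix; the example matrix is the $2\times 2$ matrix from section 3.2.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# np.linalg.inv raises numpy.linalg.LinAlgError if A is not invertible.
A_inv = np.linalg.inv(A)

print(A_inv)
print(np.allclose(A @ A_inv, np.eye(2)))   # A A^{-1} is the identity up to rounding
```

The results are floating point numbers, so comparisons are best done with `np.allclose` rather than exact equality.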

The inverse matrix enters the picture when solving $n$ linear equations with $n$ unknowns:

$\begin{aligned} a_{11}x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1\\ &\vdots\\ a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n \end{aligned}$ can be rewritten using matrix notation as

$\begin{pmatrix} a_{11} & \cdots & a_{1 n} \\ \vdots & \ddots & \vdots\\ a_{n1} & \cdots & a_{n n} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}$ or more compactly as $A x = b$ .

If $A$ is invertible, then the associative law gives the following:

$\begin{aligned} A x &= b \iff\\ A^{-1} \left(A x\right) &= A^{-1} b \iff \\ (A^{-1} A) x &= A^{-1} b \iff\\ I x &= A^{-1} b\iff\\ x &= A^{-1} b.\\ \end{aligned}$

The inverse matrix gives the solution to the linear equations $A x = b$ just by one matrix multiplication!

The system of linear equations

$\begin{matrix} &5 x &+ &3 y &= &13\\ &3 x &+&2 y &= &8 \end{matrix} \tag{3.14}$ can be rewritten using matrix multiplication to

$A v = b,$ where

$A = \begin{pmatrix} 5& 3 \\ 3 & 2 \end{pmatrix}, \qquad v = \begin{pmatrix} x \\ y \end{pmatrix}\qquad \text{and}\qquad b = \begin{pmatrix} 13 \\ 8 \end{pmatrix}.$

Here $A$ is invertible and

$A^{-1} = \begin{pmatrix} 2 & -3\\ -3 & 5 \end{pmatrix}.$ One simple matrix multiplication

$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 & -3\\ -3 & 5 \end{pmatrix} \begin{pmatrix} 13 \\ 8 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ shows the solution we expect from (3.14).
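
The same computation in NumPy (assumed imported as `np`) might look as follows. The second approach, `np.linalg.solve`, solves $A v = b$ directly without forming the inverse; it is shown here only for comparison and is not part of the text above.

```python
import numpy as np

A = np.array([[5.0, 3.0],
              [3.0, 2.0]])
b = np.array([13.0, 8.0])

v = np.linalg.inv(A) @ b          # v = A^{-1} b, as in the text
v_solve = np.linalg.solve(A, b)   # solves A v = b without forming A^{-1}

print(v, v_solve)
```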

The product of two invertible matrices (when this makes sense) is an invertible matrix. This is the content of the following result.

The product $A B$ of two invertible matrices $A$ and $B$ is invertible and $(A B)^{-1} = B^{-1} A^{-1}$ .

We must check that

$(B^{-1} A^{-1}) (A B) = I\qquad\text{and}\qquad A B (B^{-1} A^{-1}) = I.$ Let us check the first condition using the associative law:

$\begin{aligned} (B^{-1} A^{-1}) (A B) &= ((B^{-1} A^{-1}) A) B\\ &= (B^{-1} (A^{-1} A)) B \\ &= (B^{-1} I) B = B^{-1} (I B) = B^{-1} B = I, \end{aligned}$ where $I$ denotes the identity matrix. The condition $A B (B^{-1} A^{-1}) = I$ is verified in the same way.
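
A numerical illustration of the result, assuming NumPy is imported as `np`. The two invertible matrices are arbitrary examples. Note that the order of the factors is reversed: $A^{-1} B^{-1}$ is in general a different matrix.

```python
import numpy as np

A = np.array([[5.0, 3.0],
              [3.0, 2.0]])
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)          # B^{-1} A^{-1}
wrong_order = np.linalg.inv(A) @ np.linalg.inv(B)  # A^{-1} B^{-1}

print(np.allclose(lhs, rhs))
print(np.allclose(lhs, wrong_order))
```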

We have defined a matrix $A$ to be invertible if there exists a matrix $B$ , such that $A B = I$ and $B A = I$ . Suppose that only $B A = I$ . Can we then conclude that $A B = I$ ?

Find the mistake in the argument below.

Suppose that $B A = I$ . Then for every $y\in \mathbb{R}^n$ we have $A x = y \implies x = (B A) x = B y$ . Therefore $A (B y) = (A B) y = y$ for every $y\in \mathbb{R}^n$ and we have proved that $A B = I$ .

Let

$N = \begin{pmatrix} 0 & 1 & 1 & 1\\ 0 & 0 & 1 & 1\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 \end{pmatrix}.$ Compute the powers $N^k$ for $k\geq 2$ i.e., $N^2, N^3, \dots$ . Now let

$A = I + N,$ where $I = I_4$ . Show that $A$ is invertible, and

$A^{-1} = I - N + N^2 - N^3.$ Compute $A^{-1}$ .

Do you see a way of generalizing this computation to $n\times n$ matrices $N$ with a property shared by the $4\times 4$ matrix above?

3.5.1 Well, how do I find the inverse of a matrix?

Finding the inverse of a matrix or deciding that the matrix is not invertible is a matter of solving systems of linear equations.

Given an $n\times n$ matrix $A$ , we need to see if there exists an $n\times n$ matrix $B$ , such that

$A B = I, \tag{3.15}$ where $I = I_n$ is the identity matrix of order $n$ . We can do this by computing the columns of $B$ . From the definition in (3.15), the $j$ -th column $B^j$ of $B$ must satisfy

$A B^j = I^j. \tag{3.16}$ This follows from the definition of matrix multiplication!

The identity in (3.16) is a system of $n$ linear equations in $n$ unknowns. The unknowns are the entries in the $j$ -th column $B^j$ of the inverse matrix $A^{-1}$ (if it exists).

Suppose that $A$ is a $2\times 2$ matrix. Then the inverse matrix $B$ (if it exists) can be computed from the systems of linear equations below.

$A B^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\qquad\text{and}\qquad A B^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$ Writing

$\begin{pmatrix} x \\ y \end{pmatrix} = B^1\qquad\text{and}\qquad \begin{pmatrix} u \\ v\end{pmatrix} = B^2$ for the first and second columns, the systems of linear equations can be written as

$\begin{matrix} &A_{11} x &+ &A_{12} y &= &1\\ &A_{21} x &+ &A_{22} y &= &0 \end{matrix}\qquad\text{and}\qquad \begin{matrix} &A_{11} u &+ &A_{12} v &= &0\\ &A_{21} u &+ &A_{22} v &= &1 \end{matrix},$ where

$B = \begin{pmatrix} x & u\\ y & v \end{pmatrix}.$ A concrete example along with a useful way of keeping track of the computation is presented in the video below.

Compute the inverse of the matrix

$A = \begin{pmatrix} 1 & 1 & 1\\ 1 & 2 & 1\\ 1 & 1 & 3 \end{pmatrix}$ by employing the method of solving linear equations above. Explain the steps in your computation. You may find it useful to collect inspiration from the video in Example 3.30.

3.6 The transposed matrix

The transpose of an $m\times n$ matrix $A$ is the $n\times m$ matrix $A^\top$ given by

$A^\top_{i j} = A_{j i}.$ As an example, we have

$\begin{pmatrix} 0 & 2 & 4 & -2\\ 3 & 2 & 7 & 4 \end{pmatrix}^\top = \begin{pmatrix} 0 & 3\\ 2 & 2\\ 4 & 7\\ -2 & 4 \end{pmatrix}.$ Notice also that $(A^\top)^\top = A$ for an arbitrary matrix $A$ .
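
In NumPy the transpose of an array is available through the attribute `.T`. The short sketch below repeats the example above; the variable names are only illustrative.

```python
import numpy as np

# The 2x4 matrix from the example above.
A = np.array([[0, 2, 4, -2],
              [3, 2, 7, 4]])

At = A.T                         # the transposed 4x2 matrix
print(At.shape)                  # (4, 2)
print(At)
print(np.array_equal(At.T, A))   # transposing twice gives A back
```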

Let $A$ be an $m\times r$ matrix and $B$ an $r\times n$ matrix. Then

$(A B)^\top = B^\top A^\top.$

By definition $(A B)^\top_{i j} = (A B)_{j i}$ . This entry is given by row-column multiplication of the $j$ -th row in $A$ and the $i$ -th column in $B$ , which is the row-column multiplication of the $i$ -th row in $B^\top$ and the $j$ -th column in $A^\top$ .
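
As a sanity check of the rule $(A B)^\top = B^\top A^\top$ , one can compare both sides for matrices with random integer entries. The sizes and the seed below are arbitrary assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(2, 3))   # an m x r matrix with m = 2, r = 3
B = rng.integers(-5, 6, size=(3, 4))   # an r x n matrix with r = 3, n = 4

lhs = (A @ B).T      # transpose of the product, an n x m matrix
rhs = B.T @ A.T      # product of the transposes in reversed order

print(lhs.shape, rhs.shape)
print(np.array_equal(lhs, rhs))
```

Notice that the product `A.T @ B.T` is not even defined for these sizes, since `A.T` has 2 columns and `B.T` has 4 rows.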

Let $A$ be a quadratic matrix. Prove that $A$ is invertible if and only if $A^\top$ is invertible.

In the sage window below, you are supposed to experiment a bit by entering an arbitrary matrix $B$ and studying the quadratic matrix $B B^\top$ . Is there anything special about this product? Press the Further explanation button below the sage window to display the rest of the exercise after(!) you have completed your experimentation.

Further explanation

A quadratic matrix $A$ is called symmetric if $A = A^\top$ . Prove that

$B B^\top$ is a symmetric matrix, where $B$ is an arbitrary matrix.

3.7 Symmetric matrices

A (quadratic) matrix $A$ is called symmetric if $A = A^\top$ . Visually, this means that $A$ is symmetric around the diagonal like the $3\times 3$ matrix

$\begin{pmatrix} 1 & \color{blue}{2} & 3\\ \color{blue}{2} & 5 & 4\\ 3 & 4 & 6 \end{pmatrix},$ but not like the $3\times 3$ matrix

$\begin{pmatrix} 1 & \color{blue}{2} & 3\\ \color{red}{4} & 5 & 6\\ 7 & 8 & 9 \end{pmatrix}.$
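
The definition translates directly into a small NumPy test. The helper `is_symmetric` below is a hypothetical function written for this illustration, not a NumPy function; it is applied to the two $3\times 3$ matrices above.

```python
import numpy as np

def is_symmetric(M):
    """Return True if M is quadratic (square) and equal to its transpose."""
    return M.shape[0] == M.shape[1] and np.array_equal(M, M.T)

S = np.array([[1, 2, 3],
              [2, 5, 4],
              [3, 4, 6]])
T = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(is_symmetric(S))   # True
print(is_symmetric(T))   # False, e.g. T[0, 1] = 2 but T[1, 0] = 4
```

For matrices with floating point entries, `np.allclose(M, M.T)` is usually a better comparison than exact equality.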

Show that

$B^\top A B$ is a symmetric matrix, when $A$ is a symmetric matrix and $B$ is an arbitrary matrix. Both matrices are assumed quadratic of the same dimensions.

If $A$ is a symmetric $n\times n$ matrix, we define the function $f_A: \mathbb{R}^n\rightarrow \mathbb{R}$ given by

$f_A(v) = v^\top A v.$

This definition is rather compact. Let us consider the following example for $n=2$ .

$A = \begin{pmatrix} a & c \\ c & b \end{pmatrix}\quad\text{and}\quad v = \begin{pmatrix} x \\ y \end{pmatrix}.$ Then

$\begin{aligned} v^\top A v &= \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & c \\ c & b \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a x + c y \\ c x + b y \end{pmatrix}\\ &=x ( a x + c y) + y (c x + b y) = a x^2 + b y^2 + 2 c x y. \end{aligned}$
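
The same computation can be carried out in NumPy. In the sketch below the numbers $a, b, c, x, y$ are arbitrary choices and `quadratic_form` is a hypothetical helper name; the point is only to compare $v^\top A v$ with the expanded expression $a x^2 + b y^2 + 2 c x y$ .

```python
import numpy as np

def quadratic_form(A, v):
    """Compute f_A(v) = v^T A v for a 1-dimensional array v."""
    return v @ A @ v

# Arbitrary numbers for the illustration.
a, b, c = 2.0, 3.0, -1.0
x, y = 1.5, -2.0

A = np.array([[a, c],
              [c, b]])
v = np.array([x, y])

value = quadratic_form(A, v)
expanded = a * x**2 + b * y**2 + 2 * c * x * y

print(value, expanded)   # both are 22.5
```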

You are also encouraged to watch the short video below for an example with concrete numbers.

Inside the set of the symmetric matrices we find two very important subsets of matrices: the positive definite and the positive semi-definite matrices. They correspond to positive and non-negative real numbers.

3.7.1 Positive definite matrices

A symmetric matrix $A$ is called positive definite if

$f_A(v) > 0$ for every $v\in \mathbb{R}^n\setminus\{0\}$ . Probably the first example of a positive definite $2\times 2$ -matrix one thinks of is $A$ being the identity matrix. Here

$\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = x^2 + y^2.$

Of course, $x^2 + y^2 \geq 0$ , and $x^2 + y^2 = 0$ if and only if $x = y = 0$ , that is, $\begin{pmatrix} x\\ y \end{pmatrix} = 0$ . Therefore the identity matrix is positive definite.

Give examples of (non-zero) $1\times 1$ and $2\times 2$ matrices that are positive definite and ones that fail to be positive definite.

When is a $2\times 2$ diagonal matrix positive definite?

Let $A$ be a symmetric $n\times n$ matrix. Show that $A$ is not positive definite if $A_{11} < 0$ .

3.7.2 Positive semi-definite matrices

A symmetric matrix $A$ is called positive semi-definite if

$f_A(v) \geq 0$ for every $v\in \mathbb{R}^n$ . A positive definite matrix is positive semi-definite. Probably the first example one thinks of, of a $2\times 2$ matrix that is positive semi-definite but not positive definite, is the zero matrix. Here

$\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 0,$

for every $x, y\in \mathbb{R}$ .

Give an example of a non-zero matrix that is positive semi-definite, but not positive definite.

When is a $2\times 2$ diagonal matrix positive semi-definite?

3.7.3 Symmetric reductions

As you probably have noticed, it is rather straightforward to see when a diagonal matrix is positive (semi)definite. For a general symmetric matrix, one needs to reduce to the case of a diagonal matrix. This is done using the following result.

Let $A$ be a symmetric $n\times n$ matrix and $B$ an invertible $n\times n$ matrix. Then $A$ is positive (semi) definite if and only if

$B^\top A B$ is positive (semi) definite.

Every vector $v\in \mathbb{R}^n$ is equal to $B u$ for a unique $u\in \mathbb{R}^n$ , since $B$ is invertible. Why? Because the equation

$v = B u$ can be solved by multiplying both sides by $B^{-1}$ giving

$v = B u \iff B^{-1} v = B^{-1} (B u) = (B^{-1} B) u = u.$

So we get

$v^\top A v = (B u)^\top A (B u) = u^\top (B^\top A B) u.$ This computation shows that $A$ is positive (semi) definite if $B^\top A B$ is positive (semi) definite. The same reasoning with $u = B^{-1} v$ shows that $B^\top A B$ is positive (semi) definite if $A$ is positive (semi) definite.

Notice that it is important that $B v = 0$ only happens when $v=0$ .
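
The key identity $v^\top A v = u^\top (B^\top A B) u$ for $v = B u$ can be tested numerically. The matrices and the random seed below are arbitrary assumptions made for the illustration; the check is not a substitute for the proof.

```python
import numpy as np

# A symmetric matrix and an invertible matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 2.0],
              [0.0, 1.0]])

C = B.T @ A @ B   # the matrix B^T A B

rng = np.random.default_rng(1)
checks = []
for _ in range(5):
    u = rng.standard_normal(2)   # a random vector u
    v = B @ u                    # the corresponding vector v = B u
    checks.append(np.isclose(v @ A @ v, u @ C @ u))

print(all(checks))
```

Since $B$ is invertible, every $v$ arises from exactly one $u$ , so the values taken by $f_A$ and by $f_{B^\top A B}$ on non-zero vectors are the same.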

Let

$D = \begin{pmatrix} d & 0\\ 0 & e \end{pmatrix}$ be a diagonal matrix. What conditions must the diagonal entries $d$ and $e$ satisfy in order for $D$ to be positive definite?

Let

$A = \begin{pmatrix} a & c\\ c & b \end{pmatrix}$ denote a symmetric $2\times 2$ matrix, where $a\neq 0$ . Let

$B = \begin{pmatrix} 1 & -\frac{c}{a}\\ 0 & 1 \end{pmatrix}.$ Show that $B$ is invertible and compute

$B^\top A B.$ Use this to show that $A$ is positive definite if and only if $a>0$ and $a b - c^2 > 0$ .

Let $f:\mathbb{R}^2\rightarrow \mathbb{R}$ be the function defined by

$f(x, y) = 2 x^2 + 3 y^2 + 4 x y.$ Show that $f(x, y)\geq 0$ for every $x, y\in \mathbb{R}$ .
