Subsection 12.1.1 Solutions
The method of solving systems of equations by matrices that we will look at is based on procedures involving equations that we are familiar with from previous mathematics courses. The main idea is to reduce a given system of equations to another simpler system that has the same solutions.
Definition 12.1.1. Solution Set.
Given a system of equations involving real variables \(x_1\text{,}\) \(x_2, \ldots \text{,}\) \(x_n\text{,}\) the solution set of the system is the set of \(n\)-tuples in \(\mathbb{R}^n\text{,}\) \(\left(a_1, a_2, \ldots ,a_n\right)\) such that the substitutions \(x_1= a_1\text{,}\) \(x_2=
a_2, \ldots\text{,}\) \(x_n= a_n\) make all the equations true.
In terms of logic, a solution set is a truth set of a system of equations, which is a proposition over \(n\)-tuples of real numbers.
In general, if the variables are from a set \(S\text{,}\) then the solution set will be a subset of \(S^n\text{.}\) For example, in number theory mathematicians study Diophantine equations, where the variables can only take on integer values instead of real values.
Definition 12.1.2. Equivalent Systems of Equations.
Two systems of linear equations are called equivalent if they have the same set of solutions.
Example 12.1.3. Two equivalent systems.
The previous definition tells us that if we know that the system
\begin{equation*}
\begin{array}{l}
4 x_1+2 x_2+x_3=1 \\
2 x_1+x_2+x_3=4 \\
2 x_1+2 x_2+x_3=3 \\
\end{array}
\end{equation*}
is equivalent to the system
\begin{equation*}
\begin{array}{l}
\text{ }x_1+0 x_2+0x_3=-1 \\
0 x_1+x_2\text{ }+0x_3=-1 \\
0 x_1+0 x_2\text{ }+x_3= 7 \\
\end{array}
\end{equation*}
then both systems have the solution set \(\{(-1, -1, 7)\}\text{.}\) In other words, the simultaneous values \(x_1=-1\text{,}\) \(x_2= -1\text{,}\) and \(x_3= 7\) are the only values of the variables that make all three equations in either system true.
Subsection 12.1.2 Elementary Operations on Equations
Theorem 12.1.4. Elementary Operations on Equations.
If any sequence of the following operations is performed on a system of equations, the resulting system is equivalent to the original system:
Interchange any two equations in the system.
Multiply both sides of any equation by a nonzero constant.
Multiply both sides of any equation by a nonzero constant and add the result to a second equation in the system, with the sum replacing the latter equation.
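These three operations can be coded directly. The following is an illustrative sketch in plain Python (the list representation and function names are our own, not from the text), using exact rational arithmetic; it applies one elementary operation to the system of Example 12.1.3 and checks that a known solution still satisfies the result:

```python
from fractions import Fraction

# An equation a1*x1 + ... + an*xn = b is stored as the list [a1, ..., an, b].
def interchange(system, i, j):
    """Operation 1: interchange equations i and j."""
    system[i], system[j] = system[j], system[i]

def scale(system, i, c):
    """Operation 2: multiply both sides of equation i by a nonzero constant c."""
    assert c != 0
    system[i] = [c * t for t in system[i]]

def add_multiple(system, i, c, j):
    """Operation 3: add c times equation i to equation j, replacing equation j."""
    system[j] = [ej + c * ei for ei, ej in zip(system[i], system[j])]

def satisfies(system, xs):
    """Check whether the n-tuple xs makes every equation true."""
    return all(sum(a * x for a, x in zip(eq[:-1], xs)) == eq[-1] for eq in system)

# The system of Example 12.1.3 and its known solution (-1, -1, 7).
sys_eqs = [[Fraction(4), Fraction(2), Fraction(1), Fraction(1)],
           [Fraction(2), Fraction(1), Fraction(1), Fraction(4)],
           [Fraction(2), Fraction(2), Fraction(1), Fraction(3)]]
sol = (-1, -1, 7)

assert satisfies(sys_eqs, sol)
add_multiple(sys_eqs, 0, Fraction(-1, 2), 1)   # -1/2 * Eq1 + Eq2
assert satisfies(sys_eqs, sol)                 # the solution is preserved
```

The same check works for any sequence of the three operations, which is what the theorem asserts.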
Let us now use the above theorem to work out the details of
Example 12.1.3 and see how we can arrive at the simpler system.
The original system:
\begin{gather}
\begin{array}{c}
4 x_1+2 x_2+x_3=1 \\
2 x_1+x_2+x_3=4 \\
2 x_1+2 x_2+x_3=3
\end{array}\tag{12.1.1}
\end{gather}
Step 1. We will first change the coefficient of \(x_1\) in the first equation to one and then use it as a pivot to obtain 0’s for the coefficients of \(x_1\) in Equations 2 and 3.
Multiply Equation 1 by \(\frac{1}{4}\) to obtain
\begin{gather}
\begin{array}{c}
x_1+\frac{x_2}{2}+\frac{x_3}{4}=\frac{1}{4}\\
2 x_1+x_2+x_3=4 \\
2 x_1+2 x_2+x_3=3
\end{array}\tag{12.1.2}
\end{gather}
Multiply Equation 1 by \(-2\) and add the result to Equation 2 to obtain
\begin{gather}
\begin{array}{c}
x_1+\frac{x_2}{2}+\frac{x_3}{4}=\frac{1}{4} \\
0 x_1 + 0 x_2+ \frac{x_3}{2}=\frac{7}{2} \\
2 x_1+2 x_2+x_3=3
\end{array}\tag{12.1.3}
\end{gather}
Multiply Equation 1 by \(-2\) and add the result to Equation 3 to obtain
\begin{gather}
\begin{array}{c}
x_1+\frac{x_2}{2}+\frac{x_3}{4}=\frac{1}{4} \\
0 x_1 + 0 x_2+ \frac{x_3}{2}=\frac{7}{2} \\
0 x_1+ x_2+ \frac{x_3}{2}=\frac{5}{2}
\end{array}\tag{12.1.4}
\end{gather}
We’ve explicitly written terms with zero coefficients such as \(0 x_1\) to make a point that all variables can be thought of as being involved in all equations. After this example is complete, we will discontinue this practice in favor of the normal practice of making these terms “disappear.”
Step 2. We would now like to proceed in a fashion analogous to Step 1; namely, multiply the coefficient of \(x_2\) in the second equation by a suitable number so that the result is 1. Then use it as a pivot to obtain 0’s as coefficients for \(x_2\) in the first and third equations. This is clearly impossible (Why?), so we will first interchange Equations 2 and 3 and proceed as outlined above.
Exchange Equations 2 and 3 to obtain
\begin{gather}
\begin{array}{c}
x_1+\frac{x_2}{2}+\frac{x_3}{4}=\frac{1}{4} \\
0 x_1+ x_2+ \frac{x_3}{2}=\frac{5}{2} \\
0 x_1 + 0 x_2+ \frac{x_3}{2}=\frac{7}{2}
\end{array}\tag{12.1.5}
\end{gather}
Multiply Equation 2 by \(-\frac{1}{2}\) and add the result to Equation 1 to obtain
\begin{gather}
\begin{array}{c}
x_1+ 0x_2 + 0x_3 = -1 \\
0 x_1+ x_2+ \frac{x_3}{2}=\frac{5}{2} \\
0 x_1 + 0 x_2+ \frac{x_3}{2}=\frac{7}{2}
\end{array}\tag{12.1.6}
\end{gather}
Step 3. Next, we will change the coefficient of \(x_3\) in the third equation to one and then use it as a pivot to obtain 0’s for the coefficients of \(x_3\) in Equations 1 and 2. Notice that the coefficient of \(x_3\) is already zero in Equation 1, so we have been saved some work!
Multiply Equation 3 by \(2\) to obtain
\begin{equation*}
\begin{array}{c}
x_1+ 0x_2 + 0x_3 = -1 \\
0 x_1+ x_2+ \frac{x_3}{2}=\frac{5}{2} \\
0 x_1 + 0 x_2+ x_3=7
\end{array}
\end{equation*}
Multiply Equation 3 by \(-\frac{1}{2}\) and add the result to Equation 2 to obtain
\begin{gather}
\begin{array}{c}
x_1+ 0x_2 + 0x_3 = -1 \\
0 x_1+ x_2+ 0 x_3=-1 \\
0 x_1 + 0 x_2+ x_3=7 \\
\end{array}\tag{12.1.7}
\end{gather}
From the system of equations at the end of Step 3, we see that the solution to the original system is \(x_1=-1\text{,}\) \(x_2= -1\text{,}\) and \(x_3= 7\text{.}\)
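The three steps above can be replayed with exact rational arithmetic. This is a minimal plain-Python sketch (using `fractions.Fraction`; it is our own illustration, not part of the text) in which each assignment mirrors one elementary operation:

```python
from fractions import Fraction

# Augmented rows [a1, a2, a3, b] of the original system (12.1.1).
M = [[Fraction(v) for v in row] for row in
     [[4, 2, 1, 1], [2, 1, 1, 4], [2, 2, 1, 3]]]

# Step 1: scale Row 1 by 1/4, then clear the x1 coefficients below it.
M[0] = [t / 4 for t in M[0]]
M[1] = [b - 2 * a for a, b in zip(M[0], M[1])]   # -2*Eq1 + Eq2
M[2] = [b - 2 * a for a, b in zip(M[0], M[2])]   # -2*Eq1 + Eq3

# Step 2: the would-be x2 pivot in Row 2 is zero, so exchange Rows 2 and 3,
# then clear the x2 coefficient in Row 1.
M[1], M[2] = M[2], M[1]
M[0] = [a - Fraction(1, 2) * b for a, b in zip(M[0], M[1])]   # -1/2*Eq2 + Eq1

# Step 3: scale Row 3 so the x3 pivot is 1, then clear x3 in Row 2.
M[2] = [2 * t for t in M[2]]
M[1] = [a - Fraction(1, 2) * b for a, b in zip(M[1], M[2])]   # -1/2*Eq3 + Eq2

print([int(row[-1]) for row in M])   # [-1, -1, 7]
```

The final right-hand column is exactly the solution \(x_1=-1\), \(x_2=-1\), \(x_3=7\).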
Subsection 12.1.4 Elementary Row Operations
Theorem 12.1.5. Elementary Row Operations.
If any sequence of the following operations is performed on the augmented matrix of a system of equations, the resulting matrix is the augmented matrix of a system that is equivalent to the original system. The following operations on a matrix are called elementary row operations:
Exchange any two rows of the matrix.
Multiply any row of the matrix by a nonzero constant.
Multiply any row of the matrix by a nonzero constant and add the result to a second row, with the sum replacing that second row.
Definition 12.1.6. Row Equivalent Matrices.
Two matrices, \(A\) and \(B\text{,}\) are said to be row-equivalent if one can be obtained from the other by any sequence of zero or more elementary row operations.
If we use the notation \(R_i\) to stand for Row \(i\) of a matrix and \(\longrightarrow\) to stand for row equivalence, then
\begin{equation*}
A \overset{c R_i+ R_j}{\longrightarrow }B
\end{equation*}
means that the matrix \(B\) is obtained from the matrix \(A\) by multiplying Row \(i\) of \(A\) by \(c\) and adding the result to Row \(j\text{.}\) The operation of multiplying Row \(i\) by \(c\) is indicated by
\begin{equation*}
A \overset{c R_i}{\longrightarrow }B
\end{equation*}
while exchanging rows \(i\) and \(j\) is denoted by
\begin{equation*}
A \overset{R_i\leftrightarrow R_j}{\longrightarrow }B\text{.}
\end{equation*}
The matrix notation for the system given in our first example, with the subsequent steps, is:
\begin{equation*}
\begin{split}
\left(
\begin{array}{ccc|c}
4 & 2 & 1 & 1 \\
2 & 1 & 1 & 4 \\
2 & 2 & 1 & 3 \\
\end{array}
\right) & \overset{\frac{1}{4} R_1}{\text{ }\longrightarrow }\text{ }\left(
\begin{array}{ccc|c}
1 & \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\
2 & 1 & 1 & 4 \\
2 & 2 & 1 & 3 \\
\end{array}
\right) \text{ }\overset{-2 R_1+ R_2}{\text{ }\longrightarrow }\text{ }\left(
\begin{array}{ccc|c}
1 & \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\
0 & 0 & \frac{1}{2} & \frac{7}{2} \\
2 & 2 & 1 & 3 \\
\end{array}
\right) \\
& \overset{-2 R_1+ R_3}{\text{ }\longrightarrow }\text{ }\left(
\begin{array}{ccc|c}
1 & \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\
0 & 0 & \frac{1}{2} & \frac{7}{2} \\
0 & 1 & \frac{1}{2} & \frac{5}{2} \\
\end{array}
\right)\text{ }\overset{R_2\leftrightarrow R_3}{\text{ }\longrightarrow }\text{ }\left(
\begin{array}{ccc|c}
1 & \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\
0 & 1 & \frac{1}{2} & \frac{5}{2} \\
0 & 0 & \frac{1}{2} & \frac{7}{2} \\
\end{array}
\right) \\
& \overset{-\frac{1}{2} R_2+ R_1}{\text{ }\longrightarrow }\text{ }\left(
\begin{array}{ccc|c}
1 & 0 & 0 & -1 \\
0 & 1 & \frac{1}{2} & \frac{5}{2} \\
0 & 0 & \frac{1}{2} & \frac{7}{2} \\
\end{array}
\right)\text{ }\overset{2 R_3}{\text{ }\longrightarrow }\text{ }\left(
\begin{array}{ccc|c}
1 & 0 & 0 & -1 \\
0 & 1 & \frac{1}{2} & \frac{5}{2} \\
0 & 0 & 1 & 7 \\
\end{array}
\right) \text{ }\\
& \overset{-\frac{1}{2} R_3+ R_2}{\text{ }\longrightarrow }\text{ }\left(
\begin{array}{ccc|c}
1 & 0 & 0 & -1 \\
0 & 1 & 0 & -1 \\
0 & 0 & 1 & 7 \\
\end{array}
\right)
\end{split}
\end{equation*}
This again gives us the solution. This procedure is called the Gauss-Jordan elimination method.
It is important to remember when solving any system of equations via this or any similar approach that at any step in the procedure we can rewrite the matrix in “equation format” to help us to interpret the meaning of the augmented matrix.
In our first example we found a unique solution, only one triple, namely \((-1,-1,7)\text{,}\) which satisfies all three equations. For a system involving three unknowns, are there any other possible results? To answer this question, let’s review some basic facts from analytic geometry.
The graph of a linear equation in three-dimensional space is a plane. So geometrically we can visualize the three linear equations as three planes in three-space. Certainly the three planes can intersect in a unique point, as in the first example, or two of the planes could be parallel. If two planes are parallel, there are no common points of intersection; that is, there is no triple of real numbers that will satisfy all three equations. Another possibility is that the three planes could intersect in a common line. In this case, there would be an infinite number of real number triples in \(\mathbb{R}^3\text{.}\) Yet another possibility would be if the first two planes intersect in a line, but that line is parallel to, but not on, the third plane, giving us no solution. Finally, if all three equations describe the same plane, the solution set would be that plane.
We can generalize these observations. In a system of \(n\) linear equations in \(n\) unknowns, there can be
a unique solution,
no solution, or
an infinite number of solutions.
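The three outcomes can also be detected mechanically. The following rough plain-Python sketch (our own illustration, not from the text) reduces an augmented matrix with exact rational arithmetic and classifies the result; it is run on the system (12.1.1) and on the two systems examined in the examples that follow:

```python
from fractions import Fraction

def classify(rows):
    """Gauss-Jordan reduce an augmented matrix and classify its solution set."""
    M = [[Fraction(v) for v in r] for r in rows]
    m, n = len(M), len(M[0]) - 1          # m equations, n unknowns
    pivots = 0
    for col in range(n):
        # Find a row at or below the current pivot row with a nonzero entry.
        piv = next((r for r in range(pivots, m) if M[r][col] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[pivots], M[piv] = M[piv], M[pivots]
        M[pivots] = [t / M[pivots][col] for t in M[pivots]]
        for r in range(m):
            if r != pivots:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivots])]
        pivots += 1
    # A row of the form 0 = b with b nonzero means the system is inconsistent.
    if any(all(t == 0 for t in row[:n]) and row[n] != 0 for row in M):
        return "no solution"
    return "unique solution" if pivots == n else "infinite number of solutions"

print(classify([[4, 2, 1, 1], [2, 1, 1, 4], [2, 2, 1, 3]]))    # unique solution
print(classify([[1, 3, 1, 2], [1, 1, 5, 4], [2, 2, 10, 6]]))   # no solution
print(classify([[1, 6, 2, 1], [2, 1, 3, 2], [4, 2, 6, 4]]))    # infinite number of solutions
```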
To illustrate these points, consider the following examples:
Example 12.1.7. A system with no solutions.
Find all solutions to the system
\begin{equation*}
\begin{array}{l}
\text{ }x_1 +3 x_2\text{ }+x_3=2 \\
\text{ }x_1\text{ }+x_2 +5 x_3=4 \\
2 x_1+2 x_2+10 x_3=6 \\
\end{array}
\end{equation*}
The reader can verify that the augmented matrix of this system, \(\left(
\begin{array}{ccc|c}
1 & 3 & 1 & 2 \\
1 & 1 & 5 & 4 \\
2 & 2 & 10 & 6 \\
\end{array}
\right)\text{,}\) reduces to \(\left(
\begin{array}{ccc|c}
1 & 3 & 1 & 2 \\
1 & 1 & 5 & 4 \\
0 & 0 & 0 & -2 \\
\end{array}
\right)\text{.}\)
We can attempt to row-reduce this matrix further if we wish. However, any further row-reduction will not substantially change the last row, which, in equation form, is \(0x_1+ 0x_2+0x_3 = -2\text{,}\) or simply \(0=-2\text{.}\) It is clear that we cannot find real numbers \(x_1\text{,}\) \(x_2\text{,}\) and \(x_3\) that will satisfy this equation. Hence we cannot find real numbers that will satisfy all three original equations simultaneously. When this occurs, we say that the system has no solution, or the solution set is empty.
Example 12.1.8. A system with an infinite number of solutions.
Next, let’s attempt to find all of the solutions to:
\begin{equation*}
\begin{array}{l}
\text{ }x_1+6 x_2+2 x_3=1 \\
2 x_1\text{ }+x_2+3 x_3 =2 \\
4 x_1+2 x_2+6 x_3=4 \\
\end{array}
\end{equation*}
The augmented matrix for the system is
\begin{align}
\left(
\begin{array}{ccc|c}
1 & 6 & 2 & 1 \\
2 & 1 & 3 & 2 \\
4 & 2 & 6 & 4 \\
\end{array}
\right)\tag{12.1.8}
\end{align}
which reduces to
\begin{align}
\left(
\begin{array}{ccc|c}
1 & 0 & \frac{16}{11} & 1 \\
0 & 1 & \frac{1}{11} & 0 \\
0 & 0 & 0 & 0 \\
\end{array}
\right)\tag{12.1.9}
\end{align}
If we apply additional elementary row operations to this matrix, it will only become more complicated. In particular, we cannot get a one in the third row, third column. Since the matrix is in simplest form, we will express it in equation format to help us determine the solution set.
\begin{gather}
\begin{array}{l}
x_1 \quad+\frac{16}{11} x_3 =1 \\
\quad x_2+\frac{1}{11}x_3 =0\\
\quad\quad\quad \quad 0=0 \\
\end{array}\tag{12.1.10}
\end{gather}
Any real numbers will satisfy the last equation. However, the first equation can be rewritten as \(x_1 =1-\frac{16}{11}x_3\text{,}\) which describes the coordinate \(x_1\) in terms of \(x_3\text{.}\) Similarly, the second equation gives \(x_2\) in terms of \(x_3\text{.}\) A convenient way of listing the solutions of this system is to use set notation. If we call the solution set of the system \(S\text{,}\) then
\begin{equation*}
S = \left\{\left(1-\frac{16}{11}x_3, -\frac{1}{11}x_3, x_3\right) \mid x_3\in \mathbb{R}\right\}\text{.}
\end{equation*}
What this means is that if we wanted to list all solutions, we would replace \(x_3\) by all possible numbers. Clearly, there is an infinite number of solutions, two of which are \((1, 0, 0)\) and \((-15, -1, 11)\text{,}\) when \(x_3\) takes on the values 0 and 11, respectively.
A Word Of Caution: Frequently we can get “different-looking” answers to the same problem when a system has an infinite number of solutions. Suppose the solution set in this example is reported to be \(A = \left\{\left(1+16x_2, x_2, -11x_2\right) \mid x_2\in \mathbb{R}\right\}\text{.}\) Certainly the result described by \(S\) looks different from that described by \(A\text{.}\) To see whether they indeed describe the same set, we wish to determine whether every solution produced in \(S\) can be generated in \(A\text{.}\) For example, the solution generated by \(S\) when \(x_3=11\) is \((-15, -1, 11)\text{.}\) The same triple can be produced by \(A\) by taking \(x_2= -1\text{.}\) We must prove that every solution described in \(S\) is described in \(A\) and, conversely, that every solution described in \(A\) is described in \(S\text{.}\) (See Exercise 6 of this section.)
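Reading \(A\) as parametrized by \(x_2\text{,}\) that is, \(A = \left\{(1+16x_2,\, x_2,\, -11x_2) \mid x_2 \in \mathbb{R}\right\}\text{,}\) the two descriptions can be spot-checked against each other in code. This is an illustrative sketch (our own, not part of the text); the parameter maps \(x_2 = -x_3/11\) and \(x_3 = -11x_2\) are inverse to each other, which is the heart of the full proof:

```python
from fractions import Fraction

def s_solution(x3):
    """A triple from S, parametrized by x3."""
    x3 = Fraction(x3)
    return (1 - Fraction(16, 11) * x3, -Fraction(1, 11) * x3, x3)

def a_solution(x2):
    """A triple from A, parametrized by x2."""
    x2 = Fraction(x2)
    return (1 + 16 * x2, x2, -11 * x2)

# Each S-solution with parameter x3 equals the A-solution with x2 = -x3/11,
# and each A-solution with parameter x2 equals the S-solution with x3 = -11*x2.
for t in range(-5, 6):
    assert s_solution(t) == a_solution(Fraction(-t, 11))
    assert a_solution(t) == s_solution(-11 * t)

print(s_solution(11))   # (-15, -1, 11), also produced by a_solution(-1)
```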
To summarize the procedure in the Gauss-Jordan technique for solving systems of equations, we attempt to obtain 1’s along the main diagonal of the coefficient matrix with 0’s above and below the diagonal. We may find in attempting this that the objective cannot be met, as in the last two examples. Depending on the way we interpret the results in equation form, we either recognize that no solution exists, or we identify “free variables” on which an infinite number of solutions are based. The final matrix forms that we have produced in our examples are referred to as echelon forms.
In practice, larger systems of linear equations are solved using computers. Generally, the Gauss-Jordan algorithm is the most useful; however, slight variations of this algorithm are also used. The different approaches share many of the same advantages and disadvantages. The two major concerns of all methods are:
minimizing inaccuracies due to round-off errors, and
minimizing computer time.
Subsection 12.1.6 SageMath Note - Matrix Reduction
Given an augmented matrix \(C\text{,}\) there is a matrix method called echelon_form that can be used to row reduce \(C\text{.}\) Here is the result for the system in Example 12.1.11. In the assignment of a matrix value to \(C\text{,}\) notice that the first argument is QQ, which indicates that the entries should be rational numbers. As long as all the entries are rational, which is the case here since integers are rational, the row-reduced matrix will have all rational entries.
If we don’t specify the set from which entries are taken, it is assumed to be the integers, and we do not get a fully row-reduced matrix. This is because the next step in working with the output would involve multiplying row 2 by \(\frac{1}{2}\) and row 3 by \(\frac{1}{9}\text{,}\) but these multipliers are not integers.
If we specify real entries, the result isn’t as clean as the rational output.
The default number of decimal places may vary from what you see here, but it can be controlled. The single small number in row three, column four isn’t exactly zero because of round-off, but we could just set it to zero.
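The Sage session itself is not reproduced above. As a rough plain-Python analogue (our own sketch, not Sage code), `fractions.Fraction` plays the role of QQ, and floating point plays the role of real entries, exhibiting exactly the round-off residue described:

```python
from fractions import Fraction

row = [2, 1, 3, 2]

# Exact arithmetic (the analogue of rational QQ entries): dividing a row
# by its pivot keeps clean rational entries.
exact = [Fraction(t, row[0]) for t in row]
print(exact)           # [Fraction(1, 1), Fraction(1, 2), Fraction(3, 2), Fraction(1, 1)]

# Floating point (the analogue of real entries): round-off can leave a
# tiny nonzero residue where exact arithmetic gives 0.
a, b = 0.1, 0.2
residue = (a + b) - 0.3
print(residue == 0)    # False: a tiny round-off residue survives
```

This is why specifying QQ entries yields a fully reduced, exact result, while real entries can leave near-zero entries that must be interpreted as zero.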