Coding the Gaussian Elimination Algorithm

Systems of linear equations are one of the foundations of linear algebra.

At its core, a system of linear equations is a set of two or more linear equations (of the form Ax + By = C ) that share the same variables. The solution to a system is the set of values for the variables such that each linear equation is simultaneously true.

Much of the field of linear algebra evolved to solve large systems faster. You’ll see these systems applied in the real world across quantitative fields such as physics, engineering, machine learning, computer graphics, operations research, and more.

Due to its dispersed presence, efficient problem-solving algorithms are crucial.

The primary problem-solving algorithm you learn in a college-level algebra course is Gaussian Elimination.

It’s not the most optimal method for solving large systems of linear equations, but it’s a fun and useful algorithm to learn.

Gaussian Elimination solves systems of linear equations using matrices. The process leverages elementary row operations to transform an augmented matrix into a row equivalent matrix in reduced row echelon form. The algorithm consists of two phases: forward elimination and backward elimination. We call the forward phase Gaussian Elimination and the backward phase Gauss-Jordan elimination.

I’m not entirely sure of the mathematically appropriate way to name this process, so I denote it as Gaussian Elimination here. Correct me if I’m wrong.

The forward elimination step transforms a matrix into row echelon form (REF).

A matrix is said to be of row echelon form if the following properties are true:

All zero rows, if any, are below all nonzero rows.
Each leading nonzero entry (pivot) of a row is in a column to the right of the preceding row’s leading entry.
All entries below each pivot are zeros.

Understanding REF matters because the backward elimination process goes one step further by transforming a matrix into reduced row echelon form (RREF). For RREF, the REF properties hold, while the following are also true:

Each pivot is equal to 1.
Each pivot is the only nonzero entry in its column (there are only zeros above and below each pivot).

Before explaining the Gaussian Elimination algorithm, there are a few concepts you need a high-level understanding of.

The first is row equivalence.

As mentioned, the forward and backward elimination processes transform a matrix into a row equivalent matrix in REF or RREF. Two matrices A and B are said to be row equivalent (A ~ B) if you can obtain B from A by executing a sequence of elementary row operations.

The three elementary row operations are:

Row Swapping: Switching the position of two rows in a matrix. (Useful for moving zero rows below all nonzero rows)
Row Scaling: Multiplying a row by a nonzero constant. (Useful for setting pivot entries equal to 1)
Row Replacement: Adding a multiple of one row to another. (Useful for eliminating entries below or above pivots)

In essence, an elementary row operation transforms rows using linear combinations of other rows. By definition, all linear combinations of vectors within a vector space also exist within the vector space. Using this principle, we see that applying a sequence of elementary row operations to a matrix results in a matrix whose rows are different expressions of the original rows, while spanning the same row space.

So, row equivalent matrices are simply different representations of the same linear system, hence why we can use them to solve systems of linear equations.

Okay, onto the Gaussian Elimination algorithm…

Below is the full process for executing the algorithm that transforms a matrix into row echelon form and reduced row echelon form.

Phase 1: Forward Elimination (Gaussian Elimination)
- Step 1: Identify the pivot
  - Start with the first column and top row
  - Find the first nonzero entry in that column
  - Swap rows, if needed, so the first nonzero entry is at the top.
  - If no rows have a nonzero entry in this column, move one column to the right and repeat this step
- Step 2: Create zeros for each element below the pivot
  - Use row replacement to eliminate nonzero entries in the pivot column for all rows below the current pivot row
- Step 3: Move to the next pivot column
  - Move one row down and one column to the right
- Step 4: Repeat
  - Repeat steps 1-3 until you run out of rows or all remaining rows are zeros
Phase 2: Backward Elimination (Gauss-Jordan Elimination)
- Step 1:
  - Start at the bottom-most pivot and use row scaling to set the pivot equal to 1 (if it’s not already 1)
- Step 2:
  - Use row replacement to make every entry above the pivot equal to 0
- Step 3:
  - Move to the next pivot above, and repeat the process until all pivots are equal to 1 and isolated in their columns

That’s the gist of Gaussian Elimination. It’s a simple set of steps to solve systems of linear equations using matrices.

I write this post because, as I progress through my self-study math journey, I aim to understand concepts deeply. I want to go beyond a high-level understanding and move towards a more intimate one, where I know how/why math concepts work the way they do and how they’re used in real-world applications.

I believe the best way to do this is by translating math concepts into code.

Doing calculations and solving problems by hand is crucial for learning, so I don’t skip that step. But by turning problem-solving techniques into code, I force myself to learn the inner workings of math concepts.

Seeing as solving systems of linear equations matters greatly to linear algebra and applied mathematics, writing a Gaussian Elimination algorithm in Python seemed like a perfect place to start.

I didn’t source control my Python script, so you cannot view it on GitHub. But as you’ll see at the end of this post, I’m working on a new project that will include the Gaussian Elimination algorithm. So, you can eventually view that on GitHub.

For now, I’ll break down what I wrote and include code snippets for all my functions…

The first function in my script verifies if the input matrix is valid. A matrix requires each row to have an equal number of columns, so my validation function checks for that.

def isValidMatrix(m: list[list[float]]) -> bool:
    rn = len(m) # Num of rows
    cn = len(m[0]) # Num of columns
    for r in range(1,len(m)): # skip first row since we check it two lines above
        if len(m[r]) != cn:
            raise ValueError("Invalid Matrix Dimension")
        
    return True

The input for isValidMatrix is of type list[list[float]]. Python’s numpy library already has a matrix object, but since I want to develop all of the logic on my own, I wrote this script without any libraries. So the embedded list input lets me create a pseudo-matrix type.

The outer list represents the matrix. The inner list represents each row. And the floats within the inner list represent a row’s entry for all columns. The rest of the functions I share below will also use the same pseudo-matrix type for inputs.

After writing the matrix validation function, I created functions for each elementary row operation (row swapping, row scaling, and row replacement). I could have put the logic for these within my REF and RREF finding functions, but separating them makes my program modular and the core functions easier to read.

def RowSwap(m: list[list[float]], r1: int, r2: int) -> list[list[float]]:
    """
    Parameters:
    -> m: list[list[float]] --> Represents matrix whose rows we want to swap
    -> r1: int --> Reprsents index for upper row
    -> r2: int --> Represents index for lower row

    Output:
    -> We output a matrix row equivalent to m
    """
    m2 = m
    m2[r1], m2[r2] = m2[r2], m2[r1]

    return m2
    
def RowScale(m: list[list[float]], r: int) -> list[list[float]]:
    """
    Parameters:
    -> m: list[list[float]] --> Represents matrix whose rows we want to swap
    -> r: int --> Reprsents index for target row to scale

    Output:
    -> A matrix row equivalent to m, with target row scaled at pivot
    """
    # Find first non-zero entry of list and divide all elements by it
    f = None
    for i in m[r]:
        if i != 0:
            f = i
            break
    m2 = m
    m2[r] = [i/f if i != 0 else 0 for i in m2[r]]

    return m2
    
def RowReplacement(m: list[list[float]], r1: int, r2: int) -> list[list[float]]:
    """
    Parameters:
    -> m: list[list[float]] --> Represents matrix whose rows we want to apply a linear combination to
    -> r1: int --> Reprsents index for target row (one being changed)
    -> r2: int --> Represents index for source/pivot row (one used to modify target row)
    
    Output:
    -> A matrix row equivalent to m
    """
    
    # First, we check if value below source pivot col is 0. If yes, we don't need to do anything
    spIndex = 0
    for i in range(len(m[r2])):
        if m[r2][i] != 0:
            spIndex = i
            break
    if m[r1][spIndex] == 0:
        return m
    
    m2 = [row for row in m] # Initialize output matrix
            
    k = abs(m2[r1][spIndex])
    # Check if target/source have same sign at target pivot index
    if (m2[r1][spIndex] > 0 and m2[r2][spIndex] > 0) or (m2[r1][spIndex] < 0 and m2[r2][spIndex] < 0):
        k *= -1
    
    # Build the new Target row by adding k * pivot row value to corresponding values in target row
    newTargetRow = []
    for i in range(len(m2[r1])):
        newTargetRow.append(m2[r1][i] + (k * m2[r2][i]))

		# Replace target row with new one
    m2[r1] = newTargetRow

    return m2

The comments in each function should explain the logic well, so read through them if you want to understand how the different functions work.

With the elementary operation functions set, I moved on to creating functions for forward and backward elimination.

# Forward Elimination
def FindRowEchelonForm(m: list[list[float]]):
    """
    Parameters:
    -> m: list[list[float]] is the matrix we want to transform to Row Echelon Form (REF)
    """

    """
    Here is full list of steps for the algorithm:
    1. Identify first pivot column
        - Start with left most column that contains a nonzero entry
        - Pivot is first nonzero # in that column (top-down)
        - If top entry = 0, swap with a row below that has a nonzero entry
    2. Make the pivot = 1 (optional for REF, but we'll do so here)
    3. Eliminate entries below the pivot
        - I can use my RowReplacement function for this
    4. Move to next pivot (one row down and one column to the right)
        - Repeat steps 1-3 for this submatrix (ignore rows above and columns to left)
    5. Stop when:
        - Each new pivot is to the right of the above one
        - All rows below pivots are zeros
        - All zero-rows (if any)are at the bottom
    """

    if not isValidMatrix(m):
        raise ValueError("Invalid Matrix Dimension")

    numrows = len(m)
    numcols = len(m[0])
    currPivot = 0
    currCol = 0
    currRow = 0
    
    OutputMatrix = m
    PivotTracker = {} # each item will look like --> PivotNum:ColNum
    """
    How Loop Should Work:
    1. If current checkpoint (a row/column) is 0, swap with the first row below having a non-zero value in same column
    2. If all rows have 0 for that column, it's a free variable, so move to the next column (but keep row the same)
    3. Scale pivot row to make the pivot 1
    4. Eliminate values in same column below pivot (make them 0)
    5. Repeat until done
        a. We know we're done when we run out of rows or columns
    """

    while currRow < numrows and currCol < numcols:
        # 1. Check current (currRow,currCol) pair to see if it's a pivot
        if OutputMatrix[currRow][currCol] == 0:
            # Iterate through all rows below to find a non-zero one to swap
            for i in range(currRow+1, numrows):
                if OutputMatrix[i][currCol] != 0:
                    OutputMatrix = RowSwap(OutputMatrix,currRow,i)
                    break
                else:
                    continue
            
            # If value is still 0, we have a free variable, so we move to next column
            if OutputMatrix[currRow][currCol] == 0:
                currCol += 1
                continue

        # Now assume we have a non-zero value, so we reduce column to have 1 as the pivot
        OutputMatrix = RowScale(OutputMatrix,currRow)

        # Now we save the Pivot Number and the row it exists in
        PivotTracker[currPivot] = currRow

        # Now we take all rows below pivot and Row Replace to make values under the current pivot all 0
        for i in range(currRow+1,numrows):
            OutputMatrix = RowReplacement(OutputMatrix,i,currRow)

        # Finally, we increment our variables
        currPivot += 1
        currRow += 1
        currCol += 1

    return OutputMatrix, PivotTracker

# Backward Elimination
def FindReducedRowEchelonForm(m: list[list[float]], p: dict[int:int]) -> list[list[float]]:
    """
    Parameters:
    -> m: list[list[float]] is the matrix we want to transform to Reduced Row Echelon Form (RREF)
        - We must input a matrix already in Row Echelon Form (REF)
    -> p: dict[int:int] represents all pivots with their corresponding column number in the REF matrix
        - So, each element in the dictionary has form -> PivotNum:ColNum
    """

    """
    Process for finding Reduced Row Echelon Form (Backward Elimination):
    --------------------------------------------------------------------
    1. Start at bottom pivot
    2. Use row replacement operations to make all entries above that pivot = 0
    3. Move upward pivot by pivot until all pivot columns haev zeros above and below pivot
    4. The pivots remain 1 (no scaling needed)
    """

    rref, pivots = [row for row in m], list(p.keys()) # Note: m is REF
    
    # Use reversed() to begin with bottom-most pivot
    for i in reversed(pivots):
        for r in range(i):
            rref = RowReplacement(m=rref, r1=r, r2=pivots[i])

    return rref

When combined into one script, the functions correctly solve systems of linear equations. However, there are two major caveats that make the script non-ideal.

The first is that FindReducedRowEchelonForm only returns a matrix in reduced row echelon form. It doesn’t specify the unique solution to a system. You need to review the last column in the output matrix and deduce the solution on your own.

The second caveat is that not all systems of linear equations have a unique solution. Sometimes, they have no solution or infinitely many solutions. Systems with infinitely many solutions have free variables (indicated by a zero row). You can deduce that your input system has infinitely many solutions on your own by searching for zero rows in the output.

Unfortunately, my script does not give you a formula that represents all possible solutions. Nor does it tell you if there are no solutions to a system.

Days after writing my script, I noticed improvements I could make to make the code useful and accurate for solving systems of linear equations.

Here are two of the things I would do differently…

The first is writing a more precise row replacement function. In RowReplacement, I added logic that reduces all pivots to 1. However, row echelon form does not require pivots to equal 1. I made this decision so that the backward elimination function would be simpler to write.

A proper row replacement function would force the target row’s pivot to equate 1. Instead, the RREF function would use row scaling to set pivots equal to 1.

The second thing I would do differently is change the input matrix parameter for FindRowEchelonForm and FindReducedRowEchelonForm . I made these functions take a matrix as input, but that matrix must be an augmented matrix.

When you solve systems of linear equations, you begin with an equation Ax = b. A represents a coefficient matrix, where each entry is a coefficient from your linear equations. x represents a column vector of unknowns, which is the solution you want to find. And b is a column vector of constants.

Using these objects, you create an augmented matrix by joining the coefficient matrix (A) and the column vector of constants (b). The last column of your augmented matrix represents the column vector, while the preceding columns come from your coefficient matrix.

My functions force you to input the augmented matrix, meaning you have to create it on your own. A better script would allow you to input the coefficient matrix and column vectors separately, then build the augmented matrix on its own.

Despite noticing these enhancements, I did not attempt to implement them.

Seeing that there are many ways to apply linear algebra to code, I decided to build a robust C++ tool that computes all vector and matrix operations you’d need from linear algebra. I slyly named the tool Claire (Custom Linear Algebra Implementation and Representation Engine).

There’s a lot to do for this project, and seeing as I’ve never coded in C++, there’s a lot to learn.

A Gaussian Elimination function for solving systems of linear equations will surely be included, so as I adapt my code from Python to C++, I will implement the enhancements I found.

More to come soon…