Implementing frame analysis via the stiffness method should be a rite of passage for every student of structural engineering. As an undergraduate, I did an implementation in C++. I assembled the stiffness matrix and load vector, solved for nodal displacements, then assembled reactions—the FEA three-step. I don’t remember all the other details, but I am certain the professor gave us the algorithm for dense matrix Cholesky decomposition in order to solve Ax=b. The algorithm is implementable in only a few dozen lines of code. Cholesky decomposition is simple and gets the job done.
Now, as a professor, I recently implemented nonlinear frame analysis for a personal project requiring C++. For nonlinear analysis, you’re basically putting Ax=b inside a double-nested loop: time stepping is the outer loop, while Newton’s method is the inner loop. Both Newton’s method and time stepping are easy to implement. Inside a double-nested loop, dense matrix Cholesky decomposition is fine for small models, say a hundred equations. But as you move up to moderately sized models, on the order of thousands of equations, the time spent solving Ax=b adds up fast and dense matrix Cholesky decomposition becomes a bottleneck.
The issue is sparsity: most of the matrix entries are zero, and based on element connectivity in a model, you know a priori where those zeros will be. Dense matrix Cholesky decomposition plows through those zeros, performing unnecessary zero ops (add zero, multiply by zero). Moderate to large finite element models tend to produce very sparse matrices, with well over 90% of the entries being zero.
It turns out writing your own sparse matrix solver is not so straightforward. While operating on matrices in compressed column format is doable, elimination trees, equation re-ordering, and reducing fill-in are conceptually difficult.
Sure, I could download and link against C++ libraries for fast sparse solvers, libraries such as PETSc, Eigen3, and Trilinos. Those libraries have many bells and whistles for speed and parallel processing, which is essential for nonlinear analysis of very large models with hundreds of thousands of equations.
But for my personal project, I just want a simple sparse matrix solver: something that doesn’t suck, something that can handle a few thousand equations without having to worry about all the bells and whistles. So, why not write my own simple sparse solver? I don’t care if my sparse solver is big, fat, and slow relative to PETSc and the like; it just has to be much faster than dense Cholesky.
Fortunately, the book Direct Methods for Sparse Linear Systems by Timothy Davis explains how to build a sparse matrix solver from zero with sample code in C.
For someone fluent in C++, reading Davis’s C code is like reading Early Modern English, i.e., Shakespearean English—you recognize the words and understand the meaning, but you have to convert a lot. Some conversions are easy, like rewriting `malloc` and `free` as `new` and `delete`. Other conversions require some interpretation, like deciding which functions should be implemented as constructors, and which functions should be public and which should be private. As a bonus, Davis uses a ton of ternary conditional `?:` operators, even as array indices, which I love.
Regardless of how you interpret Davis’s C code and translate it into your programming language of choice, be it C++, Rust, or whatever, writing your own sparse matrix solver is an excellent way to learn data structures, algorithms, and memory management.
There is something deeply satisfying about developing your own solutions to technical issues you mastered long ago.