How knowing math helps you write better software
Projects Update
TLA+ workshop last week went really well! I'm super happy with the current state of the workshop and while I'll still be making a bunch of tweaks, it's now the most polished it's ever been.
I'll be taking this month off from running workshops to move apartments and dedicate more time to writing. I know I put a deadline for the first "Are We Really Engineers" essay for end of September, but I'd really like to have something mostly-coherent by the end of this month. Fingers crossed!
How knowing math helps you write better software
In my free time I've been trying to plug some gaps in my math background. Right now I'm learning graph theory, and after that plan to learn some probability and statistics. People often say that math is important for programming and that programmers should know more math. Computer science is a branch of math, after all!
Understanding mathematics is certainly important for understanding the broader context of computer science. But in terms of practical applications… most of the information out there is pretty unpersuasive. On one hand, you have people saying "learning mathematics will help you think better." But that's too fluffy and broad to be a good reason.1 On the other, you have people arguing that mathematics gives us a lot of core abstractions in programming, but never quite bridge the gap from that to "will help you."
So I like to justify why learning mathematics will make you, in your day-to-day life, write better software. Disclaimer: I'm going to be doing a lot of handwaving here and bring up very historically inaccurate examples. I call this style "Hillel is writing a newsletter and doesn't want to do research".
What does it mean to know math?
First, we need to talk about what we mean by "knowing math". When I think about knowing math, I think about it roughly in three distinct categories:
- Knowing how to use machinery to transform problems into solutions. This is things like arithmetic, calculus, the way that physicists and actuaries use math. The problem here is that we don't actually need all that much calculation. We have computers to do that for us!
- Understanding how to advance the field of math, like reading and writing proofs. This is what most mathematicians think of as "knowing math". Again, the problem here is that most of us don't need to write proofs. There are some domains that need that, but they already know that they need math.
- Understanding the outlines of what math has done and understanding how the real world relates to math. People don't think of knowing math in these terms, it feels too much like "space is pretty" popsci. This is also, to programmers, the most useful means of "knowing math".
Why do we have math, anyway?
Time for some extremely oversimplified handwaving. Math, very broadly, is developed for two purposes: to better understand reality, and to better understand the math we're using to better understand reality. These two motives constantly play off each other. An example:
- If I throw a rock straight up, roughly how long will it take to hit the ground?
- We can approximate its position with the equation
-gt²/2 + vt + h = 0
. Let's call this a quadratic equation. - We develop the quadratic formula to easily find solutions to that equation.
- We can generalize quadratics to cubic, quartic, quintic terms, giving us general polynomials. Are there corresponding formulas to get the roots of arbitrary polynomials?
- This ends up being way more complicated, so we start to discuss the abstract properties of polynomials, giving us algebraic rings.
- By studying rings as an abtract concept we develop theorems that, when applied back to polynomial rings, tells us that a general root-finding equation doesn't exist, but they do exist for special cases.
- A few of these special cases model optical lenses really well, helping us design better telescopes.
- With better telescopes, we find our existing models of celestial mechanics don't perfectly match up with reality.
- Time to develop better models...
Okay, historically it almost certainly did not develop this way, that's not the point. The core feedback loop is real: that continuous zigzagging between advancing our knowledge of the physical world and advancing our knowledge of the abstract world. It's seeing those bridges between those two worlds that makes "knowing math" so powerful. If you can see how your real-world problem is reducible to a mathematical construct, then you can use all of the abstract machinery we've developed to analyze that problem as the mathematical construct.
Let me give an example. This is a simple one I ran into that didn't actually use all that much mathematical knowledge, but it really elegantly shows that these bridges are useful.
Coin flips and die rolls
How do we model the probability distribution of rolling a six sided die if all we have is a fair coin? In other words, a process where, by flipping a coin enough times, we uniformly generate a random integer between 1 and 6. Donald Knuth studied this problem and came up with the following solution:
So HHT is equivalent to rolling 1, HTH is 2, etc.
Now there is an interesting problem with this solution: it's unbounded. Hypothetically you could keep flipping heads and never actually complete the algorithm. I had a friend who was working on modeling probabilistic algorithms and he wanted to optimize this away. He wanted an improved algorithm that was guaranteed to halt in a finite number of steps.
I had a bit more of a math background than him, so he asked me if there were any mathematical issues here. After thinking about it, I was able to prove it was impossible:
- Consider all of the flips as forming a tree. Then for N flips there will be 2^N sequences.
- To get equivalent probabilities we need to divide the set of sequences into equal-sized disjoint subsets.
This would require that there is some N
such that 6 divides 2^N (written
6 | 2^N
). - Since
6 = 2*3
, for6 | 2^N
we need3 | 2^N
. - By fundamental theorem of arithmetic 2^N will always have the same prime factors as 2. 2^N will never be threeven.
- So there is no way to divide up the sequences evenly, no matter how many flips you have.
Again, not a particularly complex proof. But the trick isn't writing the proof, it's realizing that the problem of "can we find a finite algorithm" is representable by a construct in number theory. That zigzag, seeing how the real-world problem connects to mathematics, is what makes knowing math so useful.
Writing better software
How does this help us day-to-day? It probably won't do all that much while you're writing tests or doing code review or designing out the json schema. The benefit of knowing math comes up only occasionally, but the payoff is huge. We're writing a program to do something. That something is embedded in a larger real-world domain, a push-pull of problem and solution. There is likely a connection between that problem and a mathematical abstraction. If you can identify the corresponding abstraction, you can leverage what we mathematically know about those abstractions to radically simplify your problem. If you're lucky, you can even reduce it to an already solved problem.
Imagine you made a competitive game like, I dunno, Splatoon, and want to match players of equal skill against each other. You break them into ranks and come up with a bunch of rules about how people go up and down ranks based on games they win or lose. Does your system place players with similar win/loss ratios against each other? Here's three ways you could figure this out:
- Guess, get it wrong.
- Come up with your own bespoke metric for evaluating fairness and hope it's not too laborious to use, get it wrong again.
- Realize that you can represent people's change in score as rows in a stochastic matrix. Read the wikipedia article for stochastic matrices. Learn that eigenvectors of the matrix represent steady states. Google
numpy eigenvector
. Done.
Did you "use" math in the way that mathematicians or physicists use it? Arguably no. Did knowing math help you avoid rolling two buggy systems out to millions of players? I'd say so.
The book I'm using to learn graph theory, the Wilson book, is really good at emphasizing this. Every chapter ends with a selection of real world problems that don't look like they have anything to do with graphs, but are elegantly modeled with them. One example is Hall's theorem: if we have a given set of talks and a given set of slots in the schedule, where each talk can only go in a subset of the slots, can we successfully pair up the talks and slots? This problem can be represented as a bipartite graph, which gives us necessary-and-sufficient conditions for there being a solution. That's really cool!
The Kinds of Math We Need
I want to emphasize the difference between "math as knowledge" and "math as a skill". There's the pure math skill of being able to prove that the eigenvector of a stochastic matrix represents a steady state. There's the applied math skill of being able to find the eigenvector by hand, or transform the matrix to make it more amenable to computation. For us, those skills aren't nearly as important. It's better to know more kinds of things that can be used as abstractions, properties of those abstractions, and what transformations are possible than it is to actually prove those properties or make those transformations.
(There's still a skill here: being able to faithfully translate between the world and the abstraction. Recognizing a problem is represented by a stochastic matrix is useless if you can't actually write the corresponding stochastic matrix.)
This means that different fields of math are useful to us than are useful to mathematicians and physicists. We probably don't need to know much proof theory or real analysis. We probably won't need to know techniques to find partial differential equations or use conformal maps.2 There are also a lot of fields in math that aren't as universally-used in mathematics but have a lot of connections to reality, and so are useful at representing a lot of problems:
- Combinatorics
- Probability
- LINEAR ALGEBRA
- Graph theory
- Statistics
And plus a ton of other smaller things, like automata, queueing theory, iterated games... a lot of these are incredibly powerful, but usually neglected when people talk about "using math" in CS. I think that's because we're basing "using math" on how math and math-adjacent fields use it, as opposed to what would be most helpful for us.
Bonus: De Morgan's Paradoxes
Wanna leave things on a lighter note, so here's a book I've been tearing through: A Budget of Paradoxes, by Augustus De Morgan. It's a fun romp through the history of heterodox opinions in the sciences, with special attention paid to crackpots with heterodox opinions. De Morgan is Not Having It with the crackpots and his weary frustration is delightful.
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.