Halfspacehttps://buttondown.com/j2kunI write technical articles about math and programming at jeremykun.com.
This newsletter will contain more personal reflections, ideas I'm chewing on, and thoughts on aspects of mathematics and programming I'm exploring. It's the "half" of my space of thoughts that precedes and informs my work and life.
I hope it will also be a chance to connect more with readers, in a healthier way than social media permits.en-usFri, 16 Aug 2024 14:00:00 +0000Decision Logshttps://buttondown.com/j2kun/archive/decision-logs/<p>I haven’t been writing much here recently, mostly because (a) I have a new infant at home, and (b) I’ve been spending much of my free time migrating my blog to Hugo (<a href="https://www.jeremykun.com/" target="_blank">which is now live</a>), setting up POSSE-style content syndication on social media (<a href="https://www.jeremykun.com/shortform/2024-08-07-1414/" target="_blank">see here</a>), and pushing a handful of technical blog articles past the finish line, including:</p>
<ul>
<li><a href="https://www.jeremykun.com/2024/05/04/fhe-overview/" target="_blank">An FHE overview</a></li>
<li><a href="https://www.jeremykun.com/fhe-in-production/" target="_blank">A catalog of FHE uses in production</a></li>
<li><a href="https://www.jeremykun.com/2024/08/04/mlir-pdll/" target="_blank">A tutorial on using PDLL in MLIR</a></li>
</ul>
<p>So the softer material that I like to write about here hasn't been flowing much.</p>
<p>That said, Justin Duke, who runs Buttondown which powers this newsletter, <a href="https://bsky.app/profile/jmduke.com/post/3kzrxokdkil2t" target="_blank">recently wrote on Bluesky</a> about the idea of a decision log: keep track of the decisions you make and why.</p>
<p>I independently had this idea when I worked in Google's supply chain world. I called it the "policy change log," since it was primarily there to keep track of requests for business logic changes that came from outside our org. Small stuff like, "make sure this type of RAM is preferred over this other type of RAM for these kinds of machines."</p>
<p>These requests often came over private chats or in small meetings between my team and one stakeholder. They were somewhat necessary, in the sense that the automated policy management systems we had in place were not expressive enough to support their requirements.</p>
<p>After a year or two of fielding these requests, I started to see that the people asking for the features were sometimes wrong about their motivation (they believed something that was not true, e.g., that one thing cost more than another), or they simply didn't have all the information and the person who did wasn't around to set them straight. It might not be discovered until months later that our automation was doing the wrong thing, and the policy would have to be reversed.</p>
<p>So I instituted an informal policy for our team, and made sure my engineers were on board: don't make random out-of-band business rule changes without a 1-page doc explaining the change and why it would be made. I.e., these are changes requested without the sorts of docs and prioritization that accompanied month+ long projects.</p>
<p>This was a relatively small amount of extra fuss. It took maybe one hour to write up the doc and get it approved by the person asking for it. Then we'd send out a short email to common stakeholders saying "by the way, we're going to do this if nobody tells us not to." And then I'd post a link to the doc with a 1-line summary in a markdown table on our team's internal website (the "policy change log").</p>
<p>This gave us a few superpowers relative to the added fuss. It gave us proof and receipts when the requested policies led to bad outcomes. Because these policies were hidden behind a complex automated system, leadership blamed our software for being unreliable. With the log, when someone complained, we could point to the doc and say, "Go talk to so and so who asked for and approved this quirky behavior." </p>
<p>While saying "we need to write a 1 page doc" was sometimes enough to make the person drop the request entirely, it also gave us a lot of ammo to say "no" to repeated requests—new management comes in and doesn't remember the failures of the past—or near-duplicate requests that repeat similar mistakes. It's extremely powerful to point to two docs, written a year apart, the first of which has a poorly-justified policy change, and the second of which links to a postmortem or list of bugs, and say, "your proposal smells like this, figure out a way to avoid it, or else you better quantify how much money it's going to save us." While they did that, we had more time to do what we saw as our more important work.</p>
<p>The policy change log also helped PMs see what the hell was going on with our software. It shone light on a chaotic organizational system, in which our team's software was at the center of many competing priorities. In an internal talk, I used an image of a sack of money in the center of a 4-way tug of war, with each department labeled at the end of each rope. They all claimed their changes were necessary to save Google money, but it turned out most of them were mainly optimizing for their own metrics, and sometimes doing that gave net negative savings. This is not a new revelation by any means, but having the policy change log gave me sufficient evidence to justify broader projects that tried to navigate the competing incentives. Perhaps I can write about the engineering work I did on that in a future newsletter.</p>
<p>I had about 25 entries in the log—most of which I wrote myself—before leaving the org to work on cryptography compilers. I checked today, and it seems my team did not continue adding to the policy change log after I left. As I try to get my fussy toddler to say whenever something doesn't go his way, <em>c'est la vie</em>.</p>Fri, 16 Aug 2024 14:00:00 +0000https://buttondown.com/j2kun/archive/decision-logs/Technology as Power Transferhttps://buttondown.com/j2kun/archive/technology-as-power-transfer/<p>Though I've forsaken cryptocurrency and blockchain
in my personal and professional life,
I currently work in homomorphic encryption.
Both areas have fertile soil for new ideas in cryptography,
and so the blockchain world remains
in my periphery.
As much as I avoid the topic, I can't avoid people in my field who work on it.</p>
<p>A mild silver lining of that reality
is that <em>most</em> of the people I interact with who work in blockchain
appear to be in it for the math,
and put up with the blockchain parts for funding.
To be clear, that crosses an ethical line for me personally—and I've
been thinking a lot about how to express my thoughts on this
in a way that is both meaningfully persuasive and intellectually honest
in the essay that will close
<a href="https://pmfpbook.org" target="_blank"><em>Practical Math for Programmers</em></a>.
But shaming them about not meeting my ethical standards
also doesn't feel productive.</p>
<p>And occasionally, I find people who have a similar persuasion to me,
and in an appropriately private setting
we sometimes discuss how one might frame the problems of Bitcoin and blockchain
to the non-cryptographers in their lives who have bought into the hype—and whom recent events have not convinced otherwise.</p>
<p>There are some basic empirical contradictions.
Bitcoin can't replace cash as a medium of exchange
because there are so few legal things you can actually
do with it besides speculate on its price.
It's a poor store of value because of its wild price fluctuations.
It's so expensive and slow to confirm transactions that
actually processing a decent fraction of daily purchasing volume
on the blockchain would grind consumer finance to a halt.
Bitcoin mining rapidly centralized, defeating its purpose
and threatening its security model.
Bitcoin has also turned out to be highly non-private,
rife with scams, politically slow to adapt,
and a massive energy hog.
But proponents counter (at least with consistency)
that a future world in which decentralized currency succeeds would somehow magically
fix all these problems,
and shouldn't we all be dreaming of a better system than the one we currently have?</p>
<p>I don't think anything besides going bankrupt
will convince a "number go up" cryptocurrency enthusiast to act differently,
and so technical or observational arguments above will be ignored
as long as the roller coaster continues to provide thrills.
But an appeal to values
might persuade those who understand the high-level criticisms,
but still think cryptocurrency will make the world a better place.
These people mistrust governments,
especially the United States government,
and for good reason.
They would prefer a world in which individuals
had more control over their own destinies,
whether that means freedom of movement,
freedom to seek economic prosperity,
freedom from rent-seeking,
"trust-free" transactions, etc.
Often this boils down to specific ways they feel they've been wronged
by particular laws and regulations—some more extreme than others—or else
it is driven by a fuzzier sort of attraction
to an idealized proto-American individualist lifestyle
in which people can invent and innovate unencumbered.
(Pro tip: a history lesson here does not seem to help.)</p>
<p>To these people I would say:
if it succeeded, cryptocurrency would represent a transfer of power.
Specifically, it would transfer power <em>to technologically savvy people</em>
and away from everyone else.
So if you dream of a world with cryptocurrency replacing fiat currency,
you need to ask yourself whether you have the technical skills required
to understand and use cryptocurrency safely,
and the time required to exercise those skills in your day to day life.</p>
<p>Let's make it more concrete.
In traditional consumer finance today,
you spend some basic amount of time
trying to determine if some sunglasses you're buying online
are counterfeit or a scam (thanks, Amazon).
But this is mostly to avoid the hassle.
You can get a refund but it takes effort to do so,
and then you still have to find another pair of sunglasses to buy.
You don't particularly worry
that by buying these sunglasses,
attackers will drain your bank account,
because the banks provide protections
and credit cards make chargebacks relatively easy for consumers.</p>
<p>In a world where everything happens on the blockchain,
where the wild west is reality,
all of the work of vetting the source
falls on each individual consumer.
In the worst case, you have to read the code
of a smart contract, and hope that someone else
doesn't find a security flaw you didn't spot.
If you have the time and ability to do that,
then you're probably going to be safe.
You won't lose your wallet's private keys,
you won't fall for scams,
and you'll reap the benefit of a regulation-free economy.
Everyone else will just have to shift their trust
to some other person or institution that has these skills.</p>
<p>My father, for example,
asked me in 2018 or so
to buy him some Ethereum.
I am well aware of the joy he takes in stock trading
and his "number go up" mentality,
so I said no, not if he's going to get into the habit
of short-term price speculation.
Then he tried to convince me of two things.
First, that he really, truly believed in the technical merit
of the platform and its long-term prospects for society.
He was apparently an avid reader of Ethereum news
(god only knows what blogs he was reading).
Second, he promised me that he would not touch the money for 15 years.
I didn't believe him,
but thankfully I have enough technical knowledge,
so I said, "only if we do it my way."
My way was to buy the ETH on Coinbase,
send it to a wallet not managed by Coinbase,
write the wallet's secret key on a piece of paper,
delete the key from my computer,
and put it in my parents' lock box in a small-town bank branch
that happened to be far away from where they lived at the time.
It was effectively impossible for my father to access this money,
because, while he physically had access to the secret key,
he would have never been able to restore the wallet
without me or someone like me who knew how to get it back into Coinbase. This would force him to hold to his "15 year" promise. <sup id="fnref:scared"><a class="footnote-ref" href="#fn:scared">1</a></sup></p>
<p>Even if I hadn't gone to such extreme measures
to try to keep him honest—and to try to keep him from
margin trading or getting scammed—it was clear
that he <em>needed</em> me to achieve his goals in this new world.
My ability to make the transactions safely for him,
and even my hypothetical willingness
to teach him how to do it on his own,
was an embodiment of that power transfer.</p>
<p>The story ended much earlier than 15 years.
About a year later,
my father convinced me to take the ETH out of cold storage
so he could cash out completely.
His reason was stated as his impatience with the rate of progress
of the promised features of Ethereum,
and a loss of confidence in the project.
In a way I didn't understand, that was somehow tied to
the price of ETH, which I guess he expected to rise
with every new feature.
I relented, I cashed him out, and then a few things happened.
First, he made a small profit.
Second, he was irate with the fees Coinbase charged
to take USD off the platform.
He called it a "scam",
but I'm just happy that was the worst of what he experienced.
Third, their taxes were a huge headache that year.
And finally, the price of Ethereum grew to be much higher.</p>
<p>The "number go up" folks will say, "See?" but this isn't for them.
That post-cashout experience underscores my point:
he thought he was well-versed in the technical merit of Ethereum,
and how those technical details related to its price—or at least
its worthiness of investment.
But he was wrong.
Based on some small clues he inadvertently let slip,
I suspect what happened was that he was swayed
by other new cryptocurrency projects he was reading about
that were inspired by Ethereum and able to move faster,
and he was planning to pivot to reinvest his proceeds in Monero or whatever,
but then the huge exchange fees were so distasteful
that he felt standard stock trading would cut less into his profits.<sup id="fnref:tesla"><a class="footnote-ref" href="#fn:tesla">2</a></sup>
So he just happened to be wrong in enough ways to cancel out
and at least get his money back, if not his time, with a small, lucky profit.</p>
<p>For all that my dad lacks in cryptocurrency knowledge,
he is still quite smart and technically minded.
Over his career he designed and built planes, factories,
and medical devices used in optometrists' offices and veterinary sciences.
I'm sure he has the skills required to learn how to use cryptocurrency safely
if he had the time necessary to understand the technical aspects.
But he doesn't.
And what does that say for the rest of the population?
Working mothers, teenagers, bus drivers,
and all the people who don't care about the technology
and just want to buy their pizza or watch a movie.
If they use digital currency, they will simply
be transferring their trust
from a government-backed bank
to a service provider with shakier safety measures.
And who will be running those service-providers?
We nerds who actually understand the technical parts.
Maybe some people out there still think
our current tech moguls like Mark Zuckerberg
and Jeff Bezos are blessed innovators
and the best hope for society's progress
is to give them more power over our personal lives.
Maybe they'd be happy to trust them
to control all of society's money,
while only the nerds can opt out to manage their own wallets.
But for most people,
framing cryptocurrency as a choice
between letting well-regulated, government-insured banks manage our money
and letting the next Jeff Bezos do what he wants with it,
should make the best choice obvious.</p>
<p>I have more to say about how this framing—technology
as power transfer—relates to other tech trends like LLMs,
but I could not fit my thoughts in the margins.
Next time.</p>
<p><strong>In case you missed it</strong>: <a href="https://aprilcools.club" target="_blank">April Cools</a> was last week and a handful of us bloggers wrote earnestly about something we don't normally write about. Mine was <a href="https://www.jeremykun.com/2024/04/01/unusual-tips-for-parenting-toddlers/" target="_blank">parenting</a>. Maybe join us next year?</p>
<div class="footnote">
<hr />
<ol>
<li id="fn:scared">
<p>In fact, he was so nervous about actually committing to buying the ETH that he made me click the "buy" button on Coinbase for him. <a class="footnote-backref" href="#fnref:scared" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:tesla">
<p>Since then he has made a small fortune investing in Tesla, and fawns over Elon Musk far more often than I would like. <a class="footnote-backref" href="#fnref:tesla" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
</ol>
</div>Tue, 09 Apr 2024 13:00:00 +0000https://buttondown.com/j2kun/archive/technology-as-power-transfer/Programming Jigshttps://buttondown.com/j2kun/archive/programming-jigs/<p>In woodworking there's the concept of a <a href="https://en.wikipedia.org/wiki/Jig_(tool)" target="_blank"><em>jig</em></a>, which is a sort of ad hoc stencil for woodworking projects. You might fashion a jig by screwing two wood blocks together in such a way that one protrudes a specific distance from the first, so that you can use them as backstops to hold two workpieces against each other at specific distances while you fasten them together. Or you might make a jig with precisely measured holes, and then use that to quickly drill holes in the right locations on the boards that you'll use for your workpiece.</p>
<p><img alt="an example of a jig" class="newsletter-image" src="https://assets.buttondown.email/images/6691ec3b-4b23-428b-acf6-c5890a2e655b.png?w=960&fit=max" /> </p>
<p>Jigs are often used when making multiple pieces of the same design, but they are also useful when working with a project that has symmetry (e.g., a box with four sides that need identical joinery). While Googling "woodworking jig" will pull up an array of indie woodworkers trying to sell you products designed to replace custom-built jigs, there's a spirit among the woodworking community of quickly throwing together your own jig from scrap wood in your shop.</p>
<p>Fashioning a jig is akin to a snippet system in a text editor, a templating engine like jinja2, or a custom script to generate boilerplate code for a project. I thought about this relationship most recently because, in building a cryptographic compiler based on the MLIR compiler toolchain, I found most people who contribute are not going to want to spend the time learning the minute details required to properly set up the boilerplate for a new dialect or compiler pass. So I wrote <a href="https://github.com/google/heir/tree/main/templates" target="_blank">a set of scripts</a> that mostly sets it up for them.</p>
<p>I've done this sort of thing a lot in my career: configuring my snippets to generate Java <a href="https://github.com/google/auto/blob/main/value/userguide/index.md" target="_blank">AutoValues</a> fast, setting up a hygienic vim macro or regular expression to progressively apply across dozens of files (which I then save in a README or PR/commit description for later reference), and writing ad hoc scripts to generate boilerplate code. I find it curious that most programmers I have worked with don't seem to do the same thing, whereas <em>every</em> woodworker I have ever talked to will quickly suggest making a jig for any project that requires repeating a task mildly precisely across more than one workpiece.</p>
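<p>As a minimal sketch of what such a jig can look like (the MLIR-flavored names below are hypothetical illustrations, not the actual templates from the linked repo), here's a throwaway Python script that stamps out pass boilerplate using only the standard library's <code>string.Template</code>:</p>

```python
# A minimal "jig": a throwaway script that stamps out boilerplate from a
# template. The MLIR-ish pass skeleton below is a made-up illustration.
from string import Template

PASS_TEMPLATE = Template("""\
struct ${name}Pass : public PassWrapper<${name}Pass, OperationPass<>> {
  StringRef getArgument() const final { return "${flag}"; }
  StringRef getDescription() const final { return "${desc}"; }
  void runOnOperation() override;
};
""")

def make_pass_boilerplate(name: str, flag: str, desc: str) -> str:
    """Fill in the template for one new compiler pass."""
    return PASS_TEMPLATE.substitute(name=name, flag=flag, desc=desc)

print(make_pass_boilerplate("Canonicalize", "canonicalize", "Run canonicalization"))
```

<p>Like a scrap-wood jig, it's disposable: you'd tweak the template for each project rather than engineer a general-purpose generator.</p>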
<p>Instead what most programmers lean toward is re-engineering the entire system so as to avoid the need to write boilerplate entirely. And while, yes, this makes sense when you squint, in my experience this causes an explosion of complexity and a long time horizon to achieve the desired result. Some projects go so far as to build a new programming language, or at least propose changes to the language specification, just to make their boilerplate avoidance template metaprogramming wizardry possible. I heard at one point that the Django web framework did this to some degree to Python, since it relies heavily on advanced Python magic to work, but I can't find the reference right now.</p>
<p>And I don't think that boilerplate is inherently evil, at least insofar as it makes certain useful implicit or unobvious semantics explicit. For example, during my first serious attempt to write Go code, I was appalled to learn that they use capitalization to make functions and variables publicly visible from outside a module. Presumably because the authors just didn't want to type "public" or "export." In my opinion it hinders a system's coherence to rely so heavily on convention merely to avoid a few keystrokes.</p>
<p>But more to the point, what I love about my "programming jigs" is how I can assemble them from whatever software tools I have lying around my terminal; sed, awk, jinja2, the python standard library, UltiSnippets, vim macros, etc. Even if the jig is only useful for one day or one PR, the more jigs I make, the quicker I get at identifying what sort of jig will solve my immediate task.</p>Sun, 10 Mar 2024 16:00:00 +0000https://buttondown.com/j2kun/archive/programming-jigs/Weak and Strong Algebraic Structureshttps://buttondown.com/j2kun/archive/weak-and-strong-algebraic-structures/<h1>Weak and Strong Algebraic Structures</h1>
<p>Hillel Wayne's <a href="https://buttondown.email/hillelwayne/archive/why-all-is-true-prod-is-1-etc/" target="_blank">recent newsletter</a> on monoids got me thinking again about an old question of mine: why are some algebraic structures more useful than others in software applications?</p>
<p>A bit counter-intuitively, a monoid is one of my examples of an algebraic structure that is <em>not</em> all that useful. A monoid is one of the simplest examples of an algebraic structure that programmers have a reason to talk about. In the context of implementing <code>fold</code>, aka <code>reduce</code>, in functional programming languages, if you recognize your input type as a monoid, then you can use a pre-existing <code>reduce</code> operator, and be confident it will work. One less <code>for</code> loop to write.</p>
<p>Sadly, for monoids the story seems to end there. And this is where my curiosity comes in. If, by contrast, you recognized your data forms a vector space with a dot product (or you model it as one), then you open the door to a wealth of algorithms, data analysis techniques, and interpretations that you can bring to bear on a problem. You can compute various bases to represent your data efficiently, make sparse decompositions, project onto subspaces, compute distances, and study eigenvalues and eigenvectors.</p>
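<p>For instance, here's a tiny pure-Python sketch of one of those superpowers: projecting a vector onto a subspace spanned by an orthonormal basis, an operation you can't even state with monoid structure alone.</p>

```python
# Sketch of one vector-space "superpower": orthogonal projection onto
# a subspace, via dot products with an orthonormal basis.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scale(c, v):
    return [c * x for x in v]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def project(v, orthonormal_basis):
    """Project v onto span(orthonormal_basis)."""
    result = [0.0] * len(v)
    for e in orthonormal_basis:
        # Each basis vector contributes its component of v.
        result = add(result, scale(dot(v, e), e))
    return result

# Project (3, 4, 5) onto the xy-plane, spanned by e1 and e2.
print(project([3, 4, 5], [[1, 0, 0], [0, 1, 0]]))  # [3.0, 4.0, 0.0]
```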
<p>Where are the useful algorithms for monoids? Where are the structure theorems? Surely there are some; an entire subfield of mathematics is devoted to monoid theory. But I haven't seen anything resembling a novel theorem about monoids used in a practical setting (please tell me if you know of one!).<sup id="fnref:trace"><a class="footnote-ref" href="#fn:trace">1</a></sup></p>
<p>My opinion on this has two parts. First, very few things are <em>just</em> monoids. Second, a monoid is just too weak of a structure to do anything really novel with.</p>
<p>For the first, consider the <a href="https://en.wikipedia.org/wiki/Monoid#Examples" target="_blank">list of examples of monoids on Wikipedia</a>. Out of these 19, there are only two examples that I'd consider "proper" monoids in that they have monoid structure but nothing more: the set of all finite strings over a given alphabet, and the set of all functions from a set to itself. The set of strings is a "free" monoid, meaning it is the unique monoid with the least possible extra structure.</p>
<p>All the others are either non-examples (constructing a new monoid from an existing monoid) or describe sets that have more structure. E.g., </p>
<ul>
<li>Every group/ring/field is a monoid</li>
<li>The subsets of a set form a monoid, but they have the stronger structure of a lattice</li>
<li>Functions into a structured set form a monoid, but usually carry more structure when the target of the function does (e.g., the set of functions <code>X -> R</code> is a ring when <code>R</code> is a ring).</li>
<li>The natural numbers (which form a monoid with zero as the additive identity) are usually better studied as integers, or modular integers, both of which form a ring. </li>
<li>Sets with set union/intersection/complements are more structured than a monoid: they form a structure called a boolean algebra.</li>
</ul>
<p>While they are monoids, most of these objects are not interesting for their monoid structure alone. The useful things about modular integers come from their group/ring/field-theoretic aspects, not their monoid structure.</p>
<p>For the second point, that a monoid is too weak, let me explain a bit about what I mean by "weak" and "strong."<sup id="fnref:weak"><a class="footnote-ref" href="#fn:weak">2</a></sup> A "strong" structure admits constructive characterization theorems, and efficient algorithms to convert objects into various canonical forms. For example:</p>
<ul>
<li>Linear algebra: Gaussian elimination, the Jordan canonical form, QR decompositions, etc., which convert a matrix into a form that allows useful information to be trivially read from the canonical form (e.g., invertibility, dimension, eigenvalues/eigenvectors, explicit dependences between rows). </li>
<li>Graph theory: search algorithms, component decomposition, tree balancing, subgraph isomorphism (commonly used in computational chemistry; I'm working on a chapter of <a href="https://pmfpbook.org" target="_blank">Practical Math</a> on this very topic). </li>
<li>Ring theory: residue number systems, the Euclidean algorithm, etc. </li>
<li>Groups: the classification theorem of finite Abelian groups, <a href="https://www.google.com/books/edition/Permutation_Groups/1SPjBwAAQBAJ?hl=en" target="_blank">algorithms on permutation groups</a>.</li>
</ul>
<p>These types of structures are useful because they give you new constructive things to latch an application on to. A graph, for example, has local structure in terms of node degree and subgraphs, along with global structure in terms of its component structure, degree distribution, and eigenvalue spectrum. Fields/rings let you factor things, compute inverses and GCDs, etc.</p>
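<p>As one concrete instance of that ring-theoretic toolkit, the extended Euclidean algorithm computes a GCD along with the coefficients that witness it, which immediately yields modular inverses. A pure-Python sketch:</p>

```python
# The extended Euclidean algorithm: the kind of constructive tool a
# ring hands you that a bare monoid cannot. It returns g = gcd(a, b)
# together with x, y such that a*x + b*y = g.
def extended_gcd(a, b):
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t  # gcd, x, y

def mod_inverse(a, m):
    """Compute a^{-1} mod m, when gcd(a, m) = 1."""
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError(f"{a} has no inverse mod {m}")
    return x % m

print(mod_inverse(7, 26))  # 15, since 7 * 15 = 105 = 4*26 + 1
```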
<p>But what can you do with a monoid? Saying "you can <code>reduce</code> it!" is saying little more than, "it is a monoid." When you reduce it, you don't actually use the monoid for anything more than its axiomatic definition, i.e., its API. This is what I mean by "weak."</p>
<p>Another good example of a weak algebraic structure is a semilattice. <a href="https://clang.llvm.org/docs/DataFlowAnalysisIntro.html" target="_blank">Dataflow analysis in a compiler</a> represents the data flowing through a program as a join-semilattice, where the join operation represents how to combine two static estimates of a variable's value at a point where two branches of control flow join together. E.g., if you're estimating the possible range of an integer, you'd take a union of the two estimated ranges. It's a really neat idea, but if you look at the actual <a href="https://en.wikipedia.org/wiki/Data-flow_analysis#The_work_list_approach" target="_blank">implementations</a> of data flow analysis algorithms, they don't analyze the lattice in any meaningful way; they process a worklist of program points that need updating and join lattice elements greedily. Like <code>reduce</code> for monoids, the analysis uses the algebraic "structure" not for any true structure, but for its API and local guarantees about it.</p>
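<p>A toy version of that worklist loop (a simplified sketch, not any real compiler's implementation) uses "set of possible values" as the join-semilattice, with union as join. Note where the actual work happens: in traversing the predecessor graph, not in analyzing the lattice.</p>

```python
# Toy worklist-style dataflow analysis. The lattice is frozensets of
# possible values with union as join; the structure being exploited is
# the control-flow graph (predecessor edges), not the lattice itself.
def analyze(preds, initial):
    """preds: block -> list of predecessor blocks.
    initial: block -> initial fact (a frozenset of possible values)."""
    facts = dict(initial)
    worklist = list(preds)
    while worklist:
        block = worklist.pop()
        joined = facts[block]
        for p in preds[block]:
            joined = joined | facts[p]  # the semilattice join
        if joined != facts[block]:
            facts[block] = joined
            # Re-enqueue every block that has `block` as a predecessor.
            worklist.extend(b for b in preds if block in preds[b])
    return facts

# Two branches out of "entry" re-join at "merge".
preds = {"entry": [], "a": ["entry"], "b": ["entry"], "merge": ["a", "b"]}
initial = {"entry": frozenset(), "a": frozenset({1}),
           "b": frozenset({2}), "merge": frozenset()}
print(analyze(preds, initial)["merge"])  # frozenset({1, 2})
```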
<p>And this makes sense because it's not the lattice structure of the <em>type</em> that admits useful optimizations. It's the structure of the <em>program</em> being analyzed, like its particular combinations of control flow and branching, that matter. Similarly, a generic algorithm that processes lists isn't going to use any nontrivial monoid structure of its input types, but rather the special structure of the list instance given to it, such as known-to-be-sorted, or known-to-be-uniqued, etc. That's not to say that having a generic <code>reduce</code> isn't <em>nice</em>; it is! It's just nice in its cleanliness, rather than allowing you to do something you couldn't do before.</p>
<p>Even <em>groups</em>, which I implied above are "strong," are really too weak to be useful in the same sense as linear algebra and rings/fields. Groups are mostly<sup id="fnref:groups"><a class="footnote-ref" href="#fn:groups">3</a></sup> useful in computer science <em>because</em> they're weak enough that you can't easily crack cryptography based on them, but strong enough that you can define efficient encryption/decryption algorithms for them. If you set up a cryptosystem using the wrong group or the wrong public parameters, then the group has too much structure, becomes too easy to analyze, and the underlying security problem can be cracked.</p>
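<p>The textbook illustration of that balance is Diffie-Hellman key exchange in the multiplicative group mod a prime: exponentiation in the group is fast, but recovering the exponent (the discrete log) is believed hard in a well-chosen group. A sketch with tiny, hopelessly insecure parameters, just to show the group operations at work:</p>

```python
# Toy Diffie-Hellman in the group (Z/pZ)*. The parameters here are
# illustration-sized; real deployments need carefully chosen groups
# precisely because a badly structured group becomes easy to analyze.
import secrets

p, g = 23, 5  # textbook demo values, NOT secure

alice_secret = secrets.randbelow(p - 2) + 1
bob_secret = secrets.randbelow(p - 2) + 1

# Each party publishes g raised to their secret (fast: modular exponentiation).
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# Each combines the other's public value with their own secret.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

# Both arrive at g^(alice_secret * bob_secret) mod p.
assert alice_shared == bob_shared
print(alice_shared)
```

<p>An eavesdropper sees <code>g</code>, <code>p</code>, and both public values; turning those back into a secret exponent is the discrete log problem the group's "weakness" protects.</p>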
<p>Groups of numbers (or symmetries, or points on an elliptic curve) also have this nice property that they are compact. In a monoid like a string or a list, when you combine two elements <code>z = xy</code>, you explicitly preserve the underlying elements <code>x</code> and <code>y</code> (though the dividing line is lost). This isn't forced by the definition of a monoid, but I find it curious that the "nontrivial" examples of monoids-but-not-more are not compact in the same way as rings of numbers. Maybe monoids are weak because common monoids (that aren't spiritually more structured) are necessarily explicit, and hence the structure is already present in any representation as data, so there's no "new" structure to discover by an analogous kind of "change of basis" that you would find in linear algebra.</p>
<p>Other "weak" algebraic structures that I've never seen novel uses of: <a href="https://en.wikipedia.org/wiki/Magma_(algebra)" target="_blank">magma</a>, <a href="https://en.wikipedia.org/wiki/Groupoid" target="_blank">groupoid</a>, <a href="https://en.wikipedia.org/wiki/Racks_and_quandles" target="_blank">quandle</a> (a sort of "group" that only has conjugation), and basically every structure listed in <a href="https://en.wikipedia.org/wiki/Quasigroup#/media/File:Magma_to_group4.svg" target="_blank">this diagram</a> before "group."</p>
<div class="footnote">
<hr />
<ol>
<li id="fn:trace">
<p>I did read about <a href="https://en.wikipedia.org/wiki/Trace_monoid" target="_blank">"trace monoids"</a> recently, which are apparently equivalent to dependency graphs. Wikipedia states: "Traces are used in theories of concurrent computation, where commuting letters stand for portions of a job that can execute independently of one another, while non-commuting letters stand for locks, synchronization points or thread joins....The utility of trace monoids comes from the fact that they are isomorphic to the monoid of dependency graphs; thus allowing algebraic techniques to be applied to graphs, and vice versa." However, I can't find any examples of these monoid-algebraic techniques applied to graphs. Most references seem to be unavailable, or quite old, which makes me suspect that people who study concurrent systems don't think about them as monoids. <a class="footnote-backref" href="#fnref:trace" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:weak">
<p>I don't like the terms "strong" and "weak" but I can't think of any better ones right now. I don't mean weak as in "bad," I mean weak as in "less rigid in constraining how the thing behaves." But somehow "rigid" and "loose" doesn't feel right, nor "strict" and "lenient." <a class="footnote-backref" href="#fnref:weak" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:groups">
<p>I am always interested in more constructive uses of group theory for practical programmers, but I have found very little. For some good examples, see this article on <a href="https://jeremykun.com/2023/07/10/twos-complement-and-group-theory/" target="_blank">two's complement</a> and this article on <a href="https://jeremykun.com/2021/10/14/group-actions-and-hashing-unordered-multisets/" target="_blank">hashing unordered multisets</a>, though both of those reinforce my point in that they rely on the groups being commutative, which in group theory entails a <em>dramatically</em> stronger structure, with the characterization theorems and efficient algorithms to back it up. <a class="footnote-backref" href="#fnref:groups" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
</ol>
</div>Thu, 11 Jan 2024 19:14:25 +0000https://buttondown.com/j2kun/archive/weak-and-strong-algebraic-structures/NP-hard does not mean easyhttps://buttondown.com/j2kun/archive/np-hard-does-not-mean-easy/<p>Recently the internet resurfaced
my 2017 article, <a href="https://jeremykun.com/2017/12/29/np-hard-does-not-mean-hard/" target="_blank">"NP-hard does not mean hard"</a>.
I wrote the article mainly
to express the nuance that NP-hardness
only models the worst case of a problem,
not the average case under any particular distribution—i.e.,
the instances you happen to encounter in the real world.
More specifically, being NP-hard means
that a problem has sufficient <em>expressive power</em>
to model arbitrary boolean logic.
But you can't blame NP-hardness
for why you're bad at Super Mario.</p>
<p>One commenter made a remark
that I've seen repeated elsewhere
in various forms, that NP-hard problems
like SAT and traveling salesman
are "basically solved,"
due to the quality of modern SAT-solving techniques like
<a href="https://en.wikipedia.org/wiki/Conflict-driven_clause_learning" target="_blank">CDCL</a>,
modern integer linear programming (ILP) solvers like Gurobi,
and, I guess, decent heuristics for traveling salesman problems (I find the focus on TSP strange because I haven't yet heard of anyone who really cares about solving TSP in practice, compared to packing and scheduling problems which are everywhere).</p>
<p>Based on my experience, both directly at Google
and indirectly through research and interviews
I've conducted for my next book,
<em><a href="http://pmfpbook.org/" target="_blank">Practical Math for Programmers</a></em>,
this is far from the case.
Though all these tools that solve NP-hard problems are good,
everyone who actually uses them
is acutely aware of the underlying difficulty;
we can hardly call the problems themselves "solved."</p>
<p>I used ILP solvers at Google to solve massive
optimization problems.
Or rather, we <em>tried</em> to make them massive,
packing as much as we could into a single model.
But we quickly hit the limits
of what the world's best solvers could handle
in a decent time frame (say, 12 hours).</p>
<p>A big part of the problem with my work at Google
included numerical stability and small numbers.
As you might know, an ILP solver is minimizing or maximizing a linear objective—in
my case cost to Google—subject to a set of linear constraints.
My constraints collectively expressed
many of the requirements of building and managing computers in datacenters.
Originally we hoped to build one massive ILP to solve it all,
but it just wasn't realistic.
For one, the decisions that needed to be made
ranged in scale from the cost of individual memory sticks
to the global power usage of all of Google's datacenters.
These numbers differ by many, many orders of magnitude.
And internally, an ILP solver decides where to search next
by inspecting various ratios of constants
from the problem definition.
When these ratios get smaller than <code>10^(-16)</code>,
floating point inaccuracy creeps in and makes the solver
flounder by exploring the wrong parts of the search tree.</p>
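<p>A toy illustration of that floating-point cliff (mine; real solvers are more sophisticated, but the underlying arithmetic is the same): IEEE-754 double precision has a machine epsilon of about <code>2.2 * 10^(-16)</code>, so terms whose ratio to the rest of an expression falls below that simply vanish.</p>

```python
import sys

# Machine epsilon for IEEE-754 doubles, roughly 2.22e-16.
eps = sys.float_info.epsilon

# A term whose ratio to a magnitude-one value is below epsilon
# is silently rounded away:
assert 1.0 + 1e-16 == 1.0   # the small term is lost entirely
assert 1.0 + 1e-15 != 1.0   # just above epsilon, it survives
assert eps < 1e-15
```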
<p>Beyond that,
ILP solvers do not always return provably optimal solutions.
They instead ask the user to configure an optimality gap,
and the solver stops when its best solution
is provably within that gap of the optimum.
In particular, with a poor choice of units,
that optimality gap can hide all the "real" hardness in your problem.
Or worse, it can cause the solver to sacrifice
the quality of many small decisions,
the sum of which is small compared to the major decisions being made.
That seems like an acceptable trade-off,
until the team responsible for ensuring the small decisions are sensible
comes yelling that the solver is making stupid decisions, and they're right.</p>
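<p>A back-of-the-envelope sketch of how this happens (all numbers invented for illustration): when one datacenter-scale decision dominates the objective, a standard relative gap can exceed the combined value of every small decision.</p>

```python
# Hypothetical objective, in dollars.
big_decisions = 50_000_000   # e.g., datacenter-scale choices
small_decisions = 10_000     # the sum of all per-rack choices
total = big_decisions + small_decisions

# A typical relative optimality gap of 0.1%.
gap = 0.001
allowed_slack = gap * total

# The permitted slack is larger than everything the small decisions
# contribute, so the solver may stop while getting all of them wrong.
assert allowed_slack > small_decisions
```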
<p>Incrementally better solvers or a better model won't help you here.
It's the core NP-hardness that matters:
working out that remaining 0.01% optimality
is where the exponential growth comes in,
and proving a given solution is optimal requires
the solver to increase its lower bound on the optimal solution,
which in that last mile is effectively a brute force search.</p>
<p>The way to deal with this
is to decompose a massive problem
into subproblems.
Solve the models in order, or in parallel,
or in topological order of a dependency graph.
The decomposition naturally sacrifices global optimality
so the damn solver can finish in time.
In many cases this makes sense,
since there is a natural decomposition:
the large-scale details of a datacenter building can be planned
without much impact on any individual server rack.
But still, some decisions that must be made at scale and in aggregate
prevent more optimal choices from being made later.
So the challenge is to find a decomposition that is workable
for the business.
But it all starts from the failure to solve an NP-hard problem.</p>
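<p>The dependency-graph variant of that decomposition can be sketched with Python's standard library (the subproblem names and the solve step are hypothetical stand-ins):</p>

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical decomposition: each subproblem maps to the subproblems
# whose solutions it depends on.
dependencies = {
    "rack_layout": {"building_power", "network_topology"},
    "network_topology": {"building_power"},
    "building_power": set(),
}

def solve(name, upstream):
    # Stand-in for a real ILP solve; a real version would fix the
    # variables decided by the upstream solutions before optimizing.
    return f"solution({name})"

solutions = {}
for sub in TopologicalSorter(dependencies).static_order():
    solutions[sub] = solve(sub, {d: solutions[d] for d in dependencies[sub]})

# Each subproblem is solved only after everything it depends on.
assert list(solutions) == ["building_power", "network_topology", "rack_layout"]
```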
<p>Similar examples pop up all throughout applied solvers.
I'll give two brief ones that I expand on in <em>Practical Math</em>.
The first is in version selection for package managers.
The leading techniques are based on CDCL solvers—see PubGrub—but
they can't use existing solvers as a black box because
they are too slow. The Conda package manager made a big issue of its
performance problems <a href="https://www.anaconda.com/blog/understanding-and-improving-condas-performance" target="_blank">in this
article</a>,
and it boils down to "SAT solvers are slow."
Even more modern solvers like PubGrub
needed to reimplement CDCL from scratch
so that they could inject custom domain-aware heuristics
to make it performant enough for modestly-sized package ecosystems.</p>
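<p>To make the connection concrete, here is a toy encoding of version selection as boolean-style constraints, solved by brute force rather than CDCL (the packages and version pins are invented):</p>

```python
from itertools import product

# Each package has candidate versions; each (package, version) pins the
# versions of its dependencies it will accept.
versions = {"app": [1], "lib": [1, 2], "util": [1, 2]}
requires = {
    ("app", 1): {"lib": {2}},    # app v1 needs lib v2
    ("lib", 1): {"util": {1}},
    ("lib", 2): {"util": {2}},
    ("util", 1): {},
    ("util", 2): {},
}

def consistent(choice):
    # choice maps package -> chosen version; every active pin must hold.
    return all(
        choice[dep] in allowed
        for (pkg, v), deps in requires.items()
        if choice[pkg] == v
        for dep, allowed in deps.items()
    )

# Brute force over all assignments: this is the exponential search that
# CDCL solvers tame with clause learning and domain-aware heuristics.
names = list(versions)
solutions = [
    dict(zip(names, combo))
    for combo in product(*(versions[n] for n in names))
    if consistent(dict(zip(names, combo)))
]
assert solutions == [{"app": 1, "lib": 2, "util": 2}]
```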
<p>Another, which I have been studying more recently
after an excellent interview with <a href="https://lily-x.github.io/" target="_blank">Lily Xu</a>,
is in how ILP solvers were applied to wildlife conservation.
More specifically, wildlife sanctuaries around the world have to deal with poachers
and other things that threaten wildlife.
On the poaching side, the math problem is in how to schedule ranger patrols
in the park so as to best combat poaching.
The original techniques that Xu and her colleagues tried to use,
based on some prior work of Milind Tambe and his group on <a href="https://teamcore.seas.harvard.edu/publications/guards-innovative-application-game-theory-national-airport-security" target="_blank">patrol scheduling
at airports for counterterrorism</a>,
used ILPs to solve a game-theoretic
formulation of the scheduling problem.
But they found ILP solvers couldn't scale to the problem sizes they needed.
In fact, Tambe and his other students found the same scaling problems
even staying within the setting of patrol scheduling for counterterrorism.
Both Xu's and Tambe's work branches off to their novel approaches
from the starting obstacle of needing to solve impractically large ILPs.
And in both cases, they have some manner of decomposing the problem
into subproblems,
with a lot of extra work to analyze that the results still produce an optimal solution
(otherwise it's hard to publish, I'm sure).</p>
<p>All that is far from the hardest part of applying math
to wildlife conservation, but it supports my claim that you can't treat NP-hard problems as if they were solved, even in practice and even on "real world" instances.</p>Fri, 25 Aug 2023 14:00:00 +0000https://buttondown.com/j2kun/archive/np-hard-does-not-mean-easy/Thoughts about what worked in math circleshttps://buttondown.com/j2kun/archive/thoughts-about-what-worked-in-math-circles/<h1>Thoughts about what worked in math circles</h1>
<p>After about 7 months of math circles
with a group of 7- turning 8-year-old boys and girls,
I decided to take a break to breathe
and reflect on what worked and what didn't.</p>
<p>It's interesting how big a gulf there is
between what math topic you think will be interesting
to a 7-year-old
and what actually captures their attention.
Let me start by giving some examples
of things I <em>thought</em> would catch their interest
but flopped.</p>
<ul>
<li><a href="https://amzn.to/3DiBtPY" target="_blank">The game SET</a>.</li>
<li><a href="https://en.wikipedia.org/wiki/Fold-and-cut_theorem" target="_blank">Fold-and-cut puzzles</a>.</li>
<li><a href="https://amzn.to/3DeG9X5" target="_blank">Geometry snacks</a>.</li>
<li><a href="https://www.youtube.com/watch?v=wKV0GYvR2X8" target="_blank">Cutting a Mobius strip</a>.</li>
<li>Tessellations (<a href="https://momath.org/wp-content/uploads/2022/03/ChaimGoodman-Strauss2021RosenthalPrizeLessonPlan.pdf" target="_blank">Tooti Tooti</a>, pdf link).</li>
<li><a href="https://amzn.to/3Ko8m1F" target="_blank">Prime Climb</a>.</li>
<li>Making <a href="https://en.wikipedia.org/wiki/Flexagon" target="_blank">flexagons</a>.</li>
<li>Ruler and compass constructions.</li>
</ul>
<p>Things I didn't think they'd like but they loved:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Knights_and_Knaves" target="_blank">Knights and Knaves puzzles</a>,
and more generally any topic about propositional logic (blue-eyed islanders).</li>
<li>Manually scheduling a round-robin tournament's worth of sports games,
trying to minimize the latency of the entire tournament. (I originally phrased it as a soccer tournament, but that week the kid who loves soccer the most didn't show up, and so it turned into a DANCE competition, which was much more fun)</li>
<li>Seven bridges problems, and trying to find the smallest "impossible" bridge problem.</li>
<li>Trying to figure out who is better at penalty kicks based on counts of scores/misses.</li>
<li>Coming up with your own Pascal's triangle-type pattern.</li>
</ul>
<p>And then there were the problems
I thought they would love, and they did.</p>
<ul>
<li>The Function Machine game (guess a function given the ability to query it as a black-box)</li>
<li>Variations of <a href="https://en.wikipedia.org/wiki/Nim" target="_blank">Nim</a>.</li>
<li><a href="https://amzn.to/43odWrr" target="_blank">The Turing Tumble</a>.</li>
<li>Fair cake cutting (with real cookies and oddly-distributed toppings).</li>
<li>Game theory games like Prisoner's dilemma and Chicken.</li>
<li>Making mocktails for 3 people, given a recipe for 1 drink, and again with a recipe that serves 5. (fractions practice). Measurements were in quarters of ounces.</li>
</ul>
<p><img alt="cookies with oddly distributed toppings" class="newsletter-image" src="https://assets.buttondown.email/images/281bd56a-61a7-4e26-84c1-98dfd6638c8b.jpg?w=960&fit=max" />
<img alt="A poster with two recipes on it, the first making 1 drink and the second making 5" class="newsletter-image" src="https://assets.buttondown.email/images/1a6c1f86-5791-4c67-ad24-7aaf00f73fb2.jpg?w=960&fit=max" /> </p>
<p>Back in 2019 I wrote an article,
<a href="https://github.com/j2kun/essays/blob/main/attention-spans-for-math-and-stories.md" target="_blank">Attention spans for math and stories</a>
in which I described how I have used storytelling
with kids of various ages
(not in a math context)
to get them participating in activities and feeling welcome in a group.
I tied it back to math,</p>
<blockquote>
<p>To have good mathematical content revolving around stories,
mathematicians should learn to tell stories well.</p>
</blockquote>
<p>Somehow, though, my story-telling game
was off during some of my math circles.
I think through all my reading and learning about math circles,
I had internalized a different viewpoint.
That viewpoint, roughly speaking,
is that the math should speak for itself.
Much math circle literature
seems to suggest that a facilitator
should start with an open-ended mystery
(like, "can you draw a perfect shape?")
and then sit quiet until the participants
start asking questions and exploring on their own
(in this example, maybe discovering ruler and compass constructions).</p>
<p>I suspect this might work
with an older group of students,
or a group of students who have already bought into math
in some sense.
I know, for example, that many parents
who find math circles are desperately
searching for resources because their kid's
math ability is beyond their comprehension.
Some of these kids are believed to be on the autism spectrum as well.</p>
<p>The group I worked with were as typical
upper middle class kids as you could find.
They came to math circle after soccer practice.
When they got bored during the circle they
goofed off and roughhoused.
Though they had never played "Among Us,"
they constantly called things "sus."
More importantly,
they just didn't care about number patterns
unless it was couched in a more engaging format.
Geometry was a complete dud,
because it's even harder to come up with stories
about arbitrary figures with shaded regions you want to find the area of.
They didn't think tessellations were pretty.
And they didn't have the dexterity required
to make enough cuts and folds
to construct objects out of paper,
so they ended the fold-and-cut
and flexagon activities feeling frustrated.</p>
<p>But the idea that there are two people,
one of whom is a ROTTEN LIAR,
and you have to figure out who is the liar
based on the clues in what they say?
That's gold.
For that knights and knaves puzzle,
I wrote down the phrases Alice and Bob
said on pieces of paper,
taped them to plastic straws,
and held them up like signs.
The first puzzle was:</p>
<ul>
<li>Alice: Bob is a liar!</li>
<li>Bob: Neither of us are liars.</li>
</ul>
<p>The kids all jumped to say who they thought was the most "sus",
and they predicted that Alice was suspicious for blaming Bob.
With a bit of discussion,
they realized that they can't both be telling the truth
because their claims are mutually contradictory
(though they didn't have the word for this and often called it "opposites").
Then we talked about how many possibilities there are.
First they thought 2, then 4.
It's 4: two liars, liar truth-teller,
truth-teller liar, and two truth-tellers.
Then we proceeded to inspect each option,
and after a while they agreed
that the only possibility was that
Alice was telling the truth and Bob was lying (correct!).</p>
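<p>The case check the kids did by hand is a four-row truth table, which can be written out directly (a sketch; "honest" here means everything the speaker says is true, and everything a liar says is false):</p>

```python
from itertools import product

solutions = []
for alice_honest, bob_honest in product([True, False], repeat=2):
    alice_claim = not bob_honest              # "Bob is a liar!"
    bob_claim = alice_honest and bob_honest   # "Neither of us are liars."
    # Each claim's truth value must match its speaker's honesty.
    if alice_claim == alice_honest and bob_claim == bob_honest:
        solutions.append((alice_honest, bob_honest))

# Of the four possibilities, exactly one survives:
assert solutions == [(True, False)]  # Alice tells the truth; Bob lies
```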
<p>Then an interesting thing happened.
As I moved on to the next knights-and-knaves puzzle, they asked if they got the first one right.
And I said, "well, do you think you got it right?"
One boy admitted he wasn't entirely certain,
and then we all agreed to keep thinking about it
until we were convinced by the proof.
So it was planting the seed:
how do you know if you've truly proved something?
Later in that session, responding to one of the girls' comments, he remarked,
"No, we already proved that case!"</p>
<p>After the months of weekly sessions, however,
I think the kids are warming up to the idea
of the math speaking for itself.
While I led the Seven Bridges of Königsberg
problem with a story,
they quickly discarded the story and focused on the challenge.
And in a later session they asked to revisit it
because they wanted to try to find the "smallest impossible bridges problem."
Indeed, they did!
While we didn't solve the original problem,
the idea of simplifying a problem,
making it smaller and smaller until you can solve it,
and then gradually adding back complexity,
was clearly on display those weeks.</p>
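<p>Euler's criterion is what makes "impossible" precise in these puzzles: a connected multigraph has a walk crossing every edge exactly once only if it has zero or two odd-degree vertices. A quick check of the original Königsberg layout, where all four land masses have odd degree:</p>

```python
from collections import Counter

# The seven bridges of Königsberg as edges between four land masses.
bridges = [
    ("north", "island"), ("north", "island"),
    ("south", "island"), ("south", "island"),
    ("north", "east"), ("south", "east"), ("island", "east"),
]

degree = Counter()
for a, b in bridges:
    degree[a] += 1
    degree[b] += 1

odd = sorted(v for v, d in degree.items() if d % 2 == 1)

# Four odd-degree vertices (degrees 3, 3, 3, and 5), so no walk crosses
# every bridge exactly once.
assert odd == ["east", "island", "north", "south"]
```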
<p>And though they hated flexagons,
for an "end of the session" gift
I made them each a hexaflexagon
from a <a href="https://thinkzone.wlonk.com/Flexagon/Flexagon.htm" target="_blank">printed template</a>
which critically had bright, distinct pictures on each face.
Weeks later, one girl's dad
told me that she is obsessed with it,
only recently discovering the sixth face.
I let him know that you can draw a map of the flexagon's movements,
much like the Seven Bridges maps we drew,
to get a clear picture of the whole configuration space.
I have yet to hear back whether they were able to figure it out.</p>
<p>So I have hope that this group can graduate to appreciating
the math for its own sake.
But if I were to start with a new group
of kids who had no <em>a priori</em> love of math,
I'd have to be a bit more deliberate in framing
the problems in engaging stories.</p>Mon, 17 Jul 2023 18:10:10 +0000https://buttondown.com/j2kun/archive/thoughts-about-what-worked-in-math-circles/Structuring Technical Blog Postshttps://buttondown.com/j2kun/archive/structuring-technical-blog-posts/<p>Recent discussions with colleagues and friends have me thinking a bit about the writing process, and how to structure technical exposition. I am one of those weird people who just sits down and starts writing when I have an idea, but over the years I have ended up building a few formulas, loosely speaking, for technical blog posts.</p>
<p>In the interest of inspiring more writing, here are a few of the structures I have used.</p>
<h2>Explaining to myself</h2>
<p>This one is pretty self-explanatory: I write an article that explains a topic to my former self at some point in history. This one works well for isolated articles because you can assume any amount of prior knowledge you'd like. It can be yourself from two weeks ago, knowing most of what you know now but unaware of some key insight. Or it could be to yourself from two years ago before you took some course, read some book, or started and finished some large project.</p>
<p>This advice is a bit trite. If you read about writing, you have probably heard William Zinsser's quote from "On Writing Well," which goes something like "you are writing for yourself." In Zinsser's case it was more like, you are writing about things that speak to you and excite you, and if you do it genuinely it will excite other people, too. But with technical writing there's more to it. What confused you almost certainly confuses others. There have been so many times in my life when a single sentence or shift in perspective has illuminated an entire topic. What <em>confounded</em> and <em>frustrated</em> you is among the best material for technical writing, especially if you overcame the obstacles and found a new insight.</p>
<p>This kind of article often focuses on a single nagging question, like explaining the meaning behind a jargon-laden sentence or starting from a naive (or wrong) understanding of a topic and working to correct it. A good sub-section to include in such an article is the "if I went down this a-priori-reasonable path, where would I get stuck or fail?" In math in particular, dead ends are almost never described in detail.</p>
<p>Some good examples of this in my old writing include <a href="https://jeremykun.com/2014/01/17/how-to-conquer-tensorphobia/" target="_blank">"How to Conquer Tensorphobia"</a> and <a href="https://jeremykun.com/2014/02/16/elliptic-curves-as-algebraic-structures/" target="_blank">"Elliptic Curves as Algebraic Structures"</a>. Though I did less of the "follow the wrong path on purpose" recipe in those articles, I have been doing it more recently and finding it works exceptionally well.</p>
<h2>The touchstone example</h2>
<p>This can be mixed with other formulas, but the idea here is to explore a topic through the lens of a touchstone example. The example should be representative of the point you're trying to make, either by being a particularly good counterexample to some naive reasoning, or else demonstrative of some realistic data one expects to encounter in the real world.</p>
<p>For instance, you might be studying graph theory, and demonstrate every definition and theorem in terms of the <a href="https://en.wikipedia.org/wiki/Petersen_graph" target="_blank">Petersen graph</a>, a graph so notorious that math lore insists you have to test every conjecture on it first, since it's the most likely to provide a counterexample. Or, if you're studying graphs in order to motivate something like dependency resolution, you can come up with a (hopefully small) dependency graph that demonstrates the key obstacles to dependency solving, and use that to show how different algorithms work. In this applied setting, the touchstone example is a carrot to keep a reader interested, especially if that reader might struggle to keep up with dense technical content. Put simply, the best way to demonstrate a definition or theorem is to show an example that truly needs it.</p>
<p>For example, in my <a href="https://jeremykun.com/2012/12/08/groups-a-primer/" target="_blank">series on groups</a>, modular integers are the touchstone example (and, to some extent, dihedral groups). But if I had known about it at the time, I could have used <a href="https://jeremykun.com/2021/10/14/group-actions-and-hashing-unordered-multisets/" target="_blank">Hashing in Unordered Multisets</a> as a touchstone example to build up finite abelian group theory, as it covers all the main components of the theory, such as group actions and the classification of finite abelian groups.</p>
<h2>Learning as you go</h2>
<p>Here you document the learning process over a series of smaller articles as you learn it yourself. It's good fodder if you can't think of anything to write, because it forces you to learn the material better by reorganizing it in your own words.</p>
<p>The trap here is falling into a pattern of repeating the order or structure of an existing book. It makes the writing process much more boring. It's better to revolve this around your own examples and personal thoughts and confusions. E.g., you might have misread a definition, and then overcome it by finding an example that needs the missing part of the definition.</p>
<p>Finally, this mode of writing makes the "what if I try a naive idea" feel more genuine, because it is you actually trying out naive ideas, and recording the ones that aren't too silly.</p>
<p>Some examples include a <a href="https://jeremykun.com/2022/03/24/silent-duels-constructing-the-solution-part-2/" target="_blank">series of 4 articles on Silent Duels</a> and a handful of articles <a href="https://jeremykun.com/2022/08/29/key-switching-in-lwe/" target="_blank">like this one</a> on the math behind fully homomorphic encryption, which I've been learning since I officially joined the field six months ago.</p>
<h2>Commit by commit</h2>
<p>In this style you are showing the evolution of a software project over time. You write explaining how a project is built up, and then you link to specific commits and pull requests as demonstration.</p>
<p>It helps particularly in showing how some ideas navigate into a proper codebase in situ. And if you organize the commits well, you can isolate the changes that are really useful to expose in the writing, while boilerplate and formatting and all that other stuff is skipped. This also gives the reader a means to check out the codebase at a given commit to experiment.</p>
<p>The hard part here is ensuring the commits are clean enough, and that the project is functional at the intermediate commits, so that a reader who engages with it won't be too lost or stuck. Small commits are your friend here.</p>
<p>I did this technique only once, with a <a href="https://github.com/j2kun/riemann-divisor-sum" target="_blank">series of 9 articles on searching for Riemann Hypothesis counterexamples</a>. The main goal there was to teach some software engineering ideas, not so much the math, but I think it could work well for math as well. Say, to demonstrate the nuances of math when it gets into an actual program (e.g. how it can help or hurt in integration, testing, or maintenance).</p>
<h2>The "what if" trickle</h2>
<p>Perhaps the most popular form of math writing out there: the style that is popular among internet math superstars like Grant Sanderson, Vi Hart, and the Numberphile crew. It usually has the structure of "trickling" the reader along with a series of "but what if..." questions. "But what if we tried <em>multiplying</em> these two things that have no business being multiplied? Is it even possible to make sense of that?" It often requires some suspension of disbelief on the reader's part, especially when treating a topic like complex numbers where you really are starting with "illegal" operations but perform them anyway to see what falls out.</p>
<p>These work, and are great for the kind of content where the reader is swept away on a mathematical joy ride, passively watching as the expert reveals the magic trick. This format doesn't always land well, particularly when the thing you're talking about isn't as smoothly ironed out as, say, Fourier analysis or a classical math puzzle. If you can't rest on decades of prior insights and explanations, or if your topic is esoteric, or if you feel like you're the first one to widely communicate a particular insight, you have to be deliberate about why the reader should care about what you have to say. My focused insights about <a href="https://jeremykun.com/2022/08/29/key-switching-in-lwe/" target="_blank">error growth in homomorphic encryption key switching</a> are never going to be on Numberphile, but it's extremely hard to find proper documentation about it, and what you can find universally describes it as "well known" and omits the details. I provide the details. That sort of content will never fit well in the "what if" trickle.</p>
<h2>What about you?</h2>
<p>What sorts of formulas have you developed for writing technical content? What works and what doesn't? Reply to the email and let me know your thoughts. It's way friendlier than internet comment threads.</p>Wed, 17 May 2023 16:27:56 +0000https://buttondown.com/j2kun/archive/structuring-technical-blog-posts/What happens when it's all streamlined?https://buttondown.com/j2kun/archive/what-happens-when-its-all-streamlined/<p>One of my son’s favorite songs these days is (oddly, and thanks to recommendation algorithms) <a href="https://www.youtube.com/watch?v=zR_KZ4bBglM" target="_blank">“The Worker’s Song”, as performed by The Longest Johns</a>. My son calls it the “Pie in the Sky Song”, because of the line, “We’re the first ones in line for that pie in the sky.”</p>
<p>There’s an interesting verse in the song:</p>
<blockquote>
<p><em>For our skills are not needed.</em></p>
<p><em>They’ve streamlined the job.</em></p>
<p><em>With slide rule and stopwatch,</em></p>
<p><em>Our pride they have robbed.</em></p>
</blockquote>
<p>It’s hard to avoid talking about ChatGPT or Bard these days. In
<a href="https://www.reddit.com/r/learnpython/comments/zifoql/just_use_chatgpt_will_programmers_become_obsolete/" target="_blank">conversation</a>,
<a href="https://cacm.acm.org/blogs/blog-cacm/268103-what-do-chatgpt-and-ai-based-automatic-program-generation-mean-for-the-future-of-software/fulltext" target="_blank">essays</a>,
and <a href="https://arxiv.org/abs/2211.05030" target="_blank">papers</a>, many programmers are excited to
find new applications, and worried that jobs like programming will become more
about
<a href="https://buttondown.email/hillelwayne/archive/programming-ais-worry-me/" target="_blank">proofreading</a>
than engineering.</p>
<p>In one conversation I let my optimist out of his cage. I worked with the
assumption that the capabilities of these tools would grow even modestly beyond
what they can do now. And I thought about what tools might be realized in that
near future. Nothing outlandish.</p>
<p>One idea was a tool to help me navigate an unfamiliar codebase. Wouldn’t it be
nice if you could give a large language model a link to a GitHub repo, and it
would summarize the file organization, data structures, and system
architecture? Or better, prompt it with the text of a bug you’d like to fix or
feature you’d like to add, and it would give options for how to add the feature
in accordance with the style of the codebase?</p>
<p>It would even be useful just for studying. Recently I was looking at the
codebase for the <a href="https://github.com/berkeley-abc/abc/" target="_blank">ABC circuit optimizer</a>
and thought, where exactly are the core optimization routines? There’s a
directory called
<a href="https://github.com/berkeley-abc/abc/tree/master/src/opt" target="_blank">src/opt</a>, but short
of reading through every header and digging around to figure out what the terse
comments mean (e.g., <code>PackageName [Cut sweeping.]</code>), it will take a
conversation with a maintainer to really get a handle on it.</p>
<p>Another idea was to summarize the development history of a repository, given
all the activity in a GitHub project, such as the commits and messages, issues,
release notes, etc. It could even be smart enough to discover transitive
updates. As my friend pointed out, when project <code>A</code> upgrades the version of
dependency <code>B</code>, <code>A</code> could silently pull in new features of <code>B</code>, and so these
new features are omitted from the release notes. The example he gave was an
image library adding support for exporting to PNG based on the filename
you save an image as. The maintainers of <code>A</code> might not know to include this in
release notes, but the tool could.</p>
<p>Tools like these would drastically lower the barrier to entry for contributing
to and fixing bugs in upstream systems. When I’m working with a dependency and
it doesn’t work the way I need, the effort required to fix or improve it
upstream often drives me to find a workaround. There’s no way I could convince
the maintainer (if there even is one) to change their code, and it’s too much
of a rabbit hole to fix it myself, mainly because understanding the codebase is
too steep of a prerequisite.</p>
<p>A friend of mine and I considered doing a livestream where we “speedrun”
learning an unfamiliar codebase, by some measure like adding a new feature or
fixing a bug. And then interspersing speedruns with an analysis of our methods.
It would be interesting to see what sorts of tools you could build to
streamline that process.</p>
<p>And that’s what this all boils down to. Like the Worker’s Song, as a field we
programmers are always about “streamlining” our jobs. We’re the first ones in
line to build the pie-in-the-sky tool. My fantasy about large language models
is that they might help me streamline the ramp up process. It would make me
feel very smart and powerful.</p>
<p>But I also stopped to wonder, what would happen if that dream came true? Surely
not only I would be a ramp-up wizard. Anyone could be. My familiarity with
the codebase, my experience with outages and bugs, my contextual knowledge
about tech debt, would all become cheap, especially given how much I tend to
write in bug reports and issue threads/design docs. One of my strongest skills,
clear communication, fuels the engine that ushers in my obsolescence.</p>
<p>It’s not the first time someone has pointed out these effects. The modern form
of the slide rule, after all, was invented by a <a href="https://en.wikipedia.org/wiki/Slide_rule#Modern_form" target="_blank">French artillery
lieutenant</a>, the same
folk who, in the Worker’s Song, are “given a gun and pushed to the fore.”
Programmers today talk about passing the bus test: would your team/system be
crippled by your absence due to being hit by a bus? Passing it necessarily
makes you replaceable. Engineering management at large companies strives to
make individual engineers replaceable. To a significant extent, this informs
why tech companies invest in building frameworks, infrastructure, and
evangelism. These attitudes have already conquered management of data centers. At
Google there’s a saying, “machines are cattle, not pets.” When one machine goes down,
the scheduler restarts the occupying jobs on other machines. Will this saying
evolve into, “software is commodity, not craft”? “Programmers are fungible, not
critical”?</p>
<p>When employees are disposable, they will be disposed of. But somehow I
downplayed this thought because at least I had the benefit of being very
good at ramping up to productivity quickly, and synthesizing knowledge well,
and understanding hard things and making them easier to understand for others.
I value and invest in mentorship of my team,
and in building shared mental models of a system.
But how much of that investment could large language models automate away? And how eager are we,
collectively, to build that future?</p>
<p>I still want that tool.</p>Sun, 19 Feb 2023 18:20:00 +0000https://buttondown.com/j2kun/archive/what-happens-when-its-all-streamlined/A Foray into Math Circleshttps://buttondown.com/j2kun/archive/a-foray-into-math-circles/<p>This year I’ve started facilitating math circles in Portland. For those who don’t know, a <a href="https://en.wikipedia.org/wiki/Math_circle">math circle</a> is a small, extracurricular gathering of similarly-aged children to explore math topics. Usually an hour long, a math circle serves simultaneously as a means to introduce kids to math topics outside the standard curriculum, to encourage critical thinking, and as a social space for kids who like math to enjoy it without a social stigma. Math circles typically have a single adult facilitator, whose goal is to present an “accessible mystery,” and then do their best to help the kids navigate their own thoughts and questions about it.</p>
<p>I attempted three math circles this year—that is, three different groups of students, each with multiple sessions—and I only really consider the last one a success. This was also prompted by a short online course/discussion group run by <a href="https://www.theglobalmathcircle.org/">The Global Math Circle</a> which asked each member to run a real math circle so they could discuss it. And I read <a href="https://www.amazon.com/Math-Three-Seven-Mathematical-Preschoolers/dp/082186873X?&linkCode=ll1&tag=mathinterpr00-20&linkId=9fe928746b20d89b6a9c40d81db1b323&language=en_US&ref_=as_li_ss_tl">Math from Three to Seven</a>.</p>
<p>My first group of kids was my next-door neighbor’s 8-year-old daughter and two of her friends. Our first circle was on the introductory muffin problem from Bill Gasarch’s book <a href="https://www.amazon.com/Mathematical-Muffin-Morsels-Problem-Mathematics/dp/9811215170">Mathematical Muffin Morsels</a>: how can you evenly split 5 muffins among 3 students so that no student gets a smaller piece than any other? It was mostly a flop. The girls didn’t find the problem very interesting, and it was simply too advanced. To wit, they quickly figured out that you can give each student a full muffin and split the other two muffins into pieces of size <code>{2/3, 1/3}</code>, so that one kid gets two 1/3-size pieces. And they agreed that you couldn’t possibly make it work if every muffin piece had to be size 1/2. But then they were unable to think of any fractions that were between 1/2 and 1/3. We tried talking about that, and while one girl was keen on it, the other two decided it was time to goof off. The second and third circles I did with that group weren’t much better, and then summer and friend-group dynamics mixed up everyone’s schedules and preferences and it fell apart.</p>
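<p>To make the arithmetic concrete, here is a short sketch in Python (my choice of language; the post itself has no code) that verifies the kids’ split with exact fractions and exposes its smallest piece:</p>
<pre><code>from fractions import Fraction

# The kids' split of 5 muffins among 3 students (each gets 5/3 of a muffin):
# students 1 and 2 get a whole muffin plus a 2/3 piece; student 3 gets a
# whole muffin plus the two leftover 1/3 pieces.
shares = [
    [Fraction(1), Fraction(2, 3)],
    [Fraction(1), Fraction(2, 3)],
    [Fraction(1), Fraction(1, 3), Fraction(1, 3)],
]

assert sum(p for s in shares for p in s) == 5          # all 5 muffins used
assert all(sum(s) == Fraction(5, 3) for s in shares)   # equal shares

smallest = min(p for s in shares for p in s)
print(smallest)  # 1/3 -- the quantity the muffin problem asks you to maximize
</code></pre>
<p>The book’s question is how much bigger than 1/3 that smallest piece can be made, which is exactly the step the girls got stuck on.</p>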
<p>A second group was four 6-year-old boys and girls in the neighborhood my parents live in. There was a 6-year-old boy whose family lived down the street from my parents, and one day when I was walking my toddler down the street with my family, we met them and I mentioned I’m a mathematician. The boy piped up that he loved math. A few days later I brought him a sliding-block 15-puzzle, and later a graph coloring puzzle booklet. Then his mom found some neighbor kids who wanted to try a math circle. This group was a bit more successful. We did some combinatorics puzzles (counting ways to stack 3 blue and 2 yellow blocks), and basic versions of Nim. And some activities that didn’t work out so well, like the fold-and-cut theorem—which was hard because they didn’t have the dexterity to fold and cut paper precisely enough. Still, after four sessions enough students didn’t want to return that it was just down to the original 6-year-old boy, and we couldn’t really have a math circle with one kid—part of the point is to get the kids to teach and learn from each other.</p>
<p>The third and most successful group consisted of 7-year-old siblings of the kids at my son’s preschool, which he started going to after the other two circles ended. We did counting puzzles, different flavors of Nim, <a href="https://en.wikipedia.org/wiki/Knights_and_Knaves">Knights and Knaves puzzles</a>, fold-and-cut theorem, the <a href="https://en.wikipedia.org/wiki/No-three-in-line_problem">“no three in a line” puzzle</a>, and more. And most of the kids were pretty interested in the topics. Some of the kids were <em>really</em> interested in the topics, as much as a 7-year-old can be, I think. And I’m hoping to start with this group again in 2023.</p>
<p>Facilitating math circles has been a lot of fun, despite regular issues with “classroom management” (i.e., stopping them from roughhousing or distracting each other too much from the math). I wanted to share a few reflections, aside from the typical thoughts on which topics worked and which didn’t.</p>
<p>The first is that it’s hard to find willing participants! If I didn’t happen to run into neighbors with willing kids and friends, or I didn’t happen to have a kid myself, I probably would have to resort to awkward things like advertising? Or contacting a local school? But even with a kid of my own, it’s hard to say, “please send your child to my house once a week so we can do math in my basement (without you in the room)!” Having even a brief prior relationship with the family of at least one kid really helped.</p>
<p>Second, explaining the point of our activities to the parents seemed to help motivate them to encourage their kids to keep coming. In my first group the parents weren’t there (and most of the activities flopped), and in the second group one or two of the parents sat in, but for the parents who didn’t, their kids seemed to peter out. Often the kids would say things like, “how is this math?” which is of course expected, but I can only imagine how poorly they explained to their parents what they did (especially the younger kids). For the third group, after each circle I sent a pretty detailed email to all the parents with:</p>
<ul>
<li>A description of the activities we did and why they were mathematical.</li>
<li>Interesting ideas that each specific kid had (if I could remember them).</li>
<li>A list of extensions/tweaks they could do with their kids at home, if desired.</li>
</ul>
<p>And the parents did continue to engage them between sessions, at least for the kids who really enjoyed it.</p>
<p>Third, I was continually surprised by which activities the kids enjoyed and didn’t enjoy. And as a result, I had to maintain a significant number of “backup” activities in case my choices flopped. One problem I thought would be a shoo-in was the game of <a href="https://en.wikipedia.org/wiki/Set_(card_game)">Set</a>, but it flopped immediately, made worse because they couldn’t help but shout over each other and grab all the cards whenever they thought they saw a set.</p>
<p>One activity I thought would flop, but ended up being a huge hit, was Knights and Knaves. In that activity, Alice and Bob are either always-truthful (knights) or always-liars (knaves). They each say one statement, from which you have to ascertain their identities from the four options. In the first puzzle I gave them, Alice says, “Bob is a liar!” and Bob says, “Neither of us are liars.” The kids immediately thought Alice was a liar because she was accusing Bob, and they all declared she was the most “sus.” But after exploring it, they found Alice was the truth-teller. I think it also helped that I wrote Alice and Bob’s statements on physical paper signs and taped them to chopsticks so I could hold them up when I posed the problem. Physical manifestations of problems always seem to help, even in this superficial way.</p>
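<p>The puzzle is small enough to check by brute force. Here is an illustrative sketch in Python (language my choice, since the post has no code) that enumerates all four identity assignments and keeps the consistent ones:</p>
<pre><code>from itertools import product

def consistent(alice_knight, bob_knight):
    # Alice says: "Bob is a liar!"
    alice_claim = not bob_knight
    # Bob says: "Neither of us are liars."
    bob_claim = alice_knight and bob_knight
    # Knights speak truth and knaves lie, so each claim's truth value
    # must match the speaker's identity.
    return alice_claim == alice_knight and bob_claim == bob_knight

solutions = [combo for combo in product([True, False], repeat=2)
             if consistent(*combo)]
print(solutions)  # [(True, False)]: Alice is the knight, Bob the knave
</code></pre>
<p>The same enumeration scheme scales to any of these puzzles: model each speaker as a boolean, translate each statement into a boolean expression, and keep the assignments where every statement’s truth matches its speaker’s type.</p>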
<p>Fourth, I struggled to find the right balance of admitting some fun and goofing off, while still having it be focused enough on math. I don’t want math circle to be strict. If kids are bored and goofing off, that’s a signal that we need to switch activities, and acting the taskmaster will probably ruin their attitudes for math. And sometimes their goofing off is an expression of creativity, and I can channel that to math if I’m quick on my feet. As one example, with my third group I tried a (failed) activity about standing waves in a slightly-taut rope. They were going way too wild waving the ropes around. And then one kid said something like, “look, I can send secret messages!” (by “pulsing” transverse waves up/down or left/right). I quickly pivoted to an activity where two kids held each end of a rope. One kid asked a question, and the other kid answered it only by sending waves down the rope. They were asking all counting questions, and it got interesting when they asked me “how old are you?” and I ‘lost count’ (then they invented a scheme to communicate in base-10). And then non-counting questions (“what’s your favorite color?” or “what’s your favorite food?”). That turned out to be a really neat activity, and all it took was to identify a clever thought, and push it farther so that they had to start to think critically about it.</p>
<p>The hard part with kids goofing off is when it’s <em>just one</em> student who is bored and the rest are really invested. Invariably, boredom manifests as distracting the other kids. And the kids who were interested would even snap back, “I’m trying to think about this puzzle!” In one egregious case I even had to ask one kid not to come back. I did this by talking to her parents and explaining that she didn’t seem interested, describing what her distracting behavior was, and explaining that it was hurting the experience for the other kids and that making her come back would likely be worse in the long run than stopping and trying again in a year or two.</p>
<p>In the end, I think the most important lessons I learned about running math circles are to never trust your intuition about what the kids will latch on to, to bias yourself toward easier activities with a high ceiling, and to have lots of backup activities. So if you know of any fun activities that would make a good circle, please let me know! You can reply to this email or hit me up <a href="https://mathstodon.xyz/@j2kun">@j2kun@mathstodon.xyz</a>.</p>Sat, 24 Dec 2022 16:00:00 +0000https://buttondown.com/j2kun/archive/a-foray-into-math-circles/Career update: homomorphic encryptionhttps://buttondown.com/j2kun/archive/career-update-homomorphic-encryption/<p>I recently had my final week working in the data center planning side of Google. July would have marked five years of my focus on integer linear programming, modeling data center planning problems as such, and building supporting infrastructure to help explain and understand optimization models.</p>
<p>I've decided to transition to a more ambitious and challenging role in cryptography, still at Google. Specifically, I'll be joining <a href="https://twitter.com/_shruthig" target="_blank">Shruthi Gorantala</a> and <a href="https://cathieyun.github.io/" target="_blank">Cathie Yun</a> (and a variety of community contributors) on a mission to bring <a href="https://en.wikipedia.org/wiki/Homomorphic_encryption" target="_blank">Fully Homomorphic Encryption</a> (FHE) to production systems. Last year Shruthi and her collaborators open-sourced a <a href="https://github.com/google/fully-homomorphic-encryption" target="_blank">transpiler</a> (<a href="https://research.google/pubs/pub50428/" target="_blank">associated paper</a>) that accepts as input a subset of C++ programs and produces as output an equivalent program that operates on ciphertexts in the <a href="https://tfhe.github.io/tfhe/" target="_blank">Torus-FHE scheme</a>. The team has big ambitions, which I can't disclose yet, but my role will be closer to my mathematical expertise and passion than my current role.</p>
<p>Working on FHE will be something of a dream come true, and a once in a lifetime opportunity. FHE was discovered to be possible around the same time I discovered mathematics as a passion (2009). When I learned about it in the intervening years, it was still not considered efficient enough to be practical. Today, through the sweat and tears of incremental progress on multiple research avenues, it seems we (humanity) are on the cusp of being able to make FHE practical enough for useful applications, with the right combination of software ingenuity, hardware accelerators, and fine tuning. Combine that with Google's iteratively improving culture around privacy, and it's clear that now is the best time to get involved at the forefronts of privacy innovation. These formerly theoretical breakthroughs will soon be ready for production.</p>
<p>With so much of the cryptography ecosystem usurped by fraud, Ponzi schemes, and extremist politics, it's refreshing to immerse myself in a side of that field that feels unambiguously wholesome. In <a href="https://buttondown.email/j2kun/archive/what-makes-a-community/" target="_blank">past newsletters</a> I've hinted at the question that Jenny Odell posed in her book, <a href="https://www.amazon.com/How-Do-Nothing-Resisting-Attention/dp/1612197493" target="_blank">How to do Nothing</a>, namely, "What's it all for?" When I think about software reliability and "five 9's," the only realistic answer I can think of is, "Keep people shopping on Black Friday." Or when I think about social media, it nags me to think it's keeping your eyes glued to YouTube/Netflix, Twitter outrage, TikTok, or conspiracy theory Facebook groups, ultimately all for shallow engagement metrics and more ad clicks.</p>
<p>But the end game for FHE is simple: all computation is private by default. It's the "holy grail" of privacy technology. Of course, scaling efficiency poses the main obstacle. FHE seems to require large ciphertexts and keys, and even the fastest FHE schemes rely on large matrix multiplications.</p>
<p>According to a heuristic argument, to run a computer program without any knowledge of the data, you necessarily remove the ability to know which branch of an if statement the program takes, or how many iterations of a loop are executed. Otherwise, a program that terminates early when the input matches a fixed value would contradict the security of the scheme. So an FHE scheme must effectively simulate all possible program branches in its effort to hide the data.</p>
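<p>As a toy illustration of that argument (a plaintext sketch in Python of my own devising, not an actual FHE computation), a data-dependent branch can be rewritten so that both arms are always evaluated and the result is selected arithmetically, which is the shape an FHE circuit must take:</p>
<pre><code>def branchy_abs(x):
    # Ordinary control flow: which branch runs reveals the sign of x.
    if x &lt; 0:
        return -x
    return x

def oblivious_abs(x):
    # Branch-free form: compute both arms, then blend with a 0/1 selector.
    # In a real FHE scheme, `neg` would be an encrypted bit produced by a
    # homomorphic comparison, so nothing about x is revealed.
    neg = 1 if x &lt; 0 else 0
    return neg * (-x) + (1 - neg) * x

assert all(branchy_abs(x) == oblivious_abs(x) for x in range(-10, 11))
</code></pre>
<p>Even in this toy the overhead is visible: both arms are computed every time, which is exactly the “simulate all possible branches” cost described above.</p>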
<p>It remains to be seen whether this obstacle can be mitigated enough to admit decent latency in general-purpose applications, as well as what the extra computational cost of perfect privacy shakes out to be. If the costs aren't prohibitively high, then the promise of perfect privacy can tip the balance in favor of FHE by default for all applications where privacy is a concern.</p>
<p>As wholesome as my new mission feels, switching teams is a bittersweet transition. It's largely because the last six months on my supply chain team have been the best. Due to some combination of old guard attrition, my nurturing a small, but productive team, and our consistent delivery of results, I had been suddenly given more engineering resources. People were listening to my ideas, and I made the time to demonstrate the impact and elegance of those ideas. It really hit home when my manager's last performance review feedback highlighted my efforts to foster both productivity and a positive culture in the team, resulting in a group of happy, well-rounded engineers that punched above their weight. In many ways that meant something to me in a way that getting to say I helped save Google $xxxM per year did not.</p>
<p>But in the end, I just couldn't watch this opportunity sail by without jumping aboard and rowing with gusto. So here's to cultivating a merry band of cryptographers and helping make privacy-centric computation a reality!</p>Tue, 31 May 2022 16:48:31 +0000https://buttondown.com/j2kun/archive/career-update-homomorphic-encryption/