You can cheat a test suite with a big enough polynomial

could

                June 24, 2025

            You can cheat a test suite with a big enough polynomial

                TDD Gone Wrong (Gone Z3xual)

            Hi nerds, I'm back from Systems Distributed! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter.
In an earlier version of my talk, I had a gag about unit tests. First I showed the test f([1,2,3]) == 3, then said that this was satisfied by f(l) = 3, f(l) = l[-1], f(l) = len(l), f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182. Then I progressively rule them out one by one with more unit tests, except the last polynomial which stubbornly passes every single test.
If you're given some function of f(x: int, y: int, …): int and a set of unit tests asserting specific inputs give specific outputs, then you can find a polynomial that passes every single unit test.
To find the gag, and as SMT practice, I wrote a Python program that finds a polynomial that passes a test suite meant for max. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort.
The code
Full code here, breakdown below.
from z3 import *  # type: ignore
s1, s2 = Solver(), Solver()

Z3 is just the particular SMT solver we use, as it has good language bindings and a lot of affordances.
As part of learning SMT I wanted to do this two ways. First by putting the polynomial "outside" of the SMT solver in a python function, second by doing it "natively" in Z3. I created two solvers so I could test both versions in one run. 
a0, a, b, c, d, e, f = Consts('a0 a b c d e f', IntSort())
x, y, z = Ints('x y z')
t = "a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0"

Both Const('x', IntSort()) and Int('x') do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. 
To keep the two versions in sync I represented the equation as a string, which I later eval. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a "2nd-order polynomial", even though it doesn't have x^2 terms, as it has xy and xz terms.
lambdamax = lambda x, y, z: eval(t)

z3max = Function('z3max', IntSort(), IntSort(), IntSort(),  IntSort())
s1.add(ForAll([x, y, z], z3max(x, y, z) == eval(t)))

lambdamax is pretty straightforward: create a lambda with three parameters and eval the string. The string "a*x" then becomes the python expression a*x, a is an SMT symbol, while the x SMT symbol is shadowed by the lambda parameter. To reiterate, a terrible idea in practice, but a good way to learn faster.
z3max function is a little more complex. Function takes an identifier string and N "sorts" (roughly the same as programming types). The first N-1 sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier "z3max" to be a function with signature (int, int, int) -> int.
I can load the function into the model by specifying constraints on what z3max could be. This could either be a strict input/output, as will be done later, or a ForAll over all possible inputs. Here I just use that directly to say "for all inputs, the function should match this polynomial." But I could do more complicated constraints, like commutativity (f(x, y) == f(y, x)) or monotonicity (Implies(x < y, f(x) <= f(y))).
Note ForAll takes a list of z3 symbols to quantify over. That's the only reason we need to define x, y, z in the first place. The lambda version doesn't need them. 
inputs = [(1,2,3), (4, 2, 2), (1, 1, 1), (3, 5, 4)]

for g in inputs:
    s1.add(z3max(*g) == max(*g))
    s2.add(lambdamax(*g) == max(*g))

This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet.
for s, func in [(s1, z3max), (s2, lambdamax)]:
    if s.check() == sat:
        m = s.model()
        for x, y, z in inputs:
            print(f"max([{x}, {y}, {z}]) =", m.evaluate(func(x, y, z)))
        print(f"max([x, y, z]) = {m[a]}x + {m[b]}y",
            f"+ {m[c]}z +", # linebreaks added for newsletter rendering
            f"{m[d]}xy + {m[e]}xz + {m[f]}yz + {m[a0]}\n")

Output:
max([1, 2, 3]) = 3
# etc
max([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0

max([1, 2, 3]) = 3
# etc
max([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0

I find that z3max (top) consistently finds larger coefficients than lambdamax does. I don't know why.
Practical Applications
Test-Driven Development recommends a strict "red-green refactor" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests!
Pedagogical Notes
Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to learn SMT and LLMs may decrease learning retention.¹ Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use Github code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it!
Anyway, I'm very, very slowly feeling like I'm getting the basics on how to use SMT. I don't have any practical use cases yet, but I wanted to learn this skill for a while and glad I finally did.

Caveat I have not actually read the study, for all I know it could have a sample size of three people, I'll get around to it eventually ↩

            If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.

    Read more:

                Solving LinkedIn Queens with SMT
                For sure easier than solving it in SAT!

                Snippet Praxis
                Copypasting from stuff is  A Good Idea

Don't miss what's next. Subscribe to Computer Things:

Join the discussion:

            Rémi

                Jun. 25, 2025, morning

    Hi Hillel, nice reading thanks! This reminded me a lot of Lagrange interpolating polynomial, so I wrote a quick follow-up, allowing me to find absolutely disgusting polynomials which pass some given test suite. 
The post: https://blog.remigerme.xyz/cs/test-suite-polynomial

        Reply
Report