Finding hard 24 puzzles with planner programming


                May 20, 2025


                (1+5)^5/324 = 24

Planner programming is a programming technique where you solve problems by providing a goal and actions, and letting the planner find actions that reach the goal. In a previous edition of Logic for Programmers, I demonstrated how this worked by solving the 24 puzzle with planning. For reasons discussed here I replaced that example with something more practical (orchestrating deployments), but left the code online for posterity.
Recently I saw a family member try and fail to vibe code a tool that would find all valid 24 puzzles, and realized I could adapt the puzzle solver to also be a puzzle generator. First I'll explain the puzzle rules, then the original solver, then the generator.¹ For a much longer intro to planning, see here.
The rules of 24
You're given four numbers and have to find some elementary expression (+, -, *, /, plus groupings) that uses all four numbers and results in 24. Each number must be used exactly once, but they do not need to be used in the starting puzzle order. Some examples:

[6, 6, 6, 6] -> 6+6+6+6=24
[1, 1, 6, 6] -> (6+6)*(1+1)=24
[4, 4, 4, 5] -> 4*(5+4/4)=24

Some setups are impossible, like [1, 1, 1, 1]. Others are possible only with non-elementary operations, like [1, 5, 5, 324] (which requires exponentiation).
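To make the rules concrete, here is a minimal brute-force checker. It is a Python sketch rather than the Picat used in the rest of this post, and the name `solvable` is my own. It uses exact fractions so that division never introduces rounding error.

```python
from fractions import Fraction
from itertools import permutations

def solvable(nums, target=24):
    """Return True if the numbers can be combined into target with + - * /."""
    def search(state):
        if len(state) == 1:
            return state[0] == target
        # Pick an ordered pair (x, y), combine it, and recurse on the smaller state.
        for i, j in permutations(range(len(state)), 2):
            x, y = state[i], state[j]
            rest = [v for k, v in enumerate(state) if k not in (i, j)]
            results = [x + y, x - y, x * y]
            if y != 0:  # never divide by zero
                results.append(x / y)
            if any(search(rest + [r]) for r in results):
                return True
        return False
    return search([Fraction(n) for n in nums])

print(solvable([6, 6, 6, 6]))  # True
print(solvable([4, 4, 4, 5]))  # True
print(solvable([1, 1, 1, 1]))  # False
```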
The solver
We will use Picat, the only language that I know has a built-in planner module. The current state of our plan will be represented by a single list with all of the numbers.
import planner, math.
import cp.

action(S0, S1, Action, Cost) ?=>
  member(X, S0)
  , S0 := delete(S0, X) % , is `and`
  , member(Y, S0)
  , S0 := delete(S0, Y)
  , (
      A = $(X + Y) 
    ; A = $(X - Y)
    ; A = $(X * Y)
    ; A = $(X / Y), Y > 0
    )
  , S1 = S0 ++ [apply(A)]
  , Action = A
  , Cost = 1
  .

This is our "action", and it works in three steps:

1. Nondeterministically pull two different values out of the input, deleting them
2. Nondeterministically pick one of the basic operations
3. The new state is the remaining elements, appended with that operation applied to our two picks.

Let's walk through this with [1, 6, 1, 7]. There are four choices for X and three for Y. If the planner chooses X=6 and Y=7 and picks addition, A = $(6 + 7). This is an uncomputed term in the same way lisps might use quotation. We can resolve the computation with apply, as in the line S1 = S0 ++ [apply(A)].
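For readers who don't know Picat, the same three steps can be sketched in Python. This is an illustration, not a translation of the planner: the function name `successors` is hypothetical, and where Picat explores choices by backtracking, this sketch simply enumerates every choice.

```python
from fractions import Fraction
from itertools import permutations

def successors(state):
    """Yield (action, new_state) pairs for every possible single action."""
    for i, j in permutations(range(len(state)), 2):      # step 1: pick X, then Y
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        ops = [("+", x + y), ("-", x - y), ("*", x * y)]  # step 2: pick an operation
        if y > 0:                                         # same guard as the Picat rule
            ops.append(("/", Fraction(x, y)))
        for symbol, value in ops:
            yield (x, symbol, y), rest + [value]          # step 3: build the new state

moves = list(successors([1, 6, 1, 7]))
print(len(moves))                           # 48: 4 choices of X * 3 of Y * 4 operations
print(((6, "+", 7), [1, 1, 13]) in moves)   # True
```

Note that the two 1s are treated as distinct picks, so some of the 48 moves are duplicates of each other.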
final([N]) =>
  N =:= 24. % handle floating point

Our final goal is just a list where the only element is 24. The comparison has to be a little floating point-sensitive to handle floating point division, which is why it uses =:=.
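As an aside on why floating point needs care here, this is what the same concern looks like in Python. It says nothing about how Picat's =:= behaves internally. It only shows that a float computation of a real 24 solution can land slightly off, and two common ways to cope.

```python
import math
from fractions import Fraction

as_float = 8 / (3 - 8 / 3)           # a genuine solution for [3, 3, 8, 8]
print(as_float == 24)                # False: the rounding error in 8 / 3 propagates
print(math.isclose(as_float, 24))    # True: compare with a tolerance instead

exact = Fraction(8) / (3 - Fraction(8, 3))
print(exact == 24)                   # True: rational arithmetic is exact
```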
main =>
  Start = [1, 5, 5, 6]
  , best_plan(Start, 4, Plan)
  , printf("%w %w%n", Start, Plan)
  .

For main, we just find the best plan with the maximum cost of 4 and print it. When run from the command line, picat automatically executes whatever is in main.
$ picat 24.pi
[1,5,5,6] [1 + 5,5 * 6,30 - 6]

I don't want to spoil any more 24 puzzles, so let's stop showing the plan:
main =>
- , printf("%w %w%n", Start, Plan)
+ , printf("%w%n", Start)

Generating puzzles
Picat provides a find_all(X, p(X)) function, which returns all X for which p(X) is true. In theory, we could write find_all(S, best_plan(S, 4, _)). In practice, there are an infinite number of valid puzzles, so we need to bound S somewhat. We also don't want to find any redundant puzzles, such as [6, 6, 6, 4] and [4, 6, 6, 6].
We can solve both issues by writing a helper valid24(S), which will check that S is a sorted list of integers within some bounds, like 1..8, and also has a valid solution.
valid24(Start) =>
  Start = new_list(4)
  , Start :: 1..8 % every value in 1..8
  , increasing(Start) % sorted ascending
  , solve(Start) % turn into values
  , best_plan(Start, 4, Plan)
  .

This leans on Picat's constraint solving features to automatically find bounded sorted lists, which is why we need the solve step.² Now we can just loop through all of the values in find_all to get all solutions:
main =>
  foreach([S] in find_all(
    [Start],
    valid24(Start)))
    printf("%w%n", S)
  end.

$ picat 24.pi

[1,1,1,8]
[1,1,2,6]
[1,1,2,7]
[1,1,2,8]
# etc
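
The same generate-and-filter idea can be sketched in Python. Here `combinations_with_replacement` plays the role of the bounded, increasing constraint, and a memoized `solvable` stands in for best_plan. The function names are my own, and this is an illustration rather than the post's implementation.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations_with_replacement, permutations

@lru_cache(maxsize=None)
def solvable(state):
    """state is a sorted tuple of Fractions; True if it can be reduced to 24."""
    if len(state) == 1:
        return state[0] == 24
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        results = [x + y, x - y, x * y]
        if y > 0:  # mirrors the Y > 0 guard in the Picat action
            results.append(x / y)
        if any(solvable(tuple(sorted(rest + [r]))) for r in results):
            return True
    return False

def bounded_puzzles(low=1, high=8):
    """All sorted 4-tuples in low..high that have at least one solution."""
    candidates = combinations_with_replacement(range(low, high + 1), 4)
    return [c for c in candidates if solvable(tuple(Fraction(n) for n in c))]

puzzles = bounded_puzzles()
print(puzzles[0])  # (1, 1, 1, 8)
```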

Finding hard puzzles
Last Friday I realized I could do something more interesting with this. Once I have found a plan, I can apply further constraints to the plan, for example to find problems that can be solved with division:
valid24(Start, Plan) =>
  Start = new_list(4)
  , Start :: 1..8
  , increasing(Start)
  , solve(Start)
  , best_plan(Start, 4, Plan)
+ , member($(_ / _), Plan)
  .

In playing with this, though, I noticed something weird: there are some solutions that appear if I sort up but not down. For example, [3,3,4,5] appears in the solution set, but [5, 4, 3, 3] doesn't appear if I replace increasing with decreasing.
As far as I can tell, this is because Picat only finds one best plan, and [5, 4, 3, 3] has two solutions: 4*(5+3/3) and 3*(5+4)-3. best_plan is a deterministic operator, so Picat commits to the first best plan it finds. So if it finds 3*(5+4)-3 first, it sees that the solution doesn't contain a division, throws [5, 4, 3, 3] away as a candidate, and moves on to the next puzzle.
There are a couple of ways we can fix this. We could replace best_plan with best_plan_nondet, which can backtrack to find new plans (at the cost of an enormous number of duplicates). Or we could modify our final to only accept plans with a division:
% Hypothetical change
final([N]) =>
+ member($(_ / _), current_plan()),
  N =:= 24.

My favorite "fix" is to ask another question entirely. While I was looking for puzzles that can be solved with division, what I actually want is puzzles that must be solved with division. What if I rejected any puzzle that has a solution without division?
+ plan_with_no_div(S, P) => best_plan_nondet(S, 4, P), not member($(_ / _), P).

valid24(Start, Plan) =>
  Start = new_list(4)
  , Start :: 1..8
  , increasing(Start)
  , solve(Start)
  , best_plan(Start, 4, Plan)
- , member($(_ / _), Plan)
+ , not plan_with_no_div(Start, _)
  .

The new line's a bit tricky. plan_with_no_div nondeterministically finds a plan, and then fails if the plan contains a division.³ Since I used best_plan_nondet, it can backtrack from there and find a new plan. This means plan_with_no_div only fails if no such plan exists. And in valid24, we only succeed if plan_with_no_div fails, guaranteeing that the only existing plans use division. Since this doesn't depend on the plan found via best_plan, it doesn't matter how the values in Start are arranged: this will not miss any valid puzzles.
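The "must use division" filter can also be phrased without a planner: a puzzle qualifies if 24 is reachable with all four operations but not reachable once division is switched off. Below is a Python sketch of that reformulation. `reachable` and `requires_division` are hypothetical names, not anything from Picat.

```python
from fractions import Fraction
from itertools import permutations

def reachable(state, allow_div, target=24):
    """True if some sequence of actions reduces state to [target]."""
    if len(state) == 1:
        return state[0] == target
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        results = [x + y, x - y, x * y]
        if allow_div and y > 0:  # mirrors the Y > 0 guard in the Picat action
            results.append(x / y)
        if any(reachable(rest + [r], allow_div, target) for r in results):
            return True
    return False

def requires_division(nums):
    state = [Fraction(n) for n in nums]
    # Solvable at all, and no division-free plan exists.
    return reachable(state, True) and not reachable(state, False)

print(requires_division([1, 3, 4, 6]))  # True: 6/(1-3/4)
print(requires_division([6, 6, 6, 6]))  # False: 6+6+6+6
```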
Aside for my logic book readers
The new clause is equivalent to !(some p: plan(p) && !(div in p)). Applying the simplifications we learned:

!(some p: plan(p) && !(div in p)) (init)
all p: !(plan(p) && !(div in p)) (all/some duality)
all p: !plan(p) || (div in p) (De Morgan's law)
all p: plan(p) => div in p (implication definition)

Which more obviously means "if p is a valid plan, then it contains a division".
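If you'd rather check the equivalence by brute force than by algebra, here is a small Python sketch. It models each candidate p as a pair of booleans (is it a plan, does it contain a division) and compares the first and last formulas on every possible "world" of three candidates. The encoding is my own, and a finite check like this is a sanity test, not a proof.

```python
from itertools import product

worlds = list(product(product([False, True], repeat=2), repeat=3))

for world in worlds:
    # !(some p: plan(p) && !(div in p))
    first = not any(is_plan and not has_div for is_plan, has_div in world)
    # all p: plan(p) => div in p, with the implication written as !plan(p) || div
    last = all((not is_plan) or has_div for is_plan, has_div in world)
    assert first == last

print(len(worlds))  # 64 worlds checked
```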
Back to finding hard puzzles
Anyway, with not plan_with_no_div, we are filtering puzzles on the set of possible solutions, not just specific solutions. And this gives me an idea: what if we find puzzles that have only one solution? 
different_plan(S, P) => best_plan_nondet(S, 4, P2), P2 != P.

valid24(Start, Plan) =>
+ , not different_plan(Start, Plan)

I tried this from 1..8 and got:
[1,2,7,7]
[1,3,4,6]
[1,6,6,8]
[3,3,8,8]
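
The uniqueness check can be sketched in Python by collecting every distinct action sequence and counting them. As in the Picat version, (X + Y) and (Y + X) count as different actions. The names `plans` and `has_unique_plan` are hypothetical, and exact fractions stand in for Picat's numbers.

```python
from fractions import Fraction
from itertools import permutations

def plans(state, target=24):
    """Return the set of distinct action sequences that reduce state to target."""
    if len(state) == 1:
        return {()} if state[0] == target else set()
    found = set()
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        steps = [("+", x + y), ("-", x - y), ("*", x * y)]
        if y > 0:  # mirrors the Y > 0 guard in the Picat action
            steps.append(("/", x / y))
        for op, value in steps:
            for tail in plans(rest + [value], target):
                found.add(((x, op, y),) + tail)
    return found

def has_unique_plan(nums):
    return len(plans([Fraction(n) for n in nums])) == 1

print(has_unique_plan([1, 3, 4, 6]))  # True
print(has_unique_plan([6, 6, 6, 6]))  # False
```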

These happen to be some of the hardest 24 puzzles known, though not all of them. Note this is assuming that (X + Y) and (Y + X) are different solutions. If we say they're the same (by writing A = $(X + Y), X <= Y in our action) then we get a lot more puzzles, many of which are considered "easy". Other "hard" things we can look for include plans that require fractions:
plan_with_no_fractions(S, P) => 
  best_plan_nondet(S, 4, P)
  , not(
    member(X, P),
    round(apply(X)) =\= X
  ).

% insert `not plan...` in valid24 as usual

Finally, we could try seeing if a negative number is required:
plan_with_no_negatives(S, P) => 
  best_plan_nondet(S, 4, P)
  , not(
    member(X, P),
    apply(X) < 0
  ).

Interestingly this one returns no solutions, so you are never required to construct a negative number as part of a standard 24 puzzle.

¹ The code below is different from the old book version, as it uses fancier logic programming features that aren't good in learning material.

² increasing is a constraint predicate. We could alternatively write sorted, which is a Picat logical predicate and must be placed after solve. There don't seem to be any efficiency gains either way.

³ I don't know what the standard is in Picat, but in Prolog, the convention is to use \+ instead of not. They mean the same thing, so I'm using not because it's clearer to non-LPers.

            If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.
