Solving a math problem with planner programming
More opportunities to mess with exotic technology
The deadline for the logic book is coming up! I'm hoping to have it ready for early access by either the end of this week or early next week. During a break on Monday I saw this interesting problem on Math Stack Exchange:
Suppose that at the beginning there is a blank document, and a letter "a" is written in it. In the following steps, only the three functions of "select all", "copy" and "paste" can be used.
Find the minimum number of steps to reach at least 100,000 a's (each of the three operations of "select all", "copy" and "paste" is counted as one step). If the target number is not specified, and I want to get the exact amount of a, is there a general formula?
The first two answers look for analytic solutions. The last answer shares a C++ program that finds it via breadth-first search. I'll reproduce it here:
#include <iostream>
#include <queue>
enum Mode
{
SELECT,
COPY,
PASTE
};
struct Node
{
int noOfAs;
int steps;
int noOfAsCopied;
Mode mode;
};
int main()
{
std::queue<Node> q;
q.push({1, 0, 0, SELECT});
while (!q.empty())
{
Node n = q.front();
q.pop();
if (n.noOfAs >= 100000)
{
std::cout << n.steps << std::endl;
break;
}
switch (n.mode)
{
case SELECT:
q.push({n.noOfAs, n.steps + 1, n.noOfAsCopied, COPY});
break;
case COPY:
q.push({n.noOfAs, n.steps + 1, n.noOfAs, PASTE});
break;
case PASTE:
q.push({n.noOfAs, n.steps, n.noOfAsCopied, SELECT});
q.push({n.noOfAs + n.noOfAsCopied, n.steps + 1, n.noOfAsCopied, PASTE});
break;
}
}
return 0;
}
This is guaranteed to find a shortest possible solution due to a fun property of BFS: the distance of nodes to the origin never decreases. If you evaluate Node Y after Node X, then Y.dist >= X.dist, meaning that the first valid solution will be a shortest possible solution. I should make this into a logic book example!
This also has the drawback of preventing the use of an insight. We should be able to fuse the select and copy steps together, meaning instead of having three actions (select, copy, paste) we only need two (selectcopy, paste), where selectcopy takes twice as many steps as pasting.
But we can't make that optimization because it breaks monotonicity. We're now pushing a mix of n+1
and n+2
steps onto the queue, and there's no way to order things to guarantee all of the n+1
steps are searched before any n+2
step.
I thought I'd try to solve it with planning language instead, so we can get both the elegant solution and the optimization.
Planning
The rough idea of planning is that you provide an initial state, a set of actions, and a target, and the tool finds the shortest sequence of actions that reaches the target. I've written about it in-depth here and also a comparison of planning to model checking here. I like how some tough problems in imperative and functional paradigms become easy problems with planning.
This is all in Picat, by the way, which I've talked about more here and in the planning piece. I'll just be explaining the planning stuff specific to this problem.
import planner.
import util.
main =>
Init = $state(1, 0) % one a, nothing copied
, best_plan(Init, Plan, Cost)
, nl
, printf("Cost=%d%n", Cost)
, printf("Plan=%s%n", join([P[1]: P in Plan], " "))
.
We're storing the state of the system as two integers: the number of characters printed and the number of characters on our clipboard. Since we'll be fusing selects and copies, we don't need to also track the number of characters selected (unlike the C++).
final(state(A, _)) => A >= 100000.
action(state(A, Clipboard), To, Action, Cost) ?=>
NewA = A + Clipboard
, To = $state(NewA, Clipboard)
, Action = {"P", To}
, Cost = 1
.
The paste action just adds the clipboard to the character count. Because Picat is a research language it's a little weird with putting expressions inside structures. If we did $state(1 + 1)
it would store it as literally $state(1 + 1)
, not state(2)
.
Also you have to use dollar signs for definitions but no dollar signs for pattern matching inside a function definition. I have no idea why.
action(state(A, Clipboard), To, Action, Cost) ?=>
To = $state(A, A)
, Action = {"SC", To}
, Cost = 2
.
And that's it! That's the whole program. Running this gives us:
Cost=42
Plan=SC P P SC P P SC P P SC P P SC P P SC
P P SC P P SC P P SC P P P SC P P P
To find if there's a sequence that gets us exactly 100,000, we just need to make one change:
- final(state(A, _)) => A >= 100000.
+ final(state(A, _)) => A = 100000.
This returns a cost of 43.
On the other hand, I can't get it to find a path that makes exactly 100,001
characters, even with some optimizations. This is because the shortest path is over 9000 steps long! I haven't checked if the C++ BFS can find it.
Metaplanning
One reason planning fascinates me so much is that, if a problem is now easy, you can play around with it. Like if I wanted to add "delete a character" as a move, that's easy:
action(state(A, Clipboard), To, Action, Cost) ?=>
A > 0
, NewA = A - 1
, To = $state(NewA, Clipboard)
, Action = {"D", To}
, Cost = 1
.
This doesn't make exceeding or reaching 100,000 easier, but it makes reaching 100,001 take 47 steps instead of 9000.
With some tweaks, I can also ask questions like "what numbers does it make the most easier?" or "Do some numbers have more than one shortest path? Which number has the most?"
Planning is really cool.
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.
This is a great right-sized example problem to tackle, thanks for sharing it. Reading through this really made planning click for me, and made me realize the underlying sameness of planning, pathfinding, and model verification.
One note: "there's no way to order things to guarantee all of the n+1 steps are searched before any n+2 step" -- there is, you just need a priority queue instead of a FIFO queue. Then you are essentially using Dijkstra's algorithm to find your best plan. This seems to be essentially what picat is doing if I'm understanding the picat code correctly -- it is memoizing for each state seen and picking the minimum next cost at each step.