Evaluating LLMs on text adventures, what overproduction does to waiting
You get to watch me spend money on API credits as we learn which LLMs can play text adventures. We also get a striking insight into why overproduction is such an insidious waste.
Hello! I hope your week is going well.
New articles
Evaluating LLMs Playing Text Adventures
We taught the LLM to play text adventures (although we never said it learned) and we were curious how well different models performed compared to each other. You can laugh as my wallet ends up $50 lighter on API credits, but damn it if we won't find out.
Full article (8–15 minute read): Evaluating LLMs Playing Text Adventures
Flashcard of the week
Waste is effort that does not benefit customers. The classic example is the way cars were built pre-Ford: they were assembled in place, with workers fetching materials from storage and bringing them to the car to assemble it. The act of moving materials from storage to the car is wasteful; it costs effort and it does not benefit the customer at all – the car is the same whether the material was transported to it or magically appeared near it.
Ford came up with a better way to do it. The moving assembly line is the idea that if the car sits on a conveyor belt and moves through part storage, the workers will never have to move parts from storage to car, because the car moves to storage instead. (By the way, I think a lot of people don't appreciate that this is the real gain of the moving assembly line. It's not about division of labour or specialisation of workers, although that tended to come along with it.)
This illustrates a way of getting rid of some of the waste commonly called transportation. Other forms of waste include
- defects (if the thing is made faulty, it will have to be fixed and this costs effort that does not benefit the customer)
- extra processing (gold-plating things that do not need gold plating is effort that does not benefit the customer)
- inventory (managing stacks of things costs effort but does not benefit the customer)
- miscommunication (leads to defects or extra processing or uneconomical motion etc.)
- excess information (having to parse and digest stuff that is irrelevant costs effort but does not benefit the customer)
- waiting (standing around until something happens is perhaps not effortful, but the person doing so is still being paid, which costs the customer without benefiting them)
But the mother of all wastes, according to Taiichi Ohno, was overproduction: making more of the thing than the customer had ordered. This is incredibly common, and thus our flashcard.
What does overproduction do to the waste of waiting?
We can start to guess at this. Overproduction easily happens when we would otherwise have workers or equipment standing idle. Some manager wants to get their money's worth out of their capital. But what happens next?
- We get more inventory.
- We risk introducing defects, because we are making products without putting them in front of customers, so faults can go unnoticed for longer.
- We will add to our transportation burdens to get the products out of the way.
Put simply,
Overproduction converts waiting into other wastes.
The insidious part about this is that waiting is a fairly benign form of waste. It doesn't interact negatively with anything else, it just costs a little money, is all. But waiting is a very visible form of waste, whereas the others are not. So it's easy to think we are optimising and being efficient when we are in fact converting a nice waste (waiting) into expensive-but-hidden ones (inventory, defects, transportation, etc.).
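To put rough numbers on this, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the simulate function, the demand of 80 units per week, the capacity of 100 units per week, and the one worker-hour per unit. It compares building to order against always running at full capacity, and counts idle hours (the visible waste) and unit-weeks of unsold stock (the hidden one).

```python
def simulate(weeks, demand_per_week, capacity_per_week, hours_per_unit, overproduce):
    """Toy model (hypothetical): return (idle_hours, inventory_unit_weeks).

    idle_hours           -- worker-hours spent waiting (visible waste)
    inventory_unit_weeks -- unsold units summed over week ends (hidden waste)
    """
    inventory = 0
    idle_hours = 0.0
    inventory_unit_weeks = 0
    for _ in range(weeks):
        if overproduce:
            # Keep everyone busy: always build at full capacity.
            produced = capacity_per_week
        else:
            # Build only what customers will take this week.
            produced = max(0, min(capacity_per_week, demand_per_week - inventory))
        # Unused capacity shows up as people standing around.
        idle_hours += (capacity_per_week - produced) * hours_per_unit
        inventory += produced
        sold = min(inventory, demand_per_week)
        inventory -= sold
        # Whatever is left at the end of the week has to be stored and managed.
        inventory_unit_weeks += inventory
    return idle_hours, inventory_unit_weeks


if __name__ == "__main__":
    # All numbers below are invented for the example.
    print("to order:     ", simulate(4, 80, 100, 1.0, overproduce=False))
    print("full capacity:", simulate(4, 80, 100, 1.0, overproduce=True))
```

With these made-up numbers, building to order leaves 80 idle hours and no stock after four weeks, while running flat out leaves zero idle hours and 200 unit-weeks of stock that has to be stored and moved. The sketch does not put a price on either waste; it only shows the waiting disappearing from view while inventory takes its place.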
Premium newsletter
A week ago the latest premium newsletter went out! It contained
- four great links;
- a description of a new board game that probably sucks;
- brief tips/reviews on subscriptions that are actually worth it;
- a long, wordy review of Gene Wolfe's Book of the New Sun; and
- a disappointed review of Fatherhood: A History of Love and Power.
Even if you're not interested in the above, or in any of the past issues (all of which you would get access to), maybe you think $2 per month (cancelable any time, no questions asked) is a fair sacrifice to support this newsletter and blog.
To upgrade, click the subscription link at the top of this newsletter and fill in your email again.
Your opinions
I cannot improve without feedback. Reply to this email to share your thoughts on any of the topics above, or anything else!