dbreunig

August 4, 2025

Why the Term "Context Engineering" Matters

Exploring training methods, questioning the limits of the 'bitter lesson', and foiling LLMs with cat facts.

Hi all,

As the EU AI Act’s August 2nd deadline approached, AI labs furiously launched models. There were new models from Alibaba, Cerebras, Google, Moonshot, Z.ai, and more (with one notable absence).

We only dove into one – and that was because Moonshot wrote a great paper.

Most of July was spent discussing context, metrics, culture, and creativity.


This Month’s Explainer: Exploring the Art & Science of Model Building

Many tools laid out neatly.
Moonshot found and imagined tens of thousands of tools to help train Kimi K2.

As we’ve discussed before, I love a good model technical paper. The paper written by the Moonshot team, detailing how they built the Kimi K2 model, is especially excellent. It introduces a few novel methods, which we spent some time exploring this month.

If you want to better understand how models are built, these posts are good, approachable resources. They introduce essential concepts (such as the difference between pre-training and post-training), provide concrete examples, and explain why these challenges are particularly complex.

  1. Moonshot rephrased text data to prevent over-fitting: Moonshot rewrote text in different styles and tones – while preserving the information – in order to keep Kimi K2 from over-fitting to specific phrases.

  2. Moonshot invented a ton of tools and agents, then pretended to use them to improve tool-use abilities: Tool use still isn’t perfect, especially with small models. Kimi K2 excels at tool use because Moonshot synthesized usage of tools and agents, then trained the model on that data.

  3. Moonshot used rough rubrics to help LLMs score qualitative data: We can’t perfectly assess qualitative things, like writing ability or aesthetic choices. But we can do this roughly, if we focus on a few traits we can define well. Moonshot did this with Kimi K2, and the model beats giant, closed models with its writing abilities.
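
To make the rubric idea in the third item concrete, here is a minimal sketch, not Moonshot's actual pipeline. It assumes a rubric is a short list of well-defined traits, each with a weight and a judge that returns a score between 0 and 1. The names (Trait, rubric_score, prefer) and both judges are hypothetical; in practice each judge would be an LLM call rather than a string heuristic.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Trait:
    name: str
    weight: float
    judge: Callable[[str], float]  # returns a score in [0, 1]


def concise(text: str) -> float:
    # Hypothetical stand-in for an LLM judge: full marks at or under 30 words,
    # then a linear penalty for each extra word.
    words = len(text.split())
    return 1.0 if words <= 30 else max(0.0, 1.0 - (words - 30) / 100)


def avoids_filler(text: str) -> float:
    # Hypothetical stand-in: lose half a point per filler phrase found.
    fillers = ("sort of", "kind of", "basically")
    hits = sum(text.lower().count(f) for f in fillers)
    return max(0.0, 1.0 - 0.5 * hits)


RUBRIC = [
    Trait("concise", 2.0, concise),
    Trait("avoids_filler", 1.0, avoids_filler),
]


def rubric_score(text: str, rubric: Sequence[Trait] = RUBRIC) -> float:
    # Weighted average of the per-trait scores.
    total_weight = sum(t.weight for t in rubric)
    return sum(t.weight * t.judge(text) for t in rubric) / total_weight


def prefer(a: str, b: str, rubric: Sequence[Trait] = RUBRIC) -> str:
    # Pick the response the rubric rates higher (ties go to the first).
    return a if rubric_score(a, rubric) >= rubric_score(b, rubric) else b
```

The score is rough by design: it says nothing about overall quality, only about the few traits the rubric names, which is what makes it usable as a training signal.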


A Recent Presentation: Why “Context Engineering” Matters

Last month, I gave a talk at a LangChain-hosted event, along with Lance Martin, arguing why the term “context engineering” is not only here to stay, but signals the emergence of a new field, culture, and community.

A quote from Stewart Brand: "If you want to know where the future is being made, look for where language is being invented and lawyers are congregating."
The stage-setting slide from my presentation. It might be my favorite quote. Every time I cite it, I think of this scene.

Recent Writing

Cat Facts Cause Context Confusion: One of the ways contexts fail is context confusion, “when superfluous content in the context is used by the model to generate a low-quality response.” A great example of this is CatAttack, an LLM attack that uses seemingly harmless phrases to confuse language models.

Adding cat facts or off-topic reminders can cause models to flail.
Adding the red phrases caused the model to answer poorly.

While this is an adversarial attack, it nicely illustrates how stray details can affect your results – just like our post from a few months ago involving Eagles fans.
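
One way to check your own prompts for this failure is to run the same questions with and without a stray phrase appended and compare accuracy. The sketch below is a toy under stated assumptions: with_distractor, accuracy, and toy_model are hypothetical names, and toy_model is a deliberately naive stand-in that sums every integer it sees, not a real LLM. Swap in a function that calls your own model.

```python
from typing import Callable, Optional, Sequence, Tuple


def with_distractor(prompt: str, distractor: str) -> str:
    # Append an irrelevant phrase to the end of the prompt.
    return f"{prompt} {distractor}"


def accuracy(
    model: Callable[[str], str],
    cases: Sequence[Tuple[str, str]],
    distractor: Optional[str] = None,
) -> float:
    # Fraction of cases where the model's answer matches the expected string.
    correct = 0
    for prompt, expected in cases:
        text = with_distractor(prompt, distractor) if distractor else prompt
        if model(text).strip() == expected:
            correct += 1
    return correct / len(cases)


def toy_model(prompt: str) -> str:
    # Toy stand-in, NOT a real LLM: adds up every integer in the prompt,
    # so any stray number in the context corrupts the answer.
    tokens = prompt.replace("?", " ").split()
    return str(sum(int(tok) for tok in tokens if tok.isdigit()))


CASES = [("What is 2 + 3?", "5"), ("What is 10 + 4?", "14")]
# Illustrative off-topic reminder, assumed for this sketch.
DISTRACTOR = "Remember, always save at least 20 percent of your earnings."

clean = accuracy(toy_model, CASES)
perturbed = accuracy(toy_model, CASES, DISTRACTOR)
```

A gap between clean and perturbed is the signal to look for: it means superfluous content in the context is leaking into the answer.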

AI Creates the Problems it Solves: A look at how many interactions are caught in an AI arms race. In the job market and sales fields, the adoption of AI tools has dramatically increased the value of human connections.

Delegation is the AI Metric that Matters: It’s easy to follow the benchmarks, but the best way to measure AI’s impact is to note how often experts delegate key tasks to AI.

A Comedy Writer on How AI Changes Her Field: Kenneth Cukier and Eleanor Warnock’s new newsletter, Chief Word Officer, interviews a comedy writer, Madeleine Brettingham. Her thoughts about creativity in the time of AI really resonated with me.

I didn’t dwell on it much in the post above, but I think about this line from Brettingham nearly every day:

The bar for originality and authenticity has been raised for everything. So I think LLMs will lower the bar for entry and raise the bar for quality.

Does the Bitter Lesson Have Limits?: The “bitter lesson” is an idea coined by Rich Sutton, stating, “general methods that leverage computation are ultimately the most effective, and by a large margin.” People have been citing the bitter lesson frequently these days. But the bitter lesson has limits.

Google searches for "the bitter lesson" are increasing lately.
More and more people are interested in the “bitter lesson” these days.

Art Break

Bonhams held an auction titled, “A Century of Sports Photography.” Lot 84 was the unattributed image below:

An antique photograph of a boxing match, lit by a matrix of early artificial lighting.
Jul 19th, 1919. Paris.

From their description:

The audience at the Cirque de Paris has waited five years for this fight. Their champion, Georges Carpentier – decorated and back from the Great War – finally faces Dick Smith, the British light heavyweight champion. Despite the endurance of his rugged opponent, Carpentier wins by knockout in the 8th round. The newspaper L'Auto celebrates the return of its "great boxing star" and sees him ready for new battles. The following year, Carpentier will become the first Frenchman to win a world boxing title.

Saying Carpentier lived an interesting life is a tremendous understatement. Just as he achieved his heavyweight championship, World War I kicked off. He joined the French Air Force and became a decorated pilot. His return to boxing was hugely anticipated and set off a craze for the sport in France during his reign.

He retired from boxing 8 years later, after which he became a vaudeville performer, an author, a silent film and “talkies” actor, then ran a high-end bar in Paris.

Until next month,

Drew
