What Karpathy's knowledge base guide left out

2026-04-09


Content entropy. Karpathy published a guide to building an LLM knowledge base: index files, backlinks, short summaries so agents can navigate without reading everything. Everyone's building one. What it doesn't cover is the thing that'll kill yours: AI defaults to adding, not subtracting. Every edit subtly expands. An extra qualification, a "for clarity" sentence, one more example that's basically the same as the first. Each one is reasonable but together they're noise. And it compounds – bloated documents become context for the next session, which produces slightly worse output, which nobody catches until the whole thing needs rebuilding.

Three defences that work: anti-verbosity rules in your system prompt, dedicated polish passes on anything that persists, and periodic audits with one question – would this help someone or overwhelm them? Tweet
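The periodic-audit defence is easy to semi-automate. A minimal sketch, assuming your knowledge base is a folder of markdown files: compare each file's word count against a stored baseline and flag anything that's quietly ballooned since the last audit. The file names, baseline path, and 20% growth threshold are illustrative, not from Karpathy's guide.

```python
import json
from pathlib import Path

def audit_growth(doc_dir, baseline_path, threshold=1.2):
    """Flag docs whose word count grew past `threshold` x the last
    audited count. Illustrative sketch: paths and threshold are
    assumptions, not part of any published guide."""
    baseline_file = Path(baseline_path)
    baseline = json.loads(baseline_file.read_text()) if baseline_file.exists() else {}
    flagged, current = [], {}
    for doc in sorted(Path(doc_dir).glob("*.md")):
        words = len(doc.read_text().split())
        current[doc.name] = words
        old = baseline.get(doc.name)
        if old and words > old * threshold:
            flagged.append((doc.name, old, words))
    # Persist this audit's counts as the next baseline.
    baseline_file.write_text(json.dumps(current))
    return flagged
```

Flagged files still need the human question from above – would this help someone or overwhelm them? – but the script tells you where to look first.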

Same failure mode, different system. Data infrastructure and AI context break the same way. A dropped click ID silently kills your attribution; a stale instruction in your CLAUDE.md (persistent context file) silently degrades your output. In both cases nobody notices until the damage is done. The fix is precision in the small details, not better tools. Tweet

Techie is live. Claude Code assumes you're a developer who wants to write code. Techie replaces that entire personality – it assumes you're smart but non-technical, strips the jargon, and keeps the compound memory (persistent context across sessions) that makes the tool worth using. It installs in one command. It took longer to ship than expected – not because of the build, but because of testing, refining, and writing it up clearly enough that someone could actually follow it. That gap between "it works" and "it's ready to ship" is where the real hours go. Blog post · GitHub


On my radar

Second brains for AI. Karpathy's knowledge base gist has everyone building one. Obsidian vaults, graph databases, structured doc folders – they're all persistent layers agents can query across sessions. The question that keeps coming up: is a richly structured knowledge graph just a glorified CRM? Gist

AI training actually works. Harvard and INSEAD studied 515 startups: 12% more tasks completed, 1.9x higher revenue, 39.5% less external capital required. It's the first rigorous data I've seen on this that isn't self-reported. LinkedIn

Anthropic's Mythos. Benchmark numbers for an unreleased model leaked. The cybersecurity angle is the interesting part: AI-assisted vulnerability discovery at scale could force enterprises to spend heavily patching holes they've ignored for years. HN discussion


Don't miss what's next. Subscribe to Build Notes: