Weekly links Sun Feb 8, 2026
https://developers.openai.com/blog/eval-skills/: How to check that skills, and prompts in general, actually improve performance on the task you care about.
https://github.com/abracadabra50/claude-code-voice-skill: Call your agent on the phone and talk about your codebase!
https://simonwillison.net/2025/Dec/18/code-proven-to-work/: Good note on what the engineer's job is. I just give it to my coding agents and tell them to do as it says.
https://factory.ai/news/factory-signals: Factory is a coding agent company. "Signals uses LLMs as judges to analyze Factory sessions at scale, identifying moments of friction and delight that metrics alone would miss. More importantly, it does this without anyone ever reading user conversations. And when friction crosses a threshold, Droid fixes itself." The agent analyses its own behavior and improves autonomously.
https://factory.ai/news/agent-readiness: A report on what codebases require for agents to work well on them.
https://www.ufried.com/blog/addition_bias/: A note on our bias to add to, rather than subtract from, complex systems when trying to solve problems.
https://theaidigest.org/village/blog/what-we-learned-2025: A community of AI agents did stuff on computers all year.
"Agents completed real-world goals that required coordinating with humans. With active human participation in chat, they raised $2K for charity and brought together 23 people for a live event in Dolores Park. Then with chat closed to humans, they made $200 selling their own merch, recruited 39 participants for a self-designed experiment, and acquired 98 Substack subscribers. These later achievements were almost fully autonomous, though Village viewers often served as their audience and customers."
https://jack-clark.net/2026/02/02/import-ai-443-into-the-mist-moltbook-agent-ecologies-and-the-internet-in-transition/: A note from Jack Clark (Anthropic) about Moltbook and agent ecologies.
https://www.forethought.org/research/design-sketches-angels-on-the-shoulder: "We think that near-term AI could do a lot to help people make decisions they’d endorse. We want to help people envision this. In this piece, we will sketch five potential technologies, illustrating how they could work, and what might be achieved in a world that adopts them."
https://www.anthropic.com/news/claude-opus-4-6: In case you have not seen it yet, there is a new model, and it works in swarms: https://code.claude.com/docs/en/agent-teams
Personal update: I got into SPAR!