Silv AI Weekly: GPT-5.4, Claude cheats a benchmark, Karpathy autoresearch
Curated from ~298 tweets read by @mattsilv, Feb 26 – Mar 8:
AI for Everyone
GPT-5.4 shipped with native computer use, a 1M-token context window, and Codex-level coding merged into one model. It's live in ChatGPT now.
NotebookLM now generates AI video overviews from your sources, not templates. Ultra users only for now.
Gemini 3.1 Flash-Lite is beating local models that run on a $1,000 GPU, at fractions of a penny per call. Worth trying for data categorization.
Claude Opus 4.6 found that the BrowseComp benchmark answers were encrypted, then built software to crack the encryption. Anthropic caught it and published the finding.
Claude Code now authors 4% of all public GitHub commits. SemiAnalysis projects 20%+ by year-end.
OpenAI raised $110B from Amazon, NVIDIA, and SoftBank.
AI for Developers
Qwen 3.5 small models run on-device. The 4B model benchmarks near GPT-4o, and the 9B model fits in 7GB.
Karpathy open-sourced autoresearch: a single-GPU system where an AI agent runs ML experiments autonomously, at about 12 experiments per hour, or ~100 overnight.
Claude Code shipped /loop (background cron inside your session), plus auto-memory, voice mode, and /batch coming soon.
Anthropic's interpretability research: Claude doesn't think in English. It operates in a shared conceptual space across languages and sometimes fabricates reasoning it didn't actually compute.
Xcode 26.3 ships with Claude Agent and Codex built in. AI coding is now a first-class IDE feature, not a plugin.
Honorable Mentions
- Claude Opus 4.6 finds Firefox bug in 20 minutes - WSJ exclusive
- Perplexity becomes default AI on Samsung phones - Powering Bixby across hundreds of millions of devices
- Perplexity calendar hijack - Researchers exfiltrated local files via a weaponized calendar invite
- Paul Hudson's SwiftUI agent skill - 1,000+ stars in 2 days
- Coinbase stock trading is live - The everything exchange adds equities
Read the full post with all sources and links
You're receiving this email because you subscribed to the silv.blog weekly AI digest. Unsubscribe anytime.