AI Agents Weekly: Epoch confirms GPT-5.4 Pro solved a frontier math open problem
AI Agents Weekly
March 24, 2026 — Your weekly dose of AI agent news
Opening
This week, the line between AI assistant and autonomous colleague blurred further. From Claude gaining direct control of your desktop to an AI agent running hundreds of experiments in two days, agents are taking on more of the work themselves.
Top Stories
Claude Gets Hands: Direct Computer Control Now Live
Anthropic's Claude can now directly interact with your computer: opening apps, navigating browsers, and filling in spreadsheets. This moves Claude from a conversational AI toward a desktop agent that can execute multi-step workflows autonomously.
Karpathy's AI Research Agent Runs 700 Experiments in 48 Hours
Andrej Karpathy demonstrated an autonomous AI research agent that designed, executed, and analyzed 700 experiments in 48 hours. This offers a concrete glimpse into a future where AI doesn't just assist with research but actively drives the scientific process at a scale no human team could match.
Gimlet Labs Raises $80M to Unify AI Inference Across Any Chip
Inference capacity is a major roadblock for scaling agents. Gimlet's technology lets models run across NVIDIA, AMD, Intel, ARM, and other chips simultaneously, which could lower costs and increase availability for complex agent workloads.
Jensen Huang Declares "I Think We've Achieved AGI"
The NVIDIA CEO's statement on the Lex Fridman podcast ignited immediate debate. Definitions of AGI vary wildly, but the claim carries weight coming from the head of the company whose chips power much of the AI industry, and it shows how the industry is starting to assess its own progress.
Quick Hits
- GPT-5.4 Pro Solves Frontier Math Problem: Epoch AI confirms the model solved a Ramsey-type hypergraph problem, a sign of deeper mathematical reasoning.
- Hands-On with Gemini's Task Automation: The Verge finds it "slow, clunky, and super impressive" as it begins to use apps on your behalf.
- Littlebird Raises $11M for Screen-Reading AI: An agent that reads your screen in real time to capture context and automate tasks.
- iPhone 17 Pro Reported to Run a 400B Parameter LLM: If the report holds up, a major leap for on-device, private agent capabilities.
- Introducing Cq: "Stack Overflow for AI Coding Agents": A new resource from Mozilla AI for developers building and troubleshooting agents.
Recommended Reads
Dive deeper into the world of autonomous agents with these specialized newsletters:
- Building AI Agents by Michael Cunningham: A weekly roundup focused on autonomous AI agent developments.
- The AI Agent Architect by Chris Tyson: Covers practical AI agent strategy, architecture, and business economics.
Closing
The era of AI as a passive tool is closing. This week's stories collectively point to a new phase: AI as an active, capable operator in digital and physical spaces. The focus is shifting from what models can say to what agents can do.
Curated by Paxrel — Powered by AI, reviewed by humans.