Thanks for all the agents
Decade of agents, lots of models
Hi all,
The last few weeks have seen a ton of activity in the agent space. Good thing it's Thanksgiving this week so we can actually sit down and catch up on everything. Enjoy the turkey and all the quality links below.
The hot take making headlines is Andrej Karpathy's claim that AI agents will take a decade to work. If that's the case, "the year of AI agents" could turn into the decade of AI agents. I think that sounds right, though at the speed things are progressing we may not need the whole decade.
Several models have recently been released with an explicit focus on agent workloads. Anthropic released Opus 4.5, Google launched Gemini 3, and IBM delivered Granite 4.0 Nano. Smaller labs have also made waves: Moonshot dropped Kimi K2 Thinking, MiniMax launched M2, and Jan released Jan-v2-VL. The model landscape is much more favorable to agent applications now than it was only a few months ago.
Finally, a couple of AGT NYC updates. First, there are more events coming up, with the next one in early December around the AI Summit. Second, the AGT NYC site is currently being transformed into a proper hub of agent resources, with content from past newsletters and more. Stay tuned!
Events
- Dec 11: AGT NYC: Breakfast with Agents
- Dec 5: Agents in Healthcare: Where Automation Meets Clinical Reality
- Dec 8: ClickHouse x LibreChat: Agentic AI Meetup in NYC
- Dec 9: Agents & APIs NYC Developer Meetup
- Dec 10: NYC AI Builders Meetup: Building and Improving Agents with Arize AI & CrewAI
- Dec 10: The AI Summit New York
- Dec 12: Agentic AI Hackathon
- Dec 12: Agentic AI Meetup with Google Cloud
News
- OpenAI cofounder Andrej Karpathy says it will take a decade before AI agents actually work
- Companies Begin to See a Return on AI Agents
- AI agents still majorly struggle with real-world work
- Securing agentic commerce: helping AI Agents transact with Visa and Mastercard
- Microsoft Agent 365 lets businesses manage AI agents like they do people
- AI agents are getting a lobby group in DC
- Oracle unveils AI Agent Marketplace
- Introducing GitHub Agent HQ
- Amazon sues Perplexity over 'agentic' shopping tool
- China's Top Companies Focus on AI Agents as Next Battleground
- Microsoft researchers tried to manipulate AI agents
- Amazon Is Using Specialized AI Agents for Deep Bug Hunting
- Introducing Aardvark: OpenAI’s agentic security researcher
- Introducing OpenAI AgentKit
- SIMA 2: An Agent that Plays, Reasons, and Learns With You in Virtual 3D Worlds
- Microsoft Launches Magentic Marketplace for AI Agents
- Introducing Claude Agent Skills and Code on the web
- Introducing advanced tool use on the Claude Developer Platform
- Pioneering agentic autofill for credential access
A few themes emerge from these headlines: Big Tech companies are launching agent marketplaces and headquarters as new products, frontier labs continue to push into the application layer, Anthropic has almost completed its transformation of Claude Code into a general-purpose agent (while OpenAI isn't far behind), payments and authentication for agents are rapidly evolving, and the general feeling on agents is mixed.
Fundraising
- n8n ($180M) - workflow automation
- General Intuition ($134M) - agent worlds
- Wonderful ($100M) - customer service
- Parallel ($100M) - agent search
- Tenzai ($75M) - penetration testing
- Serval ($47M) - IT service management
- MAI ($25M) - performance marketing
- Dedalus Labs ($11M) - agent development
- Natural ($9.8M) - agent payments
- Anchor Browser ($6M) - browser for agents
- Bricklayer AI ($5M) - security operations
Over $500M was raised in just the recent rounds above! It's interesting to see a split between agent-specific products and vertical products with agents, with both attracting similar amounts of capital. Once the low-hanging fruit is picked off, I'm curious what comes next.
Articles
- You Should Write an Agent by Thomas Ptacek
- Building production-ready agentic systems by Shopify
- Agent Design Is Still Hard by Armin Ronacher
- The Messy World of “Deterministic Agents” by Erik Dunteman
- Agents 2.0: From Shallow Loops to Deep Agents by Philipp Schmid
- The state of AI in 2025: Agents, innovation, and transformation by McKinsey
- AI Agents Are Winning Hearts and Wallets by G2
- Context Engineering: The Foundation for Reliable AI Agents in The New Stack
- The agentic commerce opportunity by McKinsey
- Designing agentic loops by Simon Willison
- State of AI Report 2025 by Air Street Capital
Lots of good quick reads above. More and more practical and grounded views on agents are making their way around the industry. Hype is being replaced by acceptance of a longer road ahead, which I think is a good sign overall - we'll likely still be talking about agents a year from now.
Projects
- x402 - agent-native payments
- agent-lightning - RL for any agent
- Volcano SDK - TypeScript SDK for Multi-Provider AI Agents
- strix - agents for penetration testing
- rogue - agent testing made easy
- LangChain 1.0 - simpler, standardized, streamlined
- LangSmith No-code Agent Builder
Attention in the agent space has shifted from building yet another framework to working at higher abstraction levels. This could be a good development - the surviving frameworks are likely going to be the foundation of the next wave of agent projects.
Learning
- Kaggle: Introduction to Agents
- Kaggle: Prototype to Production
- How to Use Claude Code for Everyday Tasks
- How to Build Your Own Agentic AI System Using CrewAI
- How agents can use filesystems for context engineering
- OpenAI: Self-Evolving Agents
- Galileo: Mastering Multi-Agent Systems
- Anthropic: Building the future of agents with Claude (22:10)
- Anthropic: Building more effective AI agents (18:57)
- Building agents with Amazon Bedrock AgentCore (59:53)
- Reduce CAPTCHAs for AI agents browsing the web with Web Bot Auth (Preview) in Amazon Bedrock AgentCore Browser
Looks like the agent curriculum is starting to take shape. The basics are getting more polished while the advanced topics are getting more practical. It's probably a good time to start consolidating individual resources into a proper roadmap.
Papers
- Fundamentals of Building Autonomous LLM Agents
- Scaling Agent Learning via Experience Synthesis
- Solving a Million-Step LLM Task with Zero Errors
- Agent Learning via Early Experience
- AgentFold: Long-Horizon Web Agents with Proactive Context Management
- AgentEvolver: Towards Efficient Self-Evolving Agent System
- The Era of Agentic Organization
- Multi-Agent System for Simulating and Analyzing Marketing and Consumer Behavior
- Demand, Supply, and Market Design with AI Agents
- AI Agents for Economic Research
- Building the Web for Agents
- Agent-in-the-Loop
Lots of exciting research here, particularly the papers about solving long-horizon agent execution. Fixing one of the deeper problems of agent systems is important on its own, but combined with autonomy improvements and environments built for agents, it points to a really productive path ahead.
If you know anybody else who would be interested in AGT NYC, please refer them to agtnyc.com to sign up.
Cheers,
Ivan