
Mitchell Toney

May 25, 2026

What's New in AI: May 24, 2026

Three layers of the agent stack moved from workaround to primitive in one week.

Vercel AI SDK 6 drops the agent loop into the SDK

Vercel released the AI SDK 6 line with a clean break from the prior orchestration model. The new useChat hook renders text, tool calls, tool results, reasoning, and source parts as interleaved message parts. On the server, the streamText result is returned to the client via toUIMessageStreamResponse(). Tools declared with tool() have their inputs validated against Zod schemas. The agent loop ships with first-class stop conditions: stepCountIs(N) and hasToolCall('finalize'). Provider adapters are unified across Anthropic, OpenAI, Google, Mistral, and xAI as first-party packages, with community adapters for Cohere, Groq, Fireworks, Together, OpenRouter, Bedrock, Vertex, and Azure OpenAI. [1]
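To make the stop-condition idea concrete, here is a dependency-free sketch of a loop that runs until any one of a list of predicates fires. The names stepCountIs and hasToolCall mirror the helpers named above, but everything in this block (Step, runAgentLoop, scriptedModel) is a hypothetical stand-in written for illustration, not the SDK's implementation, and the model call is a synchronous stub rather than a real provider request.

```typescript
// Illustrative sketch only: these are local re-implementations, not SDK code.
type Step = { text: string; toolCalls: string[] };
type StopCondition = (steps: Step[]) => boolean;

// Stop once the loop has taken at least `n` steps.
const stepCountIs = (n: number): StopCondition => (steps) => steps.length >= n;

// Stop once the most recent step called the named tool.
const hasToolCall = (toolName: string): StopCondition => (steps) =>
  steps.length > 0 &&
  steps[steps.length - 1].toolCalls.some((name) => name === toolName);

// Stand-in for one model call; a real loop would call a provider here.
type ModelStep = (history: Step[]) => Step;

// Run at least one step, then keep going until any stop condition is true.
function runAgentLoop(model: ModelStep, stopWhen: StopCondition[]): Step[] {
  const steps: Step[] = [];
  do {
    steps.push(model(steps));
  } while (!stopWhen.some((shouldStop) => shouldStop(steps)));
  return steps;
}

// Hypothetical scripted model: searches twice, then calls "finalize".
const scriptedModel: ModelStep = (history) =>
  history.length < 2
    ? { text: "searching", toolCalls: ["search"] }
    : { text: "done", toolCalls: ["finalize"] };

// Ends on the "finalize" tool call, well before the step cap.
const finalized = runAgentLoop(scriptedModel, [stepCountIs(10), hasToolCall("finalize")]);

// Ends on the step cap because "finalize" is never reached.
const capped = runAgentLoop(scriptedModel, [stepCountIs(2)]);
```

Pairing a step cap with a tool-call condition is the useful habit here: the tool call is the happy path, and the cap is the guard against a loop that never decides it is finished.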

Performance is the headline. Cold-start latency: 18ms vs LangChain TS at 23ms, a 22% cut. Warm streaming P50 first-token: 280ms vs 305ms, 8% faster. Tool-call round-trip: 12ms vs 17ms per call, 29% lower overhead. Client bundle: @ai-sdk/react at ~25KB vs langchain-react at ~80KB gzipped, a 69% reduction. Vercel claims a 10-20% monthly platform-cost reduction at the same request volume. [1]
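The percentages follow from the raw numbers if each one is read as a relative reduction against the LangChain TS figure. A quick check, using a helper name (percentReduction) that is mine and not from the release notes:

```typescript
// Relative reduction of `value` against `baseline`, as a rounded percentage.
function percentReduction(baseline: number, value: number): number {
  return Math.round(((baseline - value) / baseline) * 100);
}

// Figures quoted above; the baseline in each pair is the LangChain TS number.
const reductions = {
  coldStartMs: percentReduction(23, 18), // 22
  firstTokenMs: percentReduction(305, 280), // 8
  toolRoundTripMs: percentReduction(17, 12), // 29
  clientBundleKb: percentReduction(80, 25), // 69
};
```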

My Take

The unified provider adapter is the part that matters for solo founders. Swapping models has been a six-line change scattered across config, prompt builders, and response parsers. AI SDK 6 compresses that into a one-line provider swap. The agent loop primitives kill the case for layering LangChain on top of Vercel for streaming chat with tool use. If you're shipping an AI-powered web app on Next.js or Astro this quarter, this is the SDK upgrade.
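Here is roughly what a one-line swap looks like when every provider sits behind the same interface. This is a toy registry with stub providers; ChatModel, stubProvider, and resolveModel are hypothetical names for illustration and are not the SDK's API.

```typescript
// Illustrative sketch only: a shared interface that every provider implements.
interface ChatModel {
  id: string;
  complete(prompt: string): string;
}

type ProviderFactory = (modelName: string) => ChatModel;

// Stub factory standing in for a real provider adapter.
const stubProvider = (provider: string): ProviderFactory => (modelName) => ({
  id: `${provider}:${modelName}`,
  complete: (prompt) => `[${provider}:${modelName}] ${prompt}`,
});

const providers: Record<string, ProviderFactory> = {
  anthropic: stubProvider("anthropic"),
  google: stubProvider("google"),
};

// Resolve a "provider:model" string into a model handle.
function resolveModel(spec: string): ChatModel {
  const separator = spec.indexOf(":");
  if (separator === -1) {
    throw new Error(`Expected "provider:model", got "${spec}"`);
  }
  const factory = providers[spec.slice(0, separator)];
  if (factory === undefined) {
    throw new Error(`Unknown provider in "${spec}"`);
  }
  return factory(spec.slice(separator + 1));
}

// The swap: change this one string and the calling code stays the same.
const MODEL_SPEC = "google:gemini-3.5-flash";
const reply = resolveModel(MODEL_SPEC).complete("Summarize today's commits");
```

The point of the pattern is that prompt builders and response parsers depend on ChatModel, never on a specific vendor, so the model choice lives in exactly one place.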

Google I/O ships Gemini 3.5 Flash GA and the Managed Agents API

Gemini 3.5 Flash hit general availability under model ID gemini-3.5-flash, generating output at 289 tokens per second with an input price roughly 25% below Gemini 3.1 Pro. [2]

Pricing changed at the subscription tier too. AI Ultra dropped from $250/mo to $200/mo, a 20% cut, with 20× Pro tier limits, full Spark access, and Antigravity 2.0 quota priority. A new $100/mo developer entry tier ships with 5× Pro limits and 20 TB of Drive storage. [2]

The Managed Agents API is the developer-side release that pairs with the model drop. A single API call provisions a remote Linux execution environment so an agent can reason, run code in a sandbox, and browse the web without the developer managing infrastructure. [2]

My Take

The Managed Agents API is the news that should drive a roadmap conversation. Solo founders running agent loops have been wiring code execution sandboxes manually with E2B, Modal, or homegrown VMs. Google is now offering that as a one-call primitive, billed inside the existing Gemini API surface. That makes Gemini a serious alternative for agentic workloads where you need fast inference plus code execution from the same provider.

Antigravity 2.0 ships as Google's standalone desktop agent platform

Antigravity 2.0 left the VS Code fork era and shipped as a standalone application with native macOS, Linux, and Windows builds. The platform spans five surfaces: the desktop app (Editor and Agent Manager modes), a Go-based agy CLI replacing the legacy Gemini CLI, a Python/TypeScript/Go SDK, the Managed Agents API, and the Gemini Enterprise Agent Platform. [3]

The SDK exposes Tool.shell, Tool.code_edit, and Tool.web_search, plus custom system prompts and tool catalogs. The orchestrator agent decomposes a complex task at runtime, spawns specialized subagents, and dispatches them in parallel. Each subagent runs in an isolated context with its own tool access and working memory, and shares no state with the others by default. The orchestrator aggregates the results. [3]
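The decompose, dispatch, aggregate shape is easy to sketch without the SDK. Everything below is hypothetical (the type names, the toy split-on-"and" planner, the fake subagent); a real orchestrator would ask a model to plan and would run real tools. What it does show accurately is the isolation property: each subagent builds its own context object, and the orchestrator only sees returned results.

```typescript
// Illustrative sketch only: none of these names come from the Antigravity SDK.
type SubagentContext = { tools: string[]; memory: string[] };
type Subtask = { goal: string; tools: string[] };
type SubagentResult = { goal: string; context: SubagentContext };

// Toy decomposition rule; a real orchestrator would ask a model to plan.
function decompose(task: string): Subtask[] {
  return task.split(" and ").map((goal) => ({ goal: goal.trim(), tools: ["shell"] }));
}

// Each subagent gets a fresh context, so nothing is shared by default.
async function runSubagent(subtask: Subtask): Promise<SubagentResult> {
  const context: SubagentContext = { tools: subtask.tools.slice(), memory: [] };
  context.memory.push(`worked on: ${subtask.goal}`);
  return { goal: subtask.goal, context };
}

// Dispatch all subagents concurrently, then aggregate results in task order.
async function orchestrate(task: string): Promise<string[]> {
  const results = await Promise.all(decompose(task).map(runSubagent));
  return results.map((result) => `${result.goal}: ${result.context.memory.length} note(s)`);
}
```

Calling orchestrate("lint the repo and run the tests") resolves to one summary line per subtask. Promise.all preserves input order, so aggregation stays deterministic even when subagents finish out of order.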

The /schedule command is the part that's interesting outside the demo. It registers cron-style background tasks that keep running even when the desktop app is closed. Example: agy schedule "daily standup summary" --cron "0 9 * * *". Google calls out daily code reviews, weekly dependency audits, and hourly log analysis as the canonical use cases. [3]
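A standard cron expression has five whitespace-separated fields: minute, hour, day of month, month, and day of week. A daily 09:00 job is therefore written 0 9 * * *. The sketch below is a minimal shape check I wrote for illustration; parseCron is a hypothetical helper, not part of agy, and it does not validate value ranges. The weekly and hourly expressions are my own examples of the use cases listed above.

```typescript
// Illustrative sketch only: checks the five-field shape, not value ranges.
type CronFields = {
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
};

function parseCron(expression: string): CronFields {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`Expected 5 cron fields, got ${parts.length}: "${expression}"`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

const dailyAtNine = parseCron("0 9 * * *"); // every day at 09:00
const weeklyAudit = parseCron("0 9 * * 1"); // Mondays at 09:00
const hourlyLogs = parseCron("0 * * * *"); // top of every hour
```

A check like this is worth running before handing an expression to any scheduler, since a short expression with fewer than five fields is an easy copy-paste casualty.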

My Take

Antigravity is Google's answer to Claude Code at the agent platform level. The dynamic subagent spawning is the part to actually evaluate. Claude Code uses pre-defined subagents from your SDK files; Antigravity decomposes at runtime. Different tradeoff, different failure mode. The cron-scheduled background tasks are the win for solo founders who can't have a person running things 24/7. If you're already on Gemini, this is worth a real evaluation week. If you're on Claude, it's worth a side-by-side to see whether your bottleneck is agent decomposition or subagent quality.

Three releases. One SDK that closes the orchestration gap, one model and API pair that drops inference costs, and one standalone agent platform that schedules its own work. The pattern this week is the agent loop moving from a workaround stack to a first-class primitive at every layer.

Sources

${sourcesHtml}


Originally published on chento.io
