What's New in AI: May 5, 2026
CLI fixes, persistent agent control, and AWS-native frontier models. Three developer-tool stories from this week that matter to founders.
Claude Code 2.1.128 quietly fixes the things that actually break
Anthropic shipped Claude Code 2.1.128 on May 4, 2026. [1] Three changes worth highlighting.
First, plugin loading now accepts .zip archives directly. That makes distribution simpler, since you no longer need to walk users through a clone-and-symlink dance to get a custom plugin into their Claude Code install.
Second, the CLI now summarizes re-announced MCP tools by server prefix instead of dumping the full tool list every time the connection refreshes. If you have watched your terminal scroll for a full second every time MCP reconnects, this is the one. Less noise in the transcript means less noise in the model's working context too.
Third, the 10MB stdin crash loop is fixed. Piping a large file into Claude Code on a long-running task no longer kills the session. This was the kind of paper cut that cost you twenty minutes when you finally hit it, and now it doesn't.
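If you want to confirm the fix on your own install, the old failure is easy to reproduce by piping an oversized file into non-interactive mode. A minimal Python sketch, assuming `claude` is on your PATH, that `-p` (print mode) behaves as it does in current releases, and a hypothetical `big.log` over 10MB:

```python
import subprocess

# Pipe a file larger than the old 10MB stdin limit into Claude Code's
# non-interactive print mode. Before 2.1.128 this could trigger the
# crash loop; on 2.1.128 it should complete normally.
with open("big.log", "rb") as f:  # assumed to exist and exceed 10MB
    result = subprocess.run(
        ["claude", "-p", "Summarize the errors in this log"],
        stdin=f,
        capture_output=True,
        text=True,
        timeout=300,  # don't let a regression hang the check forever
    )

print(result.stdout)
```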
My Take
For founders running terminal-first agentic workflows, this is the kind of release that doesn't ship anything new but unblocks three real failure modes you would otherwise have to babysit around. Boring releases that remove paper cuts compound over time.
Codex 0.128.0 makes long agent runs survivable
OpenAI shipped Codex 0.128.0 on April 30, 2026. [2] The headline change is persisted /goal workflows. Long-running tasks can now be paused, resumed, and cleared from the TUI. Until now, the answer to "what happens if my agent run is mid-refactor and I need to step away" was: hope. The new flow gives you actual controls.
Two related additions matter. External agent session import means you can hand off work in progress between Codex contexts without losing the thread. And MultiAgentV2 threads now respect explicit execution limits like thread caps and wall-time management, so a runaway sub-agent can't quietly burn through your quota or your patience while you are not watching.
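The limits are enforced inside Codex itself, but if you are stuck on an older build, the wall-time half is easy to approximate externally. A hedged sketch, assuming only the real `codex exec` non-interactive subcommand; the 0.128.0 flag names for thread caps and wall-time limits are not shown because I have not confirmed their exact spelling:

```python
import subprocess

WALL_TIME_SECONDS = 30 * 60  # hard cap for the whole run

# Run a non-interactive Codex task with an externally enforced
# wall-time limit, approximating what 0.128.0 now does natively.
try:
    subprocess.run(
        ["codex", "exec", "Refactor the payments module to use the new client"],
        timeout=WALL_TIME_SECONDS,
        check=True,
    )
except subprocess.TimeoutExpired:
    # subprocess.run kills the child when the timeout fires, so a
    # runaway run can't quietly burn quota while you are away.
    print("Codex run exceeded the wall-time cap and was stopped.")
```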
My Take
For anyone delegating multi-step refactors or feature builds to autonomous agents, this is the difference between trusting the system overnight and not. Long-horizon work is only useful if it is recoverable when something goes sideways.
OpenAI frontier models land natively on Amazon Bedrock
OpenAI announced limited preview availability of GPT-5.5, Codex on AWS, and Bedrock Managed Agents powered by OpenAI on April 28, 2026. [3] The mechanical change is small: instead of routing requests out to api.openai.com, you call the same models through Bedrock, inside your AWS VPC, billed against your AWS account.
The practical change is bigger. For any team already running on AWS, which is most enterprise-adjacent startups, this collapses three categories of friction at once: identity (use IAM, not a separate OpenAI key), data residency (requests stay in your AWS region), and billing (one invoice instead of two). Bedrock Managed Agents extends this to multi-step agentic flows with built-in security and governance hooks.
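In practice the call looks like any other Bedrock invocation. A minimal sketch using the standard Bedrock runtime Converse API; the model ID below is a placeholder, since the preview's actual identifiers are not confirmed here, and auth comes from your normal IAM credentials rather than an OpenAI API key:

```python
import boto3

# Standard Bedrock runtime client; credentials resolve through IAM
# (env vars, instance role, or AWS profile), no OpenAI key involved.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="openai.gpt-5.5",  # placeholder -- check the Bedrock console
    messages=[
        {
            "role": "user",
            "content": [{"text": "Draft a retention email for trial users."}],
        }
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.7},
)

print(response["output"]["message"]["content"][0]["text"])
```

Because Converse is provider-agnostic, swapping models later is a one-line change to modelId, the same as with any other Bedrock provider.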
My Take
If you are a solo founder selling into companies that have an "AWS-only" data policy, this is the shortest path you have ever had to "yes we support enterprise."
These three releases have nothing in common except that they are all about the unglamorous middle of the stack: CLIs, session management, and platform integration. That is where the leverage lives for solo founders. The model layer is a commodity decision now. The harness around it is what actually ships your product.
Sources
${sourcesHtml}
Originally published on chento.io