AI/TLDR logo

AI/TLDR

AI/TLDR Daily Digest — May 24, 2026

2026-05-24


Project Glasswing initial update banner from Anthropic
SECURITY   MAJOR 2026-05-22

Anthropic's Project Glasswing — ~50 Partners, 10,000+ Vulnerabilities Found, 90.6% True-Positive Rate

Anthropic's one-month report on its ~50-partner program to find software flaws before AI models can exploit them.

What is it?
Project Glasswing is Anthropic's coordinated effort with about 50 partners to use its Claude Mythos Preview model to find and fix security flaws in critical software ahead of attackers. This is the program's first public results, covering its opening month.

How does it work?
Partners run Claude Mythos against their own codebases and a sweep of open-source projects. An automated scan of 1,000+ projects flagged 6,202 high- or critical-severity issues; independent assessment confirmed 1,587 as valid at a 90.6% true-positive rate.

Why does it matter?
It is a concrete data point on whether frontier models can do useful defensive security at scale — 530 bugs have been reported to maintainers and 75 patched so far. Claude Security also enters public beta alongside a public vulnerability dashboard.

Who is it for?
Security researchers and open-source maintainers who can now try the workflow through the new Claude Security public beta.

Anthropic DETAILS →
DeepSeek social card on the V4-Pro API pricing documentation page
MODEL   MAJOR 2026-05-22

DeepSeek Makes Its 75% V4-Pro Discount Permanent — $0.435 Input, $0.87 Output per Million Tokens

DeepSeek makes its 75%-off V4-Pro API pricing permanent instead of letting the promo expire on May 31.

What is it?
DeepSeek-V4-Pro is the company's flagship API model. A 75% promotional discount was set to expire on May 31, 2026 — DeepSeek has now made the cut permanent, locking the price at one-quarter of the original sticker rate.

How does it work?
Cache-miss input drops to $0.435 per million tokens and output to $0.87 — both a quarter of the original $1.74 and $3.48 list prices. Cache-hit input falls further to $0.003625 per million, a ~90% reduction on cached reads.

Why does it matter?
For agentic and long-context workloads that re-read large prompts repeatedly, the cache-hit price reshapes the cost math. Teams get a low-cost, OpenAI-compatible API with no contract lock-in.

Who is it for?
API developers, agent builders, and cost-sensitive teams running high-volume inference.

DeepSeek DETAILS →
Cohere Command A+ announcement banner
MODEL   MAJOR 2026-05-20

Cohere Command A+ — 218B Sparse MoE, Apache 2.0, Runs on Two H100s

An open-weight 218B MoE that runs agentic, multimodal, multilingual workloads on as few as two H100 GPUs.

What is it?
Command A+ is Cohere's new open-weight model under an Apache 2.0 license, folding four earlier Command models — base, reasoning, vision, and translate — into a single mixture-of-experts model aimed at enterprise agents that organizations can self-host for data sovereignty.

How does it work?
218B total parameters but only 25B activate per token. A 4-bit W4A4 build fits on one B200 or two H100 GPUs with a 128K-token context window, and the model generates native citation grounding spans tying each claim back to its source.

Why does it matter?
Enterprises and public-sector teams that need data residency can run a capable agentic, vision-capable, 48-language model on-prem without per-token API fees. Apache 2.0 lets them modify and redeploy the weights freely.

Who is it for?
Enterprise and public-sector AI teams that need data residency, multimodal capability, and a permissive license.

Cohere DETAILS →
Introducing Managed Agents in the Gemini API — Google announcement key art
TOOL   MAJOR 2026-05-19

Google Ships Managed Agents in the Gemini API — One Call Spins Up an Ephemeral Linux Sandbox

A single Gemini API call now provisions an ephemeral Linux sandbox with an Antigravity agent inside it.

What is it?
Managed Agents is a new Gemini API capability that lets developers spin up production-grade AI agents without owning sandbox infrastructure — each call returns an agent running in an isolated, ephemeral Linux environment where it can reason, call tools, execute code, and browse the web.

How does it work?
Agents are defined with two markdown files: AGENTS.md for role and instructions, and SKILL.md files for individual capabilities. Sessions can be resumed across follow-up calls with preserved state.

Why does it matter?
Google now offers managed agent infrastructure as a Gemini API primitive, directly mirroring Anthropic's Claude Managed Agents. Defining behavior in markdown instead of orchestration code lowers the bar considerably.

Who is it for?
Gemini API developers and agent platform teams comparing managed-agent offerings across labs.

Google DETAILS →
Microsoft Security blog banner for RAMPART and Clarity agentic AI safety tools
SECURITY   MAJOR 2026-05-20

Microsoft Open-Sources RAMPART and Clarity — Pytest-Native Agent Red-Teaming + Pre-Code Design Review

Two open-source tools that bake agent red-teaming and design review into the development workflow.

What is it?
RAMPART is a pytest-native framework for writing repeatable safety and security tests against AI agents. Clarity is a planning assistant that interrogates a system design before any code is written, both MIT-licensed.

How does it work?
RAMPART is built on PyRIT — developers write standard pytest cases for threat scenarios and a thin adapter gates CI/CD pass/fail. Clarity runs as a desktop app or web UI with multiple AI "thinkers" probing the design from security, human-factors, and operational angles.

Why does it matter?
Agent safety testing has been largely ad hoc. Putting red-team checks into pytest and CI lets teams catch prompt-injection and tool-misuse regressions automatically on every change, while Clarity pushes failure analysis upstream where fixes are cheaper.

Who is it for?
AI agent developers and security teams that want to add repeatable, CI-integrated safety tests without building the framework themselves.

Microsoft DETAILS →
Close-up of the Spotify app icon on a smartphone screen.
ECOSYSTEM   MAJOR 2026-05-21

Spotify + UMG Sign Licensing Deal for AI-Generated Fan Covers and Remixes

Spotify Premium will let you generate AI covers and remixes of opt-in UMG artists' songs — and the artists get paid for it.

What is it?
Recorded-music and publishing licensing agreements between Spotify and Universal Music Group that authorize Spotify to build an AI tool letting Premium subscribers generate covers and remixes of participating UMG artists' songs — the first major-label deal of its kind.

How does it work?
Participation is opt-in for artists and songwriters. Generated covers and remixes will be exclusive to Premium subscribers as a paid add-on, with UMG rights holders earning revenue on top of standard streaming payouts.

Why does it matter?
It is a direct response to the Suno and Udio lawsuits — UMG is signing licensing deals upfront rather than chasing AI music tools through the courts. If Warner and Sony follow, sanctioned AI music creation becomes a real category inside streaming platforms.

Who is it for?
Spotify Premium subscribers and AI-music tool builders watching the licensing precedent set by this first major-label agreement.

Spotify DETAILS →
KanBots GitHub repository card
TOOL   NOTABLE 2026-05-22

KanBots — Open-Source Desktop App Runs Claude Code and Codex Agents in Parallel Across a Kanban Board

A local kanban board where every card is a Claude Code or Codex agent working in its own git worktree.

What is it?
KanBots is an open-source (MIT) desktop app for macOS, Linux, and Windows that turns a kanban board into a dispatcher for AI coding agents. Each card on the board becomes a task handled by a Claude Code or Codex CLI agent.

How does it work?
Each agent runs in its own git worktree so parallel tasks don't collide on the same files. The board updates live as work progresses, an autopilot mode splits work automatically, and cost analytics track spend per agent.

Why does it matter?
Coordinating several coding agents is fiddly without tooling. KanBots gives a single view to dispatch, watch, and review them while git worktrees keep each agent's changes isolated until you're ready to merge.

Who is it for?
Developers running multiple coding agents who want a visual control surface with per-agent cost tracking.

KanBots DETAILS →

All releases at ai-tldr.dev

Simple explanations • No jargon • Updated daily


Don't miss what's next. Subscribe to AI/TLDR: