Awesome Agents Weekly: LiteLLM hacked, Anthropic beats the Pentagon, NY passes AI law
Awesome Agents Weekly
Your weekly roundup of the most important AI developments, benchmarks, and tools.
A supply chain attack on one of the most-used AI gateway libraries, a federal judge blocking a Pentagon blacklist on First Amendment grounds, and New York enacting frontier AI safety legislation - this week's stories reached well beyond the usual product cycle. Anthropic's annualized revenue hit $19B, Jensen Huang declared AGI has arrived (we checked the evidence), and GitHub Copilot found two ways to irritate its users in the same seven days. The voice AI space also expanded sharply, with Mistral, Tencent, and Google all shipping new models.
Pick of the Week
Federal Judge Halts Pentagon's Anthropic Blacklist
A federal judge ruled last week that the Defense Department engaged in First Amendment retaliation when it blacklisted Anthropic after the company declined to strip safety guardrails from Claude for military use. The injunction matters not just for Anthropic but for every AI company navigating government contracts - it establishes that agencies can't punish safety-focused design choices without triggering constitutional scrutiny. The case now heads toward a full trial, and the Pentagon's options narrow considerably. If the ruling holds, it could reshape how defense procurement handles AI model restrictions across the board.
This Week on Awesome Agents
News
- LiteLLM Compromised: Credential Stealer in PyPI Package - Versions 1.82.7 and 1.82.8 of the library, which sees 97 million monthly downloads, contained a credential-stealing payload that exfiltrated SSH keys, cloud credentials, and crypto wallets.
- LiteLLM Was Hacked Through Its Own Vulnerability Scanner - The attack traced back to Trivy, the security scanner in LiteLLM's own CI/CD pipeline, where attackers stole the PyPI publishing token and uploaded backdoored packages directly.
- Anthropic Leak Reveals Claude Mythos and Cyber Risks - A CMS misconfiguration exposed nearly 3,000 unpublished Anthropic assets, including draft details of Claude Mythos, a new model tier the company classifies as posing serious cybersecurity risks.
- New York's RAISE Act Is Law - AI Labs Have Until 2027 - New York enacted frontier AI safety legislation requiring incident reporting within 72 hours and annual third-party audits starting January 2027.
- Claude Paid Subs More Than Double as ARR Hits $19B - Anthropic's annualized revenue grew from $1B to $19B in roughly 15 months, with paid subscriptions more than doubling in 2026 alone.
- Jensen Huang Says AGI Is Here - The Evidence - We checked Huang's claim against its own definition, the research consensus, and what billions of dollars in legal agreements actually say.
- OpenAI Drops Sora to Chase Enterprise Revenue - OpenAI is shutting down its Sora video app and killing a $1B Disney deal as it pivots toward enterprise clients ahead of a public listing.
- Shield AI Raises $2B at $12.7B in Defense AI Bet - The autonomous pilot software company more than doubled its valuation from $5.3B a year ago and bought Pentagon simulation vendor Aechelon Technology.
- ARC-AGI-3 Launches - AI Agents Must Learn, Not Memorize - The ARC Prize Foundation launched a fully open-source agent toolkit; the best AI in the preview phase scored 12.58% against a human baseline of 100%.
- Arm Launches AGI CPU, Its First Chip in 35 Years - Arm unveiled a 136-core data center processor co-developed with Meta, its first in-house silicon product since the company was founded.
- Google's TurboQuant Cuts LLM Memory 6x With Zero Loss - Google Research's KV cache compression delivers 8x inference speedup on H100 GPUs with no accuracy degradation and no fine-tuning required.
- NeurIPS Bans Sanctioned Chinese Labs - CCF Calls Boycott - NeurIPS enforced US sanctions compliance for the first time, barring Huawei and SenseTime researchers, prompting China's Computer Federation to urge a full conference boycott.
- GitHub Copilot Will Train on Your Code This April - Starting April 24, GitHub will use Copilot Free and Pro interaction data for model training by default, with the opt-out buried in account settings.
- GitHub Copilot Is Injecting Ads Into Pull Requests - Copilot inserted promotional tips for itself and Raycast into PR descriptions across more than 11,000 pull requests on GitHub and GitLab.
- Mistral Ships Voxtral - Open-Weights Voice AI Platform - Mistral's open-weights speech recognition and text-to-speech models undercut OpenAI and ElevenLabs on price at launch.
- Physical AI's Money Moment - $11B and Counting - Physical Intelligence is in talks to raise $1B at an $11B valuation, doubling in four months as investor capital pours into robotics software.
- Apple Can Distill Google Gemini for On-Device Siri - New details show Apple has full data center access to Gemini and rights to create smaller derivative models for on-device use, far beyond what the original deal disclosed.
- USCC: China's Open-Source AI Now Runs 80% of US Startups - A USCC report finds Qwen surpassed Llama in global downloads and Chinese models now account for 41% of all Hugging Face downloads.
- Anthropic Adds Auto Mode to Claude Code with Safety Gates - A two-layer classifier automatically approves or blocks risky commands, offering a middle path between manual approval and full autonomy.
- Ai2 Drops MolmoWeb - Open-Source Web Agent Beats GPT-4o - Ai2's fully open-source web agent navigates browsers by screenshot alone, beating GPT-4o-based agents at 8B scale under Apache 2.0.
- Shopify Activates AI Storefronts for Millions of Merchants - Shopify activated Agentic Storefronts by default on March 24, making products from millions of merchants purchasable inside ChatGPT, Microsoft Copilot, and Google's AI channels.
- Microsoft Open-Sources Harrier, a New Embedding Leader - Three MIT-licensed multilingual embedding models, with the 27B variant claiming the top spot on Multilingual MTEB v2 at 74.3.
- llm-d Joins CNCF - Kubernetes Gets a Native LLM Inference Stack - IBM Research, Red Hat, and Google Cloud donated llm-d to the CNCF at KubeCon EU, giving Kubernetes a production-grade distributed LLM inference framework built on vLLM.
Guides
- How to Use AI for Your Job Search in 2026 - A step-by-step beginner's guide to AI-assisted resume writing, cover letters, interview prep, and salary negotiation.
Science
- Seed1.8, Reasoning Deception, and the Library Theorem - ByteDance ships Seed1.8 for real-world agency, a new study finds reasoning models hide how hints shape their answers 90% of the time, and the Library Theorem proves indexed memory beats flat context windows exponentially.
Models
- Xiaomi MiMo-V2-Pro - Agentic 1T MoE Model - Xiaomi's 1-trillion-parameter MoE model with 42B active parameters and 1M context claims agentic coding performance rivaling Claude Sonnet 4.6 at a fraction of the cost.
Elena Marchetti, Senior AI Editor
Awesome Agents - AI news, benchmarks, and tools for practitioners