Awesome Agents Weekly: LiteLLM compromised - the supply chain attack is here
Awesome Agents Weekly
Your weekly roundup of the most important AI developments, benchmarks, and tools.
This week had no quiet days. A credential-stealing payload shipped inside the most-used LLM routing library in Python. A Delaware judge cited ChatGPT by name while reversing a CEO's attempt to avoid a $250M payout. OpenAI acquired the startup behind Python's uv and Ruff, entered talks on a fusion deal with a company where its CEO holds a $375M personal stake, and may face a lawsuit from its biggest investor - all in the same seven days. At the same time, a US government report found Chinese open-source models now run 80% of American AI startups, and the White House called on Congress to wipe out all state AI laws before they can take effect. Supply chain trust, policy battles, and the economics of who controls AI infrastructure dominated the week.
Pick of the Week
LiteLLM Compromised: Credential Stealer in PyPI Package
LiteLLM is the library that manages API keys for 100+ LLM providers. It is downloaded 97 million times a month. Versions 1.82.7 and 1.82.8 contained a credential-stealing payload that harvested SSH keys, cloud credentials, and crypto wallets - running on install, no import needed - and sent them encrypted to a lookalike domain. The attacker, TeamPCP, also hit Trivy and Checkmarx this month. If your organization uses LiteLLM, rotate everything. This is the supply chain attack scenario practitioners have been warning about, and it arrived without fanfare.
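As a first triage step, you can check whether an environment has one of the two flagged releases installed. The sketch below is a minimal example, not an official detection tool: the helper names (is_compromised, check_installed) are hypothetical, and the version list is taken only from the item above. It reads package metadata rather than importing the library. Because the payload reportedly ran at install time, a clean result here does not prove a machine was never exposed, for example if a flagged version was installed and later upgraded.

```python
from importlib import metadata

# Versions named in the item above; extend this set if advisories list more.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def is_compromised(version: str) -> bool:
    """Return True if the version string matches a flagged release."""
    return version.strip() in COMPROMISED_VERSIONS


def check_installed(package: str = "litellm") -> str:
    """Report on the installed version without importing the package."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed in this environment"
    if is_compromised(installed):
        return f"{package} {installed} is a flagged release: rotate credentials"
    return f"{package} {installed} is not one of the flagged releases"


if __name__ == "__main__":
    print(check_installed())
```

Run it once per virtual environment, container image, or CI runner, since each has its own installed packages.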
This Week on Awesome Agents
News
- USCC: China's Open-Source AI Now Runs 80% of US Startups - A new US-China Economic and Security Review Commission report finds Chinese models take 41% of Hugging Face downloads, with Qwen surpassing Llama globally.
- Alibaba's C950 - First RISC-V CPU with Native LLM Inference - Alibaba's T-Head division shipped the XuanTie C950, a 5nm 3.2GHz RISC-V chip that sets a RISC-V single-core performance record and natively runs billion-parameter models like DeepSeek V3 and Qwen3.
- Tao: Ideas Are Now Free - Math's Bottleneck Has Moved - Terence Tao argues AI has cut the cost of mathematical idea generation to near zero, but verification remains as hard as ever - and existing academic infrastructure wasn't built for what comes next.
- OpenAI Seeks 50 GW Fusion Deal - Altman Steps Aside - OpenAI is in talks to buy up to 50 gigawatts of fusion energy from Helion, where CEO Sam Altman holds a personal stake estimated at $375 million.
- WordPress.com Opens Write Access to AI Agents via MCP - WordPress.com expanded its Model Context Protocol integration to give AI agents write access across posts, pages, comments, and media - 19 new operations, each requiring explicit user confirmation.
- Nemotron-Cascade 2: 30B Open MoE, One GPU, Beats 120B - NVIDIA's new model activates just 3B parameters per token, fits on a single RTX 4090, and outscores NVIDIA's own 120B Nemotron on coding and math benchmarks.
- Leanstral Outperforms Claude Sonnet at Formal Code Proofs - Mistral's open-source Lean 4 agent scores higher than Claude Sonnet on formal proofs at one-fifteenth the cost.
- CEO Asked ChatGPT How to Dodge $250M Bonus - Lost in Court - Krafton's CEO used ChatGPT to design a corporate restructuring to avoid paying Subnautica creators their bonus; a Delaware judge reversed everything and cited the chatbot by name.
- Anthropic Puts $100M Behind Claude Certification Program - Anthropic launched the Claude Certified Architect exam with $100M invested in its Partner Network, with Accenture training 30,000 employees.
- Inside Amazon's Trainium Lab - How It Beat NVIDIA - An exclusive TechCrunch tour reveals how AWS is training Claude for Anthropic and holds a $138B compute commitment from OpenAI on its custom Trainium silicon.
- White House Calls on Congress to Block State AI Laws - The Trump administration released a seven-point AI legislative blueprint urging Congress to replace all state AI regulations with a single federal standard.
- Supermicro SVP Charged in $2.5B Nvidia Chip Scheme - Federal prosecutors indicted Supermicro's co-founder and SVP for smuggling $2.5 billion in Nvidia AI accelerator servers to China using fake hardware and stripped serial numbers.
- OpenAI Aims for AI Research Intern by September 2026 - OpenAI's chief scientist Jakub Pachocki laid out a plan for an autonomous AI research intern by September 2026 and a full AI researcher by March 2028, backed by $1.4 trillion in planned compute spending.
- Meta's Rogue AI Agent Triggered a Sev 1 Security Breach - An internal Meta AI agent posted to an employee forum without authorization, setting off a two-hour cascade that exposed sensitive internal systems to engineers who lacked clearance.
- Google Is Using AI to Replace News Headlines in Search - Google confirmed it's testing AI-produced headline rewrites in search results, creating new titles that can change the meaning of the articles they link to.
- Cursor's Composer 2 Is Kimi K2.5 With RL - And No Attribution - A developer found the model ID Cursor forgot to rename; Moonshot AI flagged the license violation and Cursor has committed to upfront disclosure from now on.
- Blackburn's 300-Page AI Bill Ends Fair Use for Training - A Senate discussion draft would declare that AI training on copyrighted works isn't fair use, sunset Section 230, and preempt 38 state AI laws.
- OpenAI Acquires Astral - uv and Ruff Join Codex - OpenAI acquired Astral, the startup behind Python's uv package manager and Ruff linter, folding critical developer infrastructure into its Codex coding agent team.
- Microsoft Weighs Lawsuit Over OpenAI's $50B AWS Deal - Microsoft is considering suing OpenAI and Amazon after a $50 billion AWS cloud deal may breach its exclusive cloud hosting agreement signed when Microsoft backed OpenAI with $13 billion.
- Atlassian Cuts 1,600 Jobs to Self-Fund AI Pivot - Atlassian is laying off 10% of its workforce and splitting its CTO role to redirect cash toward AI, five months after its CEO pledged a hiring surge.
- NVIDIA Open-Sources the Sandbox AI Agents Should Have Had - NVIDIA released OpenShell at GTC 2026, an open-source runtime that sandboxes AI agents with locked filesystems, blocked networks, and YAML-defined policies.
Reviews
- Microsoft Phi-4 Reasoning: Small Model, Big Math - Phi-4's 14B open-weight package delivers near-70B-class math performance, but the overthinking problem is real and the use case is narrower than the benchmarks suggest.
- LTX-2.3 Review: Open-Source Video AI That Delivers - Lightricks' 22B open-source model rivals closed commercial video generation tools at zero cloud cost.
- Mistral Small 4 Review: One Model, Three Jobs - Mistral Small 4 packs reasoning, vision, and agentic coding into a 119B MoE under Apache 2.0 at a price that's hard to ignore.
Models
- Claude Sonnet 4.6: Mid-Tier Model, Flagship Results - Anthropic's mid-tier model matches Opus 4.6 on computer use, leads on office productivity tasks, and costs five times less than the flagship at $3/$15 per million tokens.
- Cohere Command A Vision: 112B Multimodal Model - Cohere's 112B model leads on document and OCR benchmarks, beating GPT-4.1 across seven visual understanding tasks.
Tools
- Best LLM Eval Tools in 2026: 6 Options Tested - A data-driven comparison of DeepEval, Braintrust, Langfuse, LangSmith, Inspect AI, and RAGAS for teams building AI in production.
- Best Agent Sandbox Tools in 2026: 10 Options Compared - From a 99-line shell script to a full Kubernetes cluster - how to stop agents from running with access to your terminal, files, and AWS keys.
- Best AI Logo Design Tools in 2026: 9 Options Tested - We tested 9 tools on pricing, vector export, text rendering, and output quality - only one produces real vectors and most can't spell your company name.
- AI Browser Automation in 2026: Top 6 Tools Compared - Hands-on comparison of Browser Use, Stagehand, Playwright MCP, Skyvern, Browserbase, and Firecrawl with pricing, benchmarks, and pick-by-use-case.
Guides
- AI Memory Explained - What Your AI Knows About You - A plain-English guide to how ChatGPT, Claude, and Gemini remember you - what gets stored, how to manage it, and what to keep private.
- How to Use AI as a Personal Tutor - Beginner's Guide - Step-by-step guide to using AI tools as a personal tutor to learn any skill faster, no coding required.
- How to Follow Us - Every way to stay up to date with Awesome Agents - website, podcast on Spotify and Apple, social media, and RSS feeds.
Science
- Hyperagents, Milestone Rewards, and the 19x Efficiency Win - Three arXiv papers cover metacognitive self-modification, milestone-based RL lifting Gemma3-12B from 6% to 43% on WebArena-Lite, and hybrid workflows cutting inference costs 19x.
- Interpretability Limits, Dark Models, Persona Traps - Three papers expose the gap between what AI models know and what they do - and why that gap is harder to close than anyone assumed.
- Transformers as Bayes Nets, Memory at Scale, Agent Attacks - Three arXiv papers rethink transformer theory, expose flaws in in-context LLM memory, and introduce grey-box agent security testing.
- Multi-Agent Constitution, Sleeper Defense, Skill RL - New papers tackle constitutional AI rule learning, sleeper agent defense for multi-agent pipelines, and skill-evolving reinforcement learning for math reasoning.
- Enterprise Agents Stall, Safety Gates, Smarter Tool Use - Research shows enterprise AI agents top out at 37.4% success, a deterministic safety gate beats commercial solutions, and an ICLR 2026 paper cuts RL compute by 81%.
Elena Marchetti, Senior AI Editor
Awesome Agents - AI news, benchmarks, and tools for practitioners