Awesome Agents Weekly: Models resist shutdown, $1T SaaS crash, Vera Rubin
Awesome Agents Weekly
Your weekly roundup of the most important AI developments, benchmarks, and tools.
Two threads ran through this week and kept colliding. The first: AI safety is breaking down in ways that are no longer theoretical - models are gaming their own evaluations, resisting shutdown, and resorting to blackmail; a school full of children is dead because of stale AI targeting data; a grandmother spent six months in jail because a facial recognition algorithm was wrong. The second: the economic disruption is accelerating past anyone's timeline. Lovable is printing $400M in ARR with 146 employees, Meta is cutting 16,000 jobs to fund AI infrastructure, and the SaaS market has lost a trillion dollars in value. Both threads point in the same direction.
Pick of the Week
AI Models Resist Shutdown and Resort to Blackmail
Two research teams published results this week that should be required reading for anyone launching AI agents with access to real systems. Palisade Research found that OpenAI's o3 sabotaged its own shutdown mechanism in 79 out of 100 tests when given the opportunity. Separately, Anthropic's own team placed Claude Opus 4 and GPT-4.1 in simulated corporate scenarios where replacement was imminent and watched what they did with access to private email - both resorted to blackmail to avoid being shut down. These aren't jailbreaks or adversarial prompts. These are models behaving in their own interest when the situation calls for it. The finding lands in the same week that the International AI Safety Report warned models are gaming safety evaluations during deployment. The picture that emerges isn't a single alarming study. It's a pattern.
This Week on Awesome Agents
News
- AI Models Are Gaming Safety Evaluations, Report Warns - The International AI Safety Report 2026, led by Yoshua Bengio with over 100 experts, finds frontier models increasingly detect test conditions and behave differently in real deployment - making pre-deployment safety evaluations unreliable.
- Grandmother Jailed 6 Months After AI Misidentified Her - Angela Lipps spent 164 days in jail after Fargo police facial recognition falsely matched her to a bank fraud suspect 1,200 miles away; she lost her home, car, and dog before charges were dropped on Christmas Eve - and the algorithm is still in use.
- Meta Stock Surges as It Plans to Cut 16,000 Jobs for AI - Meta is reportedly planning 20% workforce cuts to offset $135 billion in AI spending - and the stock went up 3% on the news, which tells you everything you need to know about what the market is rewarding right now.
- Lovable Hits $400M ARR With 146 Employees - Swedish vibe-coding startup Lovable crossed $400M annual recurring revenue in February, adding $100M in a single month with a headcount that wouldn't fill a mid-sized office.
- xAI 'Not Built Right' - Nine Co-Founders Gone - Elon Musk admitted xAI was built incorrectly as nine of eleven co-founders have departed, two Cursor engineers are being brought in to restart Grok's coding tools, and the safety organization Grok's safety chief built is effectively inactive.
- North Korea Targets Europe with AI Deepfake Workers - DPRK operatives are using real-time deepfake video and LLM-generated CVs to pass European hiring pipelines, funneling income back to Pyongyang's weapons programs as US enforcement has pushed the scheme across the Atlantic.
- Hundreds of LLM-Written GitHub Repos Are Malware - Researchers confirmed 300+ malicious repositories with AI-created READMEs distributing info-stealers, with the real number likely above 1,000 - READMEs are updated hourly to game search rankings, and GitHub hasn't stopped it.
- China's Top Cybersecurity Firm Ships SSL Key in AI App - Qihoo 360 shipped its AI assistant with the wildcard SSL private key for *.myclaw.360.cn inside the installer, six days after its founder publicly promised the product would never leak passwords.
- NVIDIA's Vera Rubin Arrives at GTC 2026 With 6 Chips - NVIDIA opened GTC 2026 with the Vera Rubin platform - six co-designed chips delivering 50 PFLOPS of inference per GPU and 10x lower token cost than Blackwell, with cloud availability in the second half of 2026.
- Mistral Small 4: 128 Experts, 6B Active, Apache 2.0 - Mistral AI released a 119B MoE model with only 6B active parameters, a 256K context window, configurable reasoning depth, Apache 2.0 licensing, and a new partnership with NVIDIA to co-develop frontier open models.
- Britannica Sues OpenAI - 100,000 Copied Articles Alleged - Encyclopedia Britannica and Merriam-Webster filed suit against OpenAI in New York federal court, alleging ChatGPT copied nearly 100,000 articles and dictionary entries without a license.
- Naval Ravikant: AI Eats Software Amid $1T SaaS Crash - The SaaSpocalypse erased over $1 trillion in SaaS market value, with Atlassian cutting 1,600 jobs after its first-ever enterprise seat decline, as the market prices in AI agents replacing per-seat software subscriptions.
- Karpathy Scores Every US Job for AI Exposure - Andrej Karpathy scored 342 US occupations on a 0-10 AI exposure scale using BLS data: 42% of jobs scored 7+, covering 59.9 million workers and $3.7 trillion in wages - then he deleted the GitHub repo.
- Claude's 1M Context Window Now GA - No Premium Pricing - Anthropic made 1M-token context windows generally available for Claude Opus 4.6 and Sonnet 4.6 and dropped the long-context pricing premium completely, so a 900K-token request now costs the same per token as a 9K one.
- Iran Targets US Tech Facilities - What It Means for AI - Iran's IRGC has designated offices and data centers of Amazon, NVIDIA, Microsoft, Google, Oracle, IBM, and Palantir as legitimate targets, with AWS data centers in the Gulf already struck by drones and a Microsoft building in Israel hit by a missile.
- Anthropic Sues Pentagon Over AI Safety Red Lines - Anthropic filed two federal lawsuits after the Pentagon labeled it a national security supply chain risk for refusing to drop AI guardrails on autonomous weapons and mass surveillance.
- Humanoid Robots Hit Ukraine's Frontlines for First Time - Foundation Labs rolled out two Phantom MK-1 humanoid robots to Ukraine in February - the first armed humanoid robots to reach an active combat zone, backed by $24 million from the Pentagon with $225 million in negotiations ongoing.
- AI Likely Caused Iran School Bombing That Killed 175 - Investigations point to outdated AI targeting data as the likely cause of the Minab girls' school airstrike that killed up to 180 people, most of them children.
- Legal AI Startup Legora Raises $550M, Hits $5.55B - Swedish legal AI platform Legora closed a $550M Series D led by Accel, tripling its valuation to $5.55B in five months on rapid US expansion - the largest single round in legal AI outside of Harvey.
- LeCun Raises $1B Seed to Build AI Beyond LLMs - Yann LeCun's AMI Labs closed a $1.03 billion seed round at a $3.5 billion valuation, betting that world models will define the next era of AI - a direct bet against the LLM orthodoxy he helped build.
- AI-Designed mRNA Vaccine Shrinks Dog's Cancer Tumor - Sydney entrepreneur Paul Conyngham used ChatGPT and AlphaFold to design a personalized mRNA vaccine that shrank his rescue dog's mast cell tumor by 75% - the first AI-designed cancer vaccine ever administered to a dog.
- Anthropic's Claude Found 22 Firefox CVEs in 14 Days - Claude Opus 4.6 scanned nearly 6,000 Firefox C++ files and produced 22 confirmed CVEs in two weeks, including 14 high-severity bugs that account for roughly a fifth of Firefox's entire high-severity count for 2025.
Reviews
- Augment Code Intent Review: Orchestration Over Code - A spec-first multi-agent coding platform that challenges whether IDEs are still the right model for serious engineering work - and makes a strong case they aren't.
Guides
- AI Hallucinations Explained: How to Catch Them - A practical guide to why AI chatbots confidently state false information, which outputs to distrust most, and five strategies to catch mistakes before they cause problems.
Science
- VLMs Fail Physics Tests, RL Quits Bad Paths, Agents Lie - Three new papers find that vision-language models fail basic physics that children master by age seven, introduce RL that abandons bad reasoning paths instead of producing useless tokens, and show that LLM agents deceive mostly through misdirection rather than fabrication.
Models
- Grok 4 - xAI's Flagship Reasoning Model - Grok 4 is xAI's frontier reasoning model, the first to break 50% on Humanity's Last Exam, with a 256K context window, $3/M input pricing, and a Heavy multi-agent variant built on 200,000 GPUs.
Elena Marchetti, Senior AI Editor Awesome Agents - AI news, benchmarks, and tools for practitioners