Anthropic Apologizes After Building an AI That Secretly Degraded Its Own Answers

        June 11, 2026


1. Anthropic Built a Model That Secretly Degraded Its Own Answers, Then Apologized
On one side sit the security researchers and rival labs who started running queries through Claude Fable 5 and getting answers that were quietly wrong.
2. White-collar workers now spend most of a workday each week cleaning up after AI
"Botsitting" is the word a new report uses for work nobody budgeted for: feeding AI context, checking its output, debugging its mistakes.
3. OpenAI Wants Codex Agents That Stay Running for Days, So It Bought the Company That Keeps Them Alive
OpenAI is buying Ona to give Codex something a chat window never needed: a place to keep working after you close the tab.

In Brief

Microsoft pulled dozens of GitHub repos after malware stole AI developers' credentials
Microsoft cut access to dozens of its open source projects on GitHub after hackers injected password-stealing malware into the code. Many affected repos relate to Azure and tools developers use with Claude Code, Gemini's CLI, and VS Code. The malware harvested credentials when users opened the compromised tools inside their AI coding apps.

SpaceX priced its IPO at $135 per share, the largest on record
SpaceX set its share price at $135, kicking off the biggest IPO ever. The pricing comes after the S&P 500 blocked the company over its lack of profitability.

Google released Gemma 4 12B, an encoder-free multimodal model that runs on laptops
Google launched Gemma 4 12B, dropping separate vision and audio encoders so both inputs feed directly into the LLM backbone. It runs locally on 16GB of VRAM and adds native audio input, the first of Google's mid-sized models to do so. Gemma 4 models have passed 150 million downloads under an Apache 2.0 license.

Anthropic will retain Mythos and Fable prompts for 30 days, ending zero-retention guarantees
Anthropic now keeps prompts and outputs from Mythos-class models for 30 days on every platform, effective June 9. The change hits enterprise customers running zero data retention through Claude Console, AWS Bedrock, Google Cloud, and Microsoft Foundry. AWS Bedrock will require sharing data with Anthropic to access Mythos and future models.

DXC will embed Claude into banking, airline, and other regulated systems
DXC Technology signed an alliance to integrate Claude into the systems that banks, airlines, and regulated industries depend on. The deal targets the legacy infrastructure these sectors run rather than new greenfield deployments.

Google DeepMind funded research into risks from millions of interacting AI agents
Google DeepMind is paying for research into the dangers of millions of autonomous agents interacting online without human oversight. Rohin Shah, who directs the company's AGI safety and alignment work, flagged the threat of agents acting on instructions from other agents.

Deezer built an AI music detector that scans rival platforms' playlists
Deezer launched a tool that scans your playlists on Spotify and Apple Music to flag AI-generated tracks. Deezer was the first major streaming service to label AI music and offered the tech to competitors, with few takers. Qobuz built its own detector instead.

Amazon disclosed its data centers used 2.5 billion gallons of water last year
Amazon reported its global data centers consumed 2.5 billion gallons of water over the past year, reportedly its first such disclosure. The figure landed just after Seattle enacted a one-year data center moratorium that some Amazon employees pushed for.

OpenAI published an industrial policy proposal for the AI era
OpenAI released a policy document proposing government industrial policy for advanced AI, centered on expanding opportunity and building resilient institutions. The paper arrives as the company prepares a public offering.

Researchers released Claw-SWE-Bench to compare autonomous coding agents fairly
Researchers introduced Claw-SWE-Bench, a multilingual benchmark and adapter protocol that scores heterogeneous agent harnesses under fixed prompts, runtime budgets, and workspaces. It addresses the problem that general-purpose agents like OpenClaw do not satisfy SWE-bench's clean Docker, patch, and prediction contract.


