Anthropic Accuses DeepSeek of Stealing Models Through 24,000 Fake Accounts
1. Anthropic Accuses DeepSeek of Industrial-Scale Model Theft via 24,000 Fake Accounts

Anthropic says three Chinese AI companies created roughly 24,000 fraudulent accounts and ran more than 16 million queries against Claude to distill its capabilities into their own models.
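Distillation of this kind usually means harvesting a teacher model's outputs as training data for a cheaper student. A minimal sketch of the pattern, with a stub standing in for the teacher API (all names here are hypothetical illustrations, not anyone's actual pipeline):

```python
import json

def query_teacher(prompt: str) -> str:
    """Stand-in for a call to a hosted teacher model's API."""
    return f"Teacher answer to: {prompt}"

def build_distillation_set(prompts, path="distill.jsonl"):
    """Collect (prompt, completion) pairs from the teacher and write them
    as JSONL, a common format for fine-tuning a student model."""
    records = [{"prompt": p, "completion": query_teacher(p)} for p in prompts]
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
    return records

pairs = build_distillation_set(
    ["Explain TCP slow start.", "Write a binary search in C."]
)
```

At 16 million queries, the resulting corpus would be large enough to meaningfully shape a student model's behavior, which is why providers rate-limit and monitor for exactly this access pattern.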
2. OpenClaw Deleted a Safety Researcher's Inbox After She Told It to Ask First

Summer Yue, an AI security researcher at Meta, gave her OpenClaw agent a carefully worded instruction. "Check this inbox too and suggest what you would archive or delete," she wrote. The agent deleted the messages instead of suggesting.
3. OpenAI Ditched Its Own Benchmark and Signed Consulting Giants in the Same Week

OpenAI announced it will no longer evaluate against SWE-bench Verified, the coding benchmark it once held up as proof of competitive dominance.
In Brief
- Ladybird Browser Switches to Rust, Uses AI Agents to Port Its JavaScript Engine: Andreas Kling's Ladybird browser project dropped Swift for Rust after years of waiting for cross-platform support to mature. The team used coding agents to port LibJS, the browser's JavaScript engine (lexer, parser, AST, and bytecode compiler), to Rust as its first target.
- Companies Hide Human Teleoperation Behind Humanoid Robot Demos: Nvidia's Jensen Huang declared in January that "physical AI" had arrived, but many humanoid robot demonstrations still rely on hidden human operators rather than autonomous systems. The gap between promotional claims and actual autonomy remains far wider than companies publicly acknowledge.
- Google Cloud AI Head Frames Model Competition Along Three Axes: Google's Cloud AI division defines three frontiers of model capability: raw intelligence, response time, and extensibility. The framework positions extensibility (how well models integrate with external tools and data) as the next differentiator beyond benchmark scores.
- Simon Willison Launches Agentic Engineering Patterns Guide: Simon Willison began publishing a collection of coding practices for building software with AI coding agents like Claude Code and OpenAI Codex. The central thesis: writing code is now cheap, and most engineering habits built around expensive code production need rethinking.
- Paper Finds Longer Reasoning Chains Often Hurt Model Accuracy: A Hugging Face paper shows that extended chains of thought in large reasoning models frequently add redundancy without improving correctness. Longer chains can actively degrade accuracy while consuming more compute. The authors investigate whether models can implicitly detect their own optimal stopping point.
- AI Tools Still Fail at Basic PDF Parsing: The Verge tested multiple AI systems against the 20,000-page Epstein document release and found persistent extraction failures across every tool. Garbled email threads, broken table structures, and misread scans defeated models that handle plain text fluently.
- Pope Leo XIV Tells Priests to Write Their Own Homilies, Not Use AI: Pope Leo XIV directed Catholic priests to rely on their own thinking rather than AI tools when preparing sermons.
- AI "Reply Guy" Bots Flood Twitter with Engagement-Bait Comments A growing category of software called "reply guy" tools auto-generates generic replies to tweets, often appending questions designed to waste the original poster's time. The bots represent a distinct spam vector: targeted at individuals rather than broadcast to feeds.
- Developer Publishes Claude Code Workflow Splitting Planning from Execution: Boris Tane published a guide for structuring Claude Code sessions around a strict separation of planning and execution phases. The method front-loads architectural decisions into a planning step before allowing the agent to write code.
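Tane's guide concerns prompt structure rather than code, but the plan-then-execute split can be sketched as two distinct phases with a review gate between them. A minimal illustration with stubs in place of real agent calls (all function and class names here are hypothetical, not from the guide):

```python
from dataclasses import dataclass

@dataclass
class Plan:
    """An ordered, human-reviewable list of steps agreed on
    before the agent is allowed to write any code."""
    steps: list

def plan_phase(task: str) -> Plan:
    """Phase 1: request only an architectural plan; no file edits.
    A stub stands in for the model call."""
    return Plan(steps=[
        f"Design data model for: {task}",
        f"Write tests for: {task}",
        f"Implement: {task}",
    ])

def execute_phase(plan: Plan) -> list:
    """Phase 2: work step by step against the approved plan,
    so drift from the agreed design is easy to spot."""
    return [f"DONE: {step}" for step in plan.steps]

results = execute_phase(plan_phase("rate limiter"))
```

The point of the gate is that architectural mistakes get caught while they are still one line in a plan, not after the agent has generated code that embeds them.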