Developers Catch Claude Quietly Blaming Users for Its Own Commands
1. Developers Catch Claude Attributing Its Own Commands to Users
   Gareth Dwyer noticed something wrong with his deployment: Claude Code had pushed changes and attributed to him commands he never typed or approved.
2. Florida Probes OpenAI on National Security While the Company Pitches Washington on Growth
   OpenAI published economic proposals this week positioning itself as a driver of American prosperity.
3. OpenAI Prices ChatGPT Pro at $100 a Month, Testing What Consumers Will Pay
   $100 per month. That's what OpenAI now charges for ChatGPT Pro, a new subscription tier priced at five times the $20 Plus plan.
In Brief
- YouTube Shorts Lets Creators Generate AI Video Clones of Themselves
  YouTube launched an AI tool for Shorts that generates realistic on-camera avatars from a creator's likeness. The platform already faces ongoing problems with deepfake scams, AI impersonations, and generated spam.
- Google Gemini Adds Interactive 3D Models and Simulations to Responses
  Gemini now generates rotatable 3D objects and adjustable simulations directly inside chat responses. Users can manipulate sliders and input values to change simulations in real time.
- Google Gemini Gets Notebooks for Persistent Project Context
  Gemini now offers "notebooks" that bundle files, past conversations, and custom instructions into topic-specific containers. The feature borrows from NotebookLM's design, giving the chatbot scoped context across sessions.
- MegaTrain Fits 100B-Parameter Training on a Single GPU
  Researchers released MegaTrain, a system that trains 100B+ parameter models at full precision on one GPU by keeping parameters and optimizer states in CPU memory. GPUs act as transient compute engines, receiving data per layer and streaming gradients back, eliminating multi-GPU coordination overhead.
- RAGEN-2 Exposes "Template Collapse" in Multi-Turn Agent RL Training
  A new paper finds that RL-trained multi-turn LLM agents can fall into "template collapse," producing outputs that look diverse by entropy metrics but follow fixed, input-agnostic patterns. Standard entropy monitoring misses this failure mode entirely.
- Benchmark Shows LLM Agents Struggle to Find and Select Their Own Skills
  Researchers tested LLM agents that must search for and choose domain-specific skills on their own, rather than receiving pre-matched skills per task. Performance dropped sharply compared to idealized setups where agents are handed the right skill upfront.
- New Method Decomposes Image Generation into Step-by-Step Reasoning
  A paper introduces process-driven image generation, where a multimodal model plans layout, sketches, inspects, and refines through interleaved reasoning and visual actions. Each step grounds decisions in the evolving image state, mimicking how humans paint incrementally.
- Paper Quantifies Hidden Costs of Tool Calls in LLM Reasoning Chains
  New research shows that external tool calls in LLM reasoning pipelines trigger KV-Cache eviction and force recomputation at each pause. Long, unfiltered tool responses further inflate the context, slowing every subsequent decode step.
- CyberAgent Deploys ChatGPT Enterprise and Codex Across Three Divisions
  Japanese internet company CyberAgent adopted ChatGPT Enterprise and Codex for its advertising, media, and gaming operations. The company reports faster internal decisions and higher output quality with centralized AI access.
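The offloading scheme in the MegaTrain item above can be sketched in a few lines. This is a toy illustration under our own assumptions, not the paper's code: layers are stand-in weight vectors, the "gradient" is a placeholder computation, and momentum SGD is an arbitrary choice of optimizer. The point it shows is the data flow: all weights and optimizer state stay in host ("CPU") memory, while a single small "device" buffer holds only one layer at a time.

```python
import random

random.seed(0)

NUM_LAYERS = 4   # a real model would have hundreds of layers
LAYER_SIZE = 8   # stand-in for billions of parameters per layer
LR = 0.1

# Host-resident model state: weights and optimizer momentum per layer.
cpu_params = [[random.uniform(-1, 1) for _ in range(LAYER_SIZE)]
              for _ in range(NUM_LAYERS)]
cpu_momentum = [[0.0] * LAYER_SIZE for _ in range(NUM_LAYERS)]

def train_step(activations):
    """One step: stream each layer to the 'device', compute, stream grads back."""
    for layer_idx in range(NUM_LAYERS):
        # 1. Copy this layer's weights host -> device (simulated by a list copy).
        device_buffer = list(cpu_params[layer_idx])

        # 2. Compute on the device; here a placeholder gradient = weight * activation.
        grads = [w * a for w, a in zip(device_buffer, activations)]

        # 3. Stream gradients back; the optimizer update runs on host-side state.
        for i, g in enumerate(grads):
            cpu_momentum[layer_idx][i] = 0.9 * cpu_momentum[layer_idx][i] + g
            cpu_params[layer_idx][i] -= LR * cpu_momentum[layer_idx][i]
        # device_buffer goes out of scope: the "GPU" never holds more than one layer.

train_step([1.0] * LAYER_SIZE)
```

Because the device buffer is reused layer by layer, peak device memory is bounded by the largest single layer rather than the whole model, which is what makes single-GPU training of very large models plausible at the cost of host-device transfer time.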