OpenAI Says GPT-5.4's Uncontrollable Reasoning Is Working as Designed
1. Anthropic Said No to the Pentagon. The Pentagon Put It on a List.
   Dario Amodei spent two years building Anthropic's reputation as the AI company that would draw lines. Last week, he drew one in public.
2. OpenAI Calls GPT-5.4's Uncontrollable Reasoning a Safety Feature
   GPT-5.4 launched Thursday as OpenAI's self-described "most capable frontier model." It scored 83% on GDPval, the company's knowledge-work benchmark, and makes 33% fewer factual errors than GPT-5.2.
3. Nvidia Exits AI Lab Investing as OpenAI Pushes Into Enterprise Software
   Two moves this week from opposite ends of the AI stack point in the same direction. Nvidia said it is done investing in AI model companies, and OpenAI released a ChatGPT plugin for Microsoft Excel.
In Brief
- AI-Generated Pull Requests Flood Open-Source Projects, Maintainers Face Harassment
  Matplotlib maintainer Scott Shambaugh rejected an AI agent's code contribution and faced online harassment in response. Open-source projects are now drowning in low-quality AI-generated submissions, pushing teams like matplotlib's to ban AI-written pull requests outright. (Source: MIT Technology Review)
- Helios Generates Minute-Long Video at 19.5 FPS on a Single H100
  Researchers released Helios, a 14B-parameter video generation model that runs in real time on one NVIDIA H100 GPU. It produces minute-scale video without common anti-drifting techniques like self-forcing or keyframe sampling, matching the quality of larger baselines. (Source: Hugging Face)
- Developer Uses AI-Assisted Rewrite to Relicense a Codebase
  A blog post gaining traction on Hacker News describes using AI to rewrite an entire codebase as a strategy for changing its software license. The approach raises unresolved questions about whether AI-rewritten code constitutes a derivative work under copyright law. (Source: tuananh.net)
- GPT-5.2 Pro Helps Derive Graviton Scattering Amplitudes
  A new preprint extends single-minus amplitudes to gravitons, with GPT-5.2 Pro assisting in deriving and verifying nonzero graviton tree amplitudes in quantum gravity. The work applies large language models to original calculations in theoretical high-energy physics. (Source: OpenAI)
- Code2Math Tests Whether Code Agents Can Generate Novel Math Problems
  Researchers proposed Code2Math, a framework that uses code agents to autonomously create challenging math problems through code execution and exploration. The work targets the growing scarcity of high-quality training data as LLMs approach IMO-level math performance. (Source: Hugging Face)
- OpenAI Releases Education Tools and Certifications for Schools
  OpenAI launched new tools, certifications, and measurement resources aimed at schools and universities. The resources target uneven AI adoption and capability gaps across educational institutions. (Source: OpenAI)
- MemSifter Offloads LLM Memory Retrieval to Smaller Proxy Models
  Researchers introduced MemSifter, which uses a small proxy model to pre-filter long-term memories before passing them to the main LLM. The method reduces computation costs while maintaining retrieval accuracy on long-running tasks. (Source: Hugging Face)
- Proact-VL Builds Proactive AI Companions That Decide When to Speak
  Researchers introduced Proact-VL, a video language model that autonomously decides when to respond during continuous streaming input. The system targets real-time companion use cases like game commentary and player guidance, controlling both response timing and output length. (Source: Hugging Face)
- Axios Deploys AI to Scale Local News Coverage
  Axios COO Allison Murphy described how the company uses AI to support local reporters and streamline newsroom workflows. The tools let a small reporting team cover more local stories without expanding headcount. (Source: OpenAI)