4-Step Diffusion Beats 100-Step Baselines, Layer Skipping Saves 18%
- Non-Differentiable Rewards Now Work for Few-Step Diffusion RL Training. 4-step generation beats 100-step baselines across the board. Human preference, safety, and object counting: the signals that matter most in production are no longer locked out.
- Code Model RL Post-Training Enters Its Engineering Phase. Two teams independently tackled gradient stability and data difficulty distribution on the same day. The methodology validation stage is over.
- A Fully Automated Pipeline Extracts Million-Scale 3D Annotations from Web Videos. By bypassing the manual-labeling ceiling, it shows that data scaling does more for 3D understanding than architectural innovation.
- Diffusion LLMs Can Skip Layers to Save 18% Compute Without Collapse. The first systematic layer-by-layer comparison reveals fundamentally different representation structures in dLLMs and autoregressive models, so acceleration tricks designed for AR don't transfer directly. A toy layer-skipping sketch follows this list.
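For readers unfamiliar with the mechanism, here is a minimal sketch of what skipping layers in a residual transformer stack can look like at inference time. It is illustrative only: `ToyBlock`, `SkippableStack`, and the skipped indices `{4, 7}` are hypothetical and are not the paper's architecture or its layer-selection rule.

```python
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    """Stand-in for one transformer layer (illustrative only)."""
    def __init__(self, dim: int):
        super().__init__()
        self.ff = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim), nn.GELU())

    def forward(self, x):
        return x + self.ff(x)  # residual connection makes "skipping" well-defined

class SkippableStack(nn.Module):
    """A stack of residual blocks where selected layers can be bypassed at inference."""
    def __init__(self, dim: int = 64, depth: int = 12):
        super().__init__()
        self.blocks = nn.ModuleList([ToyBlock(dim) for _ in range(depth)])

    def forward(self, x, skip=frozenset()):
        for i, block in enumerate(self.blocks):
            if i in skip:
                continue  # identity path: the residual stream passes through unchanged
            x = block(x)
        return x

model = SkippableStack().eval()
x = torch.randn(2, 16, 64)
with torch.no_grad():
    full = model(x)                    # all 12 layers
    pruned = model(x, skip={4, 7})     # ~17% fewer layer evaluations
print(torch.linalg.norm(full - pruned))  # how far the pruned output drifts
```

Because every block is residual, dropping one leaves the stream intact rather than zeroing it; which layers tolerate this, and how that differs between dLLMs and AR models, is exactly what the paper's layer-by-layer comparison measures.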
Also Notable
- Concept Customization Without Trading Away Base Capabilities. PureCC decouples new concept learning from original capability preservation. CVPR accepted.
- Action-Conditioned Consistency for Navigation World Models. Multi-step rollouts no longer drift, and the model distills to fewer inference steps.
- NVIDIA Open-Sources MoE Training in Megatron Core. Addresses coupled memory-communication-compute constraints when scaling sparse models.
- Evolutionary Search Meets RL for Open-Ended Scientific Problem Solving. The Helix framework combines the two. ICLR accepted.
- LLM Inference on Multi-Core CPUs, Exploiting NUMA Architecture. Targeting GPU-free server deployments.
- VLM Over-Reliance on LLM Components Hurts Robustness. A self-critique reasoning framework corrects it at test time. CVPR accepted.
- Diffusion Model Weights as Compressed Storage for Visual Representations. A new angle from the Cambridge team.
- Concept Erasure Works Against Linear Probes but Fails Against Nonlinear Attacks. A NeurIPS paper quantifies the fundamental cost of guardedness. A toy probe sketch follows this list.
- AI Matches Human Accuracy but Makes Completely Different Errors. Cambridge proposes OOD spectrum analysis to quantify this misalignment.
- Video Super-Resolution Directly in the Compressed Domain. Bypasses decode-process-encode overhead, approaching real-time. CVPR accepted.
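As a toy illustration of why a linear erasure step can pass a linear-probe audit yet fail against a nonlinear probe (this reproduces the phenomenon only, not the NeurIPS paper's method, data, or metrics), the sketch below plants both a linear and an XOR-style class signal in synthetic features, projects out the class-mean direction, and then compares a logistic-regression probe with an MLP probe.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 6000, 10
y = rng.integers(0, 2, n)
s = 2 * y - 1                                      # +/-1 class sign
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * s                                 # linearly decodable signal
X[:, 2] = s * np.sign(X[:, 1]) * np.abs(X[:, 2])   # XOR-style signal across dims 1-2

# Simple linear erasure: project out the class-mean-difference direction.
v = X[y == 1].mean(0) - X[y == 0].mean(0)
v /= np.linalg.norm(v)
X_erased = X - np.outer(X @ v, v)

Xtr, Xte, ytr, yte = train_test_split(X_erased, y, random_state=0)
linear = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
nonlin = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0).fit(Xtr, ytr)
print("linear probe acc:", linear.score(Xte, yte))   # near chance
print("MLP probe acc:   ", nonlin.score(Xte, yte))   # well above chance
```

The erasure removes everything a linear decoder can see, but the XOR-style structure survives, which is why the nonlinear probe still recovers the concept.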