Learned Sparsity Cuts Diffusion Inference Compute by 54%
- Learned sparsity cuts diffusion inference compute by 54% with no quality loss. DiffSparse trains a lightweight predictor to decide per-layer, per-step token sparsity rates. Whether the gains stack with distillation and quantization remains unverified.
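A minimal sketch of the idea, not DiffSparse's actual predictor: a cheap function maps (layer, denoising step) to a keep ratio, and only the most salient tokens survive that layer's compute. The heuristic schedule and the norm-based importance score below are assumptions standing in for the learned components.

```python
import numpy as np

def predict_keep_ratio(layer_idx, timestep, num_layers=24, num_steps=50):
    """Hypothetical stand-in for a learned sparsity predictor: keep
    more tokens in early layers and early (noisy) steps, fewer later.
    DiffSparse trains this mapping; here it is a fixed schedule."""
    layer_frac = layer_idx / max(num_layers - 1, 1)
    step_frac = timestep / max(num_steps - 1, 1)
    return float(np.clip(1.0 - 0.25 * (layer_frac + step_frac), 0.3, 1.0))

def sparse_token_pass(tokens, keep_ratio):
    """Keep the top-k tokens by L2 norm, a simple proxy for a learned
    importance score; returns the kept tokens and their indices."""
    n = tokens.shape[0]
    k = max(1, int(round(n * keep_ratio)))
    scores = np.linalg.norm(tokens, axis=-1)
    keep = np.sort(np.argsort(scores)[-k:])  # k most salient, original order
    return tokens[keep], keep
```

Only the kept tokens would be fed through the layer's attention and MLP; the compute saving is roughly quadratic in the keep ratio for attention.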
- Multi-character video identity leakage traces back to position encoding, not attention. PoCo redesigns control signals at the position embedding level, improving cross-shot consistency and reference fidelity. Sora2 is tackling the same problem.
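To make the position-embedding framing concrete, here is a toy sketch (not PoCo's actual design): each character's tokens get a disjoint positional offset before the standard sinusoidal embedding, so their control signals occupy separate regions of embedding space rather than relying on attention masks to keep identities apart. The `entity_stride` offset scheme is an assumption for illustration.

```python
import numpy as np

def sinusoidal_pe(positions, dim):
    """Standard sinusoidal position embedding for a 1D position array."""
    i = np.arange(dim // 2)
    freqs = 1.0 / (10000 ** (2 * i / dim))
    angles = positions[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def entity_aware_pe(positions, entity_ids, dim, entity_stride=1000.0):
    """Hypothetical identity-aware encoding: shift each entity's
    positions by a large per-entity offset so two characters at the
    same frame position receive distinct embeddings."""
    shifted = positions + entity_ids * entity_stride
    return sinusoidal_pe(shifted.astype(float), dim)
```

With this scheme, token position 0 for character 0 and token position 0 for character 1 embed differently by construction, which is the kind of separation the bullet attributes to position encoding rather than attention.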
- Next-scale AR extends from images to motion generation. Coarse-to-fine hierarchical generation outperforms flattened 1D sequences. The CVPR-accepted text-to-motion model reaches SOTA and generalizes zero-shot to editing tasks.
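The coarse-to-fine loop can be sketched as follows; this is a toy illustration of next-scale autoregression applied to motion, not the paper's architecture. Random noise stands in for the learned AR model, and the scale schedule and residual refinement are assumptions.

```python
import numpy as np

def upsample_1d(seq, length):
    """Nearest-neighbour upsample of a (T, D) motion sequence along time."""
    idx = (np.arange(length) * seq.shape[0] / length).astype(int)
    return seq[idx]

def next_scale_motion(lengths=(4, 8, 16), pose_dim=6, rng=None):
    """Toy next-scale generation: produce a coarse temporal skeleton
    first, then refine at progressively finer time resolutions, each
    scale conditioned on the upsampled coarser one. Contrast with
    flattening all frames into one long 1D token sequence."""
    rng = rng or np.random.default_rng(0)
    motion = rng.normal(size=(lengths[0], pose_dim))          # coarsest scale
    for t in lengths[1:]:
        motion = upsample_1d(motion, t)                       # condition on coarse motion
        motion = motion + 0.5 * rng.normal(size=(t, pose_dim))  # residual refinement
    return motion
```

Because each scale only attends over its own (short) sequence conditioned on the coarser one, the hierarchy sidesteps the long flattened contexts that 1D AR motion models must handle.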
Also Notable
- Labels Matter More Than Images in Visual In-Context Prompt Retrieval — prompt engineering effort may be aimed at the wrong target.
- Diffusion-Generated Imagined Frames for Video Retrieval — bridges the information asymmetry when text queries describe only a video fragment.
- 3DGS Hair Reconstruction Compressed via Card Clustering — storage and rendering costs drop sharply compared with million-scale raw Gaussians.
- First Large-Scale Pixel-Level X-Ray Contraband Segmentation Benchmark — pushes security screening from bounding boxes to fine-grained segmentation.