Video Models Stumble on Composite Edits, MoE Fails at the Router

June 9, 2026

Single edits work well enough; composite instructions fall apart. CoVEBench breaks multi-point editing into 9,990 fine-grained checklist items, and models asked to change subject, motion, and camera at once routinely miss edits, wreck backgrounds, or introduce artifacts.

Let the model learn what to remember. MemoPilot trains "memory update" as an optimizable policy via multi-turn GRPO, leading on Elo with a frozen LLM and no weight changes — though only on competitive games so far.
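
For readers new to GRPO, its central step is scoring each sampled rollout against the other rollouts in its group rather than against a learned value function. The sketch below shows only that generic group-normalized advantage, not MemoPilot's training code; the function name and the reward values are hypothetical and chosen for illustration.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group.

    `rewards` holds one scalar per rollout sampled for the same prompt
    (here, hypothetically, one per candidate memory update).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps keeps the division finite when every rollout got the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]

# Illustrative group of four rollouts: two wins, two losses.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Rollouts that beat the group average get a positive advantage and the rest a negative one; a group with identical rewards yields all zeros and so contributes no learning signal.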

MoE's expert specialization fails at the routing step. STAR reframes routing as structure-aware subspace learning, aligning inputs to their principal structure and moving the diagnosis from expert capacity to router perception.
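
To make "the routing step" concrete, here is a minimal sketch of a conventional top-k softmax router, the kind of component the STAR summary says is the weak point. It is not STAR's method; the weights, the token vector, and the name `route_top_k` are assumptions made for illustration.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(token, router_weights, k=2):
    """Return [(expert_index, gate_weight), ...] for a single token.

    `router_weights` holds one weight vector per expert; the router
    logit for an expert is the dot product with the token.
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    # Keep the k most probable experts, then renormalize their gates.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# Hypothetical 2-d token and three experts.
token = [1.0, 0.0]
router_weights = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
routes = route_top_k(token, router_weights, k=2)
```

The router only ever sees the token through these dot products, so if the logits do not reflect the input's structure, no amount of expert capacity downstream can recover the lost specialization.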

Put a statistical guarantee on a whole reasoning chain's factuality. A conformal method treats multi-step reasoning as a dependency graph, calibrates overall uncertainty in real time, and turns hallucination control from tuning into inference with coverage guarantees.
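
The coverage guarantee rests on conformal calibration. The sketch below shows plain split conformal prediction on scalar nonconformity scores, which is the standard building block and not the paper's graph-based procedure; the scores, step names, and the reading of a score as "one minus step confidence" are assumptions for illustration.

```python
import math

def conformal_threshold(cal_scores, alpha=0.1):
    """Split-conformal threshold from held-out nonconformity scores.

    If calibration and test scores are exchangeable, a new score falls
    at or below the returned value with probability at least 1 - alpha.
    """
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: no finite threshold.
        return float("inf")
    return sorted(cal_scores)[rank - 1]

# Hypothetical calibration scores (higher = less trustworthy step).
cal_scores = [0.30, 0.10, 0.70, 0.20, 0.60, 0.40, 0.50]
q_hat = conformal_threshold(cal_scores, alpha=0.25)

# Keep only new reasoning steps whose score is within the threshold.
new_steps = {"step A": 0.15, "step B": 0.65}
kept = [name for name, score in new_steps.items() if score <= q_hat]
```

The threshold comes from a quantile of held-out scores rather than from a hand-tuned cutoff, which is the sense in which hallucination control becomes inference rather than tuning.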

Also Notable

Let the Query Drive State Evolution Itself — In linear attention the query only ever reads out, decoupled from how the state evolves; Q-Delta pulls it into the evolution, loosening the KV-association paradigm.
The Schema-Derived Graph Isn't the Graph a GNN Wants — Graphs converted straight from relational databases often don't suit relational reasoning; this asks what makes a good graph, a reminder about the graph-construction step for relational deep learning.
Encoder and Decoder Update Unevenly, So Unified Aggregation Breaks — In medical segmentation the encoder and decoder update very unequally; this handles federated LoRA aggregation separately by encoder-decoder structure.
Synthetic Data Judged on Exact Conclusion, Not Fidelity — Instead of competing on fidelity to the real distribution, it requires exactly satisfying a declarative analytical conclusion with no source data, a different axis of judgment.
Octree-Cached Glossy Radiance, Heading for Real-Time Rendering — High-frequency outgoing radiance from glossy and specular materials has been hard to model; OctaOctree organizes a neural radiosity cache with an octree.

