Video Models Stumble on Composite Edits, MoE Fails at the Router
- Single edits work well enough; composite instructions fall apart. CoVEBench breaks multi-point editing into 9,990 fine-grained checklist items, and models asked to change subject, motion, and camera at once routinely miss edits, wreck backgrounds, or introduce artifacts.
- Let the model learn what to remember. MemoPilot trains "memory update" as an optimizable policy via multi-turn GRPO and leads on Elo with a frozen LLM and no weight changes, though so far only on competitive games.
- MoE's expert specialization fails at the routing step. STAR reframes routing as structure-aware subspace learning, aligning inputs to their principal structure and moving the diagnosis from expert capacity to router perception.
- Put a statistical guarantee on a whole reasoning chain's factuality. A conformal method treats multi-step reasoning as a dependency graph, calibrates overall uncertainty in real time, and turns hallucination control from tuning into inference with coverage guarantees.
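
To make the "coverage guarantee" in the last item concrete, here is a minimal sketch of plain split conformal calibration applied to reasoning steps. It is not the paper's method: it ignores the dependency graph and real-time calibration entirely, and the names `conformal_threshold` and `filter_chain`, along with the (confidence, label) data layout, are assumptions made for illustration.

```python
import math


def conformal_threshold(calibration_chains, alpha):
    """Pick a confidence threshold from held-out, labelled reasoning chains.

    Each chain is a list of (confidence, is_correct) pairs, one per step.
    Hypothetical data layout, chosen only for this sketch.
    """
    # Nonconformity score per chain: the highest confidence given to a wrong step
    # (0.0 when the chain contains no wrong step).
    scores = sorted(
        max((conf for conf, ok in chain if not ok), default=0.0)
        for chain in calibration_chains
    )
    n = len(scores)
    # Finite-sample corrected rank used by split conformal prediction.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration chains for this alpha: keep no steps at all.
        return float("inf")
    return scores[rank - 1]


def filter_chain(steps, threshold):
    """Keep only the steps whose confidence strictly exceeds the threshold."""
    return [(conf, payload) for conf, payload in steps if conf > threshold]


calibration = [
    [(0.9, True), (0.4, False)],
    [(0.8, True), (0.7, True)],
    [(0.6, False), (0.95, True)],
]
threshold = conformal_threshold(calibration, alpha=0.25)
kept = filter_chain([(0.9, "step a"), (0.6, "step b"), (0.3, "step c")], threshold)
```

If the calibration chains and the new chain are exchangeable, every retained step is correct with probability at least 1 - alpha. That is the sense in which hallucination control becomes inference with a guarantee instead of a hand-tuned cutoff.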
Also Notable
- Let the Query Drive State Evolution Itself — In linear attention the query only ever reads out, decoupled from how the state evolves; Q-Delta pulls it into the evolution, loosening the KV-association paradigm.
- The Schema-Derived Graph Isn't the Graph a GNN Wants — Graphs converted straight from relational databases often don't suit relational reasoning; this asks what makes a good graph, a reminder about the graph-construction step for relational deep learning.
- Encoder and Decoder Update Unevenly, So Unified Aggregation Breaks — In medical segmentation the encoder and decoder update very unequally; this handles federated LoRA aggregation separately by encoder-decoder structure.
- Synthetic Data Judged on Exact Conclusion, Not Fidelity — Instead of competing on fidelity to the real distribution, it requires exactly satisfying a declarative analytical conclusion with no source data, a different axis of judgment.
- Octree-Cached Glossy Radiance, Heading for Real-Time Rendering — High-frequency outgoing radiance from glossy and specular materials has been hard to model; OctaOctree organizes a neural radiosity cache with an octree.
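
For the linear attention item in this list, the sketch below shows the standard, unnormalized linear attention recurrence that the summary describes as the baseline. It does not implement Q-Delta; the function name and list-of-lists layout are illustrative assumptions. The point is only that the state is written from keys and values, while the query appears in the read-out alone.

```python
def linear_attention(queries, keys, values):
    """Baseline linear attention as a recurrence over time steps.

    Returns the per-step outputs and a snapshot of the state after each step.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # d_k x d_v key-value memory
    outputs, states = [], []
    for q, k, v in zip(queries, keys, values):
        # Write: state += outer(k, v). The query plays no part here.
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] += k[i] * v[j]
        # Read: output = q^T state. This is the only place the query is used.
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
        states.append([row[:] for row in state])
    return outputs, states


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[2.0], [3.0]]
out_a, states_a = linear_attention([[1.0, 0.0], [1.0, 1.0]], keys, values)
out_b, states_b = linear_attention([[0.0, 1.0], [0.0, 0.0]], keys, values)
```

Swapping the queries changes the outputs but leaves every state snapshot identical, which is the decoupling the item refers to.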