MoE Safety Lives in a Few Experts, Exclusive Batching Adds 42%

June 2, 2026

Lab VLM scores don't survive robot deployment. RoboStressBench breaks physical rendering into material, lighting, viewpoint, and geometry stress, showing that aggregate accuracy hides where a model actually fails.

The compute that MoE saves may come at the cost of safety. Safety capability concentrates in a handful of experts, and routing around them renders the guardrails useless.

Parameter-level knowledge editing has a ceiling. Patching facts directly into weights reliably damages core abilities under realistic conditions, while a simple retrieval baseline stays stronger throughout.

Mixed batching isn't always the best choice, and where the crossover falls depends on memory bandwidth. On bandwidth-limited budget cards, exclusive batching squeezes out up to 41.9% more throughput.

A model recognizes its own writing through one fixed reference frame. Anthropic finds the model judges any persona's text using the assistant as an anchor for an implicit Bayesian likelihood-ratio test.

Also Notable

Streaming Real-Time Narration of Long Videos with a Multimodal LLM — FlowNar targets the scalability bottleneck where resource use grows linearly with video length in online settings.
Reconstructing Dark Matter's 3D Distribution from Weak Lensing with a Generative Diffusion Prior — a single-view, severely ill-posed inverse problem where traditional reconstruction struggles to converge; the generative prior constrains the solution space.
Turning Evidence Scattered Across Figures, Tables, Captions, and Text in Biomedical Papers into Training Data — Ryze uses this to sidestep expensive expert labeling and improve VLM reliability on biomedical QA.
Curbing "Catastrophic Overfitting" in Fast Adversarial Training with a Nearly Free Second-Order Attack — SORA makes single-step adversarial training both cheap and stable.
When LLMs Annotate and Judge Zero-Shot, the Model's Own Priors Fight Your Instructions — this work dissects when priors override instructions, which bears directly on LLM-as-judge reliability.
Stabilizing Visual Grounding in Remote Sensing with Cluster-Guided Refinement and Multi-Model Voting — cracks the old problem of unreliable single-model grounding under small targets and large scale variation.
Counterintuitive Transfer Learning: the Source Domain Needn't Be Semantically Clean, Try Transferring from a "Noisy Domain" — noisy-domain adaptation under a semi-supervised setting.
Online Link Recommendation Is Performative: What You Recommend Changes Which Links Form Next — as a result, fairness computed on historical logs drifts after deployment, which COPF aims to stabilize.
