Archive • AI Research Brief • Buttondown

DeepSeek V4 Cuts KV to 13.5%, Video Memory Runs 10x Faster

June 10, 2026

DeepSeek V4 bakes "index then attend" into the main architecture. Decoding no longer keeps the full KV cache in VRAM. A Neural Memory Indexer fetches...

Video Models Stumble on Composite Edits, MoE Fails at the Router

June 9, 2026

Single edits are good enough; combine them into one instruction and they fall apart. CoVEBench breaks multi-point editing into 9,990 fine-grained checklist items, and...

Swap the Arm Without Retraining; VLMs See Both the Duck and the Rabbit

June 9, 2026

Swap a robot arm and the whole skill set breaks — the fix is rewiring, not retraining. RECENT writes skills as executable code and locally refactors only the...

dots.tts Hits 54ms First Packet, SWE Agent Self-Evolves Past 50%

June 8, 2026

Open-source TTS takes the continuous-latent route, with three design choices all aimed at deployment. dots.tts is a 2B continuous autoregressive speech...

Streaming Hand-offs Make Multi-Agent Sharper, ZipSplat Splats With 1/6 the Gaussians

June 6, 2026

Streaming Hand-offs Beat Waiting for the Full Chain. StreamMA pipelines adjacent agents so reliable early signals reach downstream sooner — average +7.3...

NVIDIA Packs Five Modalities Into One Set of Weights

June 5, 2026

NVIDIA Crams Language, Image, Video, Audio, and Action Into One Set of Weights. Cosmos 3 bets a single mixture-of-transformers can do every modality, and...

A 20B Search Agent Ties the Frontier by Offloading Its Bookkeeping

June 4, 2026

Deleting stale observations to save context follows an inverted-U, not a straight line. A sweep across models from 4B to 284B and three retrievers: strong...

A 4B Web Agent Catches Up to Closed CUAs on a Few Thousand Trajectories

June 3, 2026

PEFT isn't just cheap fine-tuning — it's per-user persistent state. A framing paper recasts small adapters as local state attached to a shared trillion-...

Move to See: Top Model Reaches the Target Just 12% of the Time

June 2, 2026

Spatial intelligence shifts from passive understanding to active perception. TVR asks an agent to turn and step through a 3D room until its view matches a...

MoE Safety Lives in a Few Experts, Exclusive Batching Adds 42%

June 2, 2026

Lab VLM scores don't survive robot deployment. RoboStressBench breaks physical rendering into material, lighting, viewpoint, and geometry stress, showing...

LoRA as a Ruler for Memory, Reversed Video as a Free Counterfactual

June 1, 2026

Flip LoRA into a measuring stick and the real capacity of parametric memory falls out: it follows a power law you can estimate in advance, and a token-...

Reasoning Search Adds a Second Direction, World Models Add Agents

May 31, 2026

BES Splits Inference-Time Search From One-Way Expansion. Forward search gains evolutionary operators to escape the model prior's "entropy shell," and...

Agents Start Improving Themselves, and Reaching for Fewer Tools

May 30, 2026

A Chinese MoE puts "self-evolution" on the roadmap. MiniMax-M2 runs 230B params with only 9.8B active, built end-to-end for agent work, and its latest...

Agent Trajectories Let a 30B Match a 235B

May 25, 2026

ACC Repackages Agent Tool-Use Trajectories as Long-Context QA Pairs. Qwen3-30B trained on them lifts MRCR from 50.2 to 68.3, matching Qwen3-235B-A22B at...

Gated DeltaNet-2 Splits the Gate, Maestro Outscores GPT-5

May 23, 2026

Linear Attention's Real Bottleneck Is State-Edit Granularity, Not Speed. Gated DeltaNet-2 splits the scalar gate into channel-wise erase and write gates. It...

Optimizer Choice Stretches Capacity Scaling 2.3x

May 22, 2026

Three Classes of Physical 3D Assets Merge Into One Pipeline. PhysX-Omni puts rigid, deformable, and articulated objects into a single framework. Output...

$15 Per Paper, Healthcare Agents Cap at 28%

May 22, 2026

Auto-Research Cost Curve Has Crossed. $15 produces a full paper, but frontier LLMs still fabricate results and miss errors. End-to-end autonomy still falls...

8% of Tokens Decide the Reasoning Gap

May 19, 2026

"Unlearnable" Samples in RLVR. A set of hard examples never gets learned during training, even though rollouts produce correct answers for them. The reward curve...

Real-Time Video's Bottleneck Moved Past Step Count

May 17, 2026

Real-Time AR Video's Bottleneck Is Shifting. Causal Forcing++ pushes frame-wise distillation to 1-2 steps. RAVEN attacks long-rollout history distribution...

Olympiad Gold Becomes a Two-Step Recipe

May 16, 2026

Olympiad Gold Becomes a Portable Two-Step Recipe. SU-01 combines reverse-perplexity curriculum SFT with two-stage RL. A 30B-A3B backbone clears IMO and IPhO...