AI Research Brief

May 25, 2026

Agent Trajectories Let a 30B Match a 235B

  • ACC Repackages Agent Tool-Use Trajectories as Long-Context QA Pairs. Qwen3-30B trained on them raises its MRCR score from 50.2 to 68.3, matching Qwen3-235B-A22B, a model with nearly 8x the parameters.
  • WorldKV Moves Long-Term Memory Out of the Attention Bill. Retrieval plus per-block compression keeps the turn-around consistent, doubles throughput, and requires no fine-tuning.
  • High-Res DiT Inference Is Shifting to Content-Aware Scaling. SEGA weights RoPE frequency components by spectral energy, dodging the structure-vs-detail tradeoff that uniform scaling forces.
  • 80,870 Terminal Recordings Reverse-Engineered Into 1,530 Eval Tasks. TerminalWorld scores correlate only weakly with Terminal-Bench (Pearson r = 0.20), so results on expert-curated sets may not map to real developer work.

Also Notable

  • Flow Matching Belongs in DINOv2 Representation Space, Not Pixels or SD-VAE. The geometry of representation space is easier for flow matching to learn.
  • Agentic Reasoning Shouldn't Make CoT Carry Planning Implicitly. The paper splits decisions into 3 systems so the agent explicitly chooses when to plan and when to act.
  • SAM 2 Transferred Directly to Visual Object Tracking Isn't Enough. The paper adds motion, geometry, and semantic adapters to handle distractors, occlusion, and nonlinear motion.
  • A Multi-Agent Pipeline for Short Drama Generation From One Sentence. Targets pacing, spatial consistency, and quality control as three specific pain points, not one giant prompt.
  • Taylor Series Identifies "Temporal Surprise Points" in Video for Frame Selection. The method is training-free and consistent with predictive-coding intuition.
  • Model Search Is Fundamentally Comparative. Structured tables from model cards beat pure text similarity at separating candidate alternatives.
  • A Task-Adaptive Unified Framework for Fashion Image Retrieval. Covers multiple query formats and search intents, directly applicable to e-commerce.

