Agent Trajectories Let a 30B Match a 235B
- ACC Repackages Agent Tool-Use Trajectories as Long-Context QA Pairs. Qwen3-30B trained on them lifts MRCR from 50.2 to 68.3, matching Qwen3-235B-A22B, a model with roughly 7x the parameters.
- WorldKV Moves Long-Term Memory Out of the Attention Bill. Retrieval plus per-block compression keeps the turn-around consistent and doubles throughput, with no fine-tuning required.
- High-Res DiT Inference Is Shifting to Content-Aware Scaling. SEGA weights RoPE frequency components by spectral energy, dodging the structure-vs-detail tradeoff that uniform scaling forces.
- 80,870 Terminal Recordings Reverse-Engineered Into 1,530 Eval Tasks. TerminalWorld correlates with Terminal-Bench at Pearson 0.20, so scores from expert-curated sets may not map to real developer work.
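For a sense of how weak a Pearson correlation of 0.20 is, here is a minimal sketch. The only figure taken from the item above is 0.20. The function name and the two score lists are hypothetical, made up purely to show the computation, and are not results from TerminalWorld or Terminal-Bench.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and the root of each list's squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# The reported correlation. Squaring r gives the share of variance in one
# benchmark's scores that a linear fit on the other would account for.
reported_r = 0.20
variance_explained = reported_r ** 2  # about 0.04, i.e. roughly 4%

# Hypothetical per-model scores on two benchmarks (illustrative only).
curated_scores = [62.0, 55.0, 48.0, 41.0, 30.0]
recorded_scores = [35.0, 28.0, 40.0, 31.0, 33.0]

if __name__ == "__main__":
    print(f"share of variance explained at r=0.20: {variance_explained:.2f}")
    print(f"r on the made-up scores: {pearson(curated_scores, recorded_scores):.2f}")
```

At r = 0.20, a linear fit on one benchmark accounts for only about 4% of the variance in the other, which is why a ranking on one says little about the ranking on the other.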
Also Notable
- Flow Matching Belongs in DINOv2 Representation Space, Not Pixels or SD-VAE. Representation-space geometry is friendlier for flow matching to learn.
- Agentic Reasoning Shouldn't Make CoT Carry Planning Implicitly. The paper splits decisions into 3 systems so the agent explicitly chooses when to plan and when to act.
- SAM 2 Transferred Directly to Visual Object Tracking Isn't Enough. Adds motion, geometry, and semantic adapters to handle distractors, occlusion, and nonlinear motion.
- A Multi-Agent Pipeline for Short Drama Generation From One Sentence. Targets pacing, spatial consistency, and quality control as three specific pain points, not one giant prompt.
- Taylor Series Identifies "Temporal Surprise Points" in Video for Frame Selection. Training-free, aligned with predictive coding intuition.
- Model Search Is Fundamentally Comparative. Structured tables from model cards beat pure text similarity at separating candidate alternatives.
- A Task-Adaptive Unified Framework for Fashion Image Retrieval. Covers multiple query formats and search intents, directly applicable to e-commerce.