AI Research Brief

March 10, 2026

12k Samples Beat Finance SOTA, CUDA Optimization 35% Faster

  • Post-Training Data Matters More Than Model Size in Vertical Domains. A systematic ablation in finance shows that distillation quality control plus difficulty-aware sampling lets an 8B model beat same-scale SOTA with just 12k RL samples.
  • Offline RL Turns Agent Planning From Guesswork Into Engineering. Microsoft trains tool-call planning on synthetic trajectories with quality scoring. The approach transfers to any multi-step agent task.
  • Models Shouldn't Be Locked to Fixed Weights After Deployment. Tencent's HY-WU introduces a functional memory module that generates instance-level weight updates in real time, skipping test-time optimization overhead.
  • LLM CUDA Kernel Optimization Expands to General HPC. A new benchmark, MSKernelBench, covers four task categories. A multi-agent architecture achieves a 35% overall speedup over existing methods.

Also Notable

  • RL Agent Autonomously Runs Architecture Search Until Convergence. Bold idea, but validation scale is still small.
  • Activation Steering Controls Endoscopy Pathological Features Without Training or Fine-Tuning. Generates causal training data inside diffusion models.
  • RLVR Reasoning Chains Are Full of Redundant Steps; Re-Solving Sends Models Back to Key Nodes. Both efficiency and quality improve (ICLR).
  • Slide Auto-Generation Finally Gets a Fine-Grained Rubric Benchmark. Covers layout, content, and visual consistency.
  • Mila's Planet-Scale 4D Spatiotemporal World Model. Extends multi-resolution hash encoding into time for self-supervised representations across centuries and continents.
  • Long Video Understanding Has a Credibility Problem: VLMs Answer Confidently With Key Frames Missing. Evaluation scores are inflated (CVPR).
  • RAG Applied to Gene Perturbation Response Prediction. Cross-cell-type generalization significantly outperforms pure deep learning methods (ICLR).
  • Conformal Prediction Meets Generative Molecular Design. Statistical guarantees without an oracle (ICLR).
