AI论文简报 (AI Paper Briefing)

February 19, 2026

Spectral decay lifts W4A4 quantization accuracy by up to 7%


  • Better pretraining makes quantization worse. Amazon finds that activation outlier severity scales with pretraining duration. S2D applies spectral decay during training to fix the root cause, recovering up to 7% accuracy under W4A4 (4-bit weights, 4-bit activations).
  • Most fine-tuning data selection tricks are a waste of effort. Microsoft Research systematically disentangles the components and finds that only gradient-based representations reliably predict downstream performance across tasks. With enough data, careful selection barely beats random sampling.
  • Token-level policy gradients fundamentally mismatch the semantic granularity of reasoning. MPO packs K consecutive tokens into a semantic action and computes the policy gradient over those actions, aligning the optimization target with the actual reasoning structure.
  • Airbnb reformulates geo-retrieval as extreme classification over 25 million grid cells. This drastically narrows the candidate set before ranking, tackling the supply-demand heterogeneity that makes retrieval hard in two-sided marketplaces.
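
To make the MPO bullet above more concrete, here is a minimal sketch of what "packing K consecutive tokens into one action" could look like. It relies only on the chain rule: the log-probability of a chunk is the sum of its token log-probabilities. The function names `pack_log_probs` and `macro_action_surrogate` are hypothetical, and the per-action advantages are assumed to be given. The summary does not say how MPO actually computes advantages or chooses K, so treat this as an illustration, not the paper's method.

```python
from typing import List


def pack_log_probs(token_log_probs: List[float], k: int) -> List[float]:
    """Sum per-token log-probs over consecutive chunks of k tokens.

    Each chunk is treated as one action; the last chunk may be shorter
    than k if the sequence length is not a multiple of k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [
        sum(token_log_probs[i:i + k])
        for i in range(0, len(token_log_probs), k)
    ]


def macro_action_surrogate(
    token_log_probs: List[float],
    advantages: List[float],  # assumed: one advantage per packed action
    k: int,
) -> float:
    """REINFORCE-style surrogate loss over packed actions (hypothetical)."""
    action_log_probs = pack_log_probs(token_log_probs, k)
    if len(advantages) != len(action_log_probs):
        raise ValueError("need exactly one advantage per packed action")
    # Negative sign: minimizing this loss increases the log-prob of
    # actions with positive advantage.
    return -sum(a * lp for a, lp in zip(advantages, action_log_probs))


if __name__ == "__main__":
    log_probs = [-1.0, -2.0, -3.0, -4.0, -5.0]
    print(pack_log_probs(log_probs, 2))
    print(macro_action_surrogate(log_probs, [1.0, 0.5, -1.0], 2))
```

With K = 1 this reduces to the ordinary token-level surrogate, which is one way to see the token-level objective as a special case of the packed one.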
