Spectral decay recovers up to 7% accuracy for W4A4 quantization

February 19, 2026

Better pretraining makes quantization harder. Amazon finds that activation outlier severity grows with pretraining duration. S2D applies spectral decay during training to fix the root cause, recovering up to 7% accuracy under W4A4 (4-bit weights and 4-bit activations).
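To see why activation outliers matter at 4 bits, here is a minimal sketch. It is not S2D and not taken from the paper: it assumes plain symmetric per-tensor quantization, and the activation values are made up for illustration. A single large value stretches the quantization scale so far that every ordinary activation rounds to zero.

```python
def quantize_int4(values):
    """Symmetric per-tensor 4-bit quantization (an assumed, generic scheme).

    The scale is set by the largest magnitude, so one outlier controls the
    resolution available to every other value.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7.0  # signed 4-bit integers cover [-8, 7]
    return [max(-8, min(7, round(v / scale))) * scale for v in values]


def mean_abs_error(values):
    """Average absolute difference between values and their 4-bit round trip."""
    dequantized = quantize_int4(values)
    return sum(abs(a - b) for a, b in zip(values, dequantized)) / len(values)


# Hypothetical activations: the same small values, with and without one outlier.
smooth = [0.1 * i for i in range(-7, 8)]  # -0.7 .. 0.7
with_outlier = smooth + [70.0]

err_smooth = mean_abs_error(smooth)
err_outlier = mean_abs_error(with_outlier)
print(err_smooth, err_outlier)
```

Without the outlier the 15 values land almost exactly on the 4-bit grid. With it, the scale becomes 10 and all of the small values collapse to zero, which is the kind of damage that fixing outliers during training would aim to avoid.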

Most fine-tuning data selection tricks are a waste of effort. Microsoft Research systematically disentangles the components and finds only gradient-based representations reliably predict downstream performance. With enough data, careful selection barely beats random.

Token-level policy gradients are fundamentally mismatched with the semantic granularity of reasoning. MPO packs K consecutive tokens into semantic actions and computes the policy gradient over those actions, aligning the optimization target with the actual reasoning structure.
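A rough sketch of what packing could look like, under stated assumptions. The summary only says that K consecutive tokens are grouped into one action. Summing token log-probabilities to get the action's log-probability follows from the chain rule, but the REINFORCE-style loss, the function names `pack_log_probs` and `macro_policy_gradient_loss`, and the one-advantage-per-action setup are all hypothetical. The actual MPO objective may differ.

```python
def pack_log_probs(token_log_probs, k):
    """Sum log-probabilities over consecutive chunks of k tokens.

    The last chunk may be shorter when the sequence length is not a
    multiple of k.
    """
    return [
        sum(token_log_probs[i:i + k])
        for i in range(0, len(token_log_probs), k)
    ]


def macro_policy_gradient_loss(token_log_probs, advantages, k):
    """REINFORCE-style surrogate with one advantage per k-token action.

    This is an illustrative assumption, not the loss from the paper.
    """
    action_log_probs = pack_log_probs(token_log_probs, k)
    if len(action_log_probs) != len(advantages):
        raise ValueError("need exactly one advantage per packed action")
    weighted = sum(lp * adv for lp, adv in zip(action_log_probs, advantages))
    return -weighted / len(advantages)


# Hypothetical numbers: 5 tokens packed with k=2 give 3 actions.
token_log_probs = [-0.5, -0.25, -1.0, -0.25, -0.5]
advantages = [1.0, -1.0, 2.0]

packed = pack_log_probs(token_log_probs, 2)
loss = macro_policy_gradient_loss(token_log_probs, advantages, 2)
print(packed, loss)
```

The point of the sketch is the unit of credit assignment: each advantage is attached to a whole chunk rather than to a single token.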

Airbnb reformulates geo-retrieval as extreme classification over 25 million grid cells. This drastically narrows the candidate set before ranking, tackling the retrieval difficulty caused by supply-demand heterogeneity in two-sided marketplaces.
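To make "grid cells as classes" concrete, here is a small sketch. The summary does not describe the grid scheme, so this one is purely an assumption: a uniform latitude/longitude grid of 5000 by 5000 cells, chosen only because it yields 25 million class labels. The name `cell_id` is hypothetical.

```python
# Assumed grid dimensions, not Airbnb's actual scheme.
ROWS = 5000
COLS = 5000  # 5000 * 5000 = 25,000,000 classes


def cell_id(lat, lon):
    """Map a coordinate to an integer class label on a uniform lat/lon grid."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinate out of range")
    # Clamp so that the upper edges (lat=90, lon=180) fall in the last cell.
    row = min(int((lat + 90.0) / 180.0 * ROWS), ROWS - 1)
    col = min(int((lon + 180.0) / 360.0 * COLS), COLS - 1)
    return row * COLS + col


origin_label = cell_id(0.0, 0.0)
print(origin_label)
```

With labels like these, a classifier can score cells for a query, and only listings in the top-scoring cells need to reach the ranking stage.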
