AI论文简报

February 12, 2026

Text diffusion models are no longer just a proof of concept

  • Text diffusion models are no longer just a proof of concept. LLaDA2.1's 100B model reaches 892 TPS on code tasks and is the first dLLM trained with large-scale RL.
  • Open-source joint video+audio generation is finally here. MOVA generates visuals, dialogue, sound effects, and music in a single model.
  • GUI agents that actually work on real phones. UI-Venus-1.5 comes in three variants from 2B to 30B, sets new SOTA on ScreenSpot-Pro and AndroidWorld, and proved usable in hands-on tests with Chinese mobile apps.
  • When post-training saturates, use the model's own "weak versions" as the teacher. WMSS uses earlier model states to recover forgotten capabilities and keep improving, at zero additional inference cost.

Read more →
