Text Diffusion Models Are No Longer Just a Proof of Concept

        February 12, 2026

Text diffusion models are no longer a proof of concept. LLaDA2.1's 100B model reaches 892 tokens per second (TPS) on code tasks and is the first diffusion LLM (dLLM) trained with large-scale RL.

Open-source video+audio joint generation is here. MOVA generates visuals, dialogue, sound effects, and music in a single model.

GUI agents that actually work on real phones. UI-Venus-1.5 comes in three variants from 2B to 30B, sets new SOTA on ScreenSpot-Pro and AndroidWorld, and proved usable in hands-on tests with Chinese mobile apps.

When post-training saturates, teach from your own weak checkpoints. WMSS uses earlier model states to recover forgotten capabilities at zero inference cost.
