Medical AI, Scientific Agents, Robot VLA: Vertical Domains Are Becoming AI's Main Battleground
- A medical multimodal model now outperforms GPT-4o-class closed-source systems. MedXIAOHE chains entity-aware pretraining with RL-based reasoning to cover everything from rare diseases to long-form report generation.
- Xiaomi open-sources a robot VLA model that runs real-time bimanual manipulation on a consumer GPU. The key is asynchronous execution baked into training, not just deployment.
- Scientific tool use is agents' Achilles' heel. SciAgentGym builds 1,780 domain tools to stress-test agents, and a fine-tuned 8B model overtakes a 235B one.
- RL fine-tuning boosts VLM benchmark scores, but the faithfulness of the reasoning chain degrades, bringing an accuracy-versus-reliability trade-off to the surface.