ZJU, CAS, and Alibaba Open-Source Embodied-Reasoner, Outperforming OpenAI o1

A joint team from Zhejiang University, the Chinese Academy of Sciences, and Alibaba DAMO Academy has open-sourced Embodied-Reasoner, a multimodal embodied reasoning model. The model achieved an 80.96% task success rate in AI2-THOR simulations, surpassing OpenAI o1 (71.73%), o3-mini (56.55%), and Claude-3.7 (67.70%). The team attributes the gain to a three-stage training pipeline: imitation learning, self-exploration with rejection sampling, and self-correction through reflection tuning. Embodied-Reasoner also performed well in real-world object search tasks.
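To put the comparison in concrete terms, the short sketch below works out how far Embodied-Reasoner leads each baseline, in percentage points, using only the success rates quoted above. The function name `margins_over_baselines` is a hypothetical helper chosen for illustration and is not part of the released project.

```python
# Task success rates (%) in AI2-THOR, as quoted in the summary above.
success_rates = {
    "Embodied-Reasoner": 80.96,
    "OpenAI o1": 71.73,
    "o3-mini": 56.55,
    "Claude-3.7": 67.70,
}


def margins_over_baselines(rates, model="Embodied-Reasoner"):
    """Return the model's lead over every other entry, in percentage points."""
    ours = rates[model]
    return {
        name: round(ours - rate, 2)
        for name, rate in rates.items()
        if name != model
    }


margins = margins_over_baselines(success_rates)
for name, gap in margins.items():
    print(f"{name}: +{gap:.2f} points")
```

The leads come out to roughly 9, 24, and 13 percentage points over o1, o3-mini, and Claude-3.7 respectively. These are absolute differences in success rate, not relative improvements.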

Why it matters

Embodied-Reasoner advances the state of the art in embodied AI with a fully open-source model that improves spatial reasoning, search efficiency, and self-correction in interactive physical tasks. That could accelerate the development of more capable and reliable AI agents for real-world applications.
