Embodied-Reasoner Outperforms OpenAI o1 in Embodied AI Tasks
A joint team from Zhejiang University, the Chinese Academy of Sciences, and Alibaba DAMO Academy has open-sourced Embodied-Reasoner, a multimodal embodied reasoning model that achieved an 80.96% task success rate in the AI2-THOR simulator, ahead of OpenAI o1 (71.73%), o3-mini (56.55%), and Claude-3.7 (67.70%). The model combines visual search, spatial reasoning, and self-correction, which it acquires through a three-stage training pipeline: imitation learning, self-exploration, and reflection tuning.
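To put the reported numbers side by side, here is a minimal Python sketch that computes the percentage-point gap between Embodied-Reasoner and each baseline. The success rates are the ones quoted above; the dictionary and function names are illustrative assumptions, not part of the released project.

```python
# Success rates (%) as quoted in the text above.
# The names SUCCESS_RATES and margins_over_baselines are hypothetical,
# chosen only for this illustration.
SUCCESS_RATES = {
    "Embodied-Reasoner": 80.96,
    "OpenAI o1": 71.73,
    "o3-mini": 56.55,
    "Claude-3.7": 67.70,
}


def margins_over_baselines(rates, model="Embodied-Reasoner"):
    """Return the percentage-point gap between `model` and every other entry."""
    ours = rates[model]
    return {
        name: round(ours - rate, 2)
        for name, rate in rates.items()
        if name != model
    }


if __name__ == "__main__":
    for name, gap in margins_over_baselines(SUCCESS_RATES).items():
        print(f"{name}: +{gap:.2f} percentage points")
```

These gaps are absolute differences in success rate on the same simulator benchmark, not relative improvements.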
On these simulated interactive tasks, Embodied-Reasoner outperforms the proprietary reasoning models it was compared against. Because the model is open source, it and its three-stage training approach give researchers a concrete starting point for robots and other agents that interact with the real world.