Embodied AI · ENACT

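The ENACT benchmark evaluates whether VLMs can track, from a robot's egocentric view, how a household environment's state changes after actions are taken.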

Fei-Fei Li (@drfeifei) · November 25, 2025 · 20 min read · English

Original post by Fei-Fei Li (@drfeifei), in English:

Our most recent work that benchmarks modern VLM and their efficacy for long horizon household activities in robotic learning, using BEHAVIOR benchmark environment.👇

Qineng Wang (@qineng_wang)

Most VLM benchmarks watch the world; few ask how actions change it from a robot's eye.

Embodied cognition tells us that intelligence isn't just watching – it's enacted through interaction.

👉We introduce ENACT: A benchmark that tests if VLMs can track the evolution of a home-scale environment from a robot's egocentric view.

enact-embodied-cognition.github.io
enact-embodied-cognition.github.io/enact.pdf

1/N


RoboRadar Smart Score

AI relevance score: 60/100 (low relevance)

AI Summary

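Qineng Wang announced the latest work on social media, introducing an embodied-cognition benchmark named ENACT. The work targets modern vision-language models (VLMs), uses the BEHAVIOR benchmark environment, and focuses on long-horizon household activities in robot learning. The original post stresses that many VLM benchmarks mainly have models "watch the world" and rarely examine whether a model understands how a robot's actions change the environment. The author draws on embodied…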

Why It Matters

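A useful reminder for warehouse-robotics teams: when evaluating multimodal models, check how well they track state after an action, not just how well they recognize what is in the frame.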

Key Points

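01 ENACT tests whether VLMs can track the state evolution of a home-scale environment from a robot's egocentric view.
02 The work uses the BEHAVIOR benchmark environment and targets long-horizon household activities in robot learning.
03 The original post shifts the evaluation focus from "watching the world" to "understanding how actions change the world," in line with the embodied-cognition perspective.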
