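The ENACT benchmark evaluates whether VLMs can track how a household environment's state changes after actions, as seen from a robot's first-person view.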

Our most recent work benchmarks modern VLMs and their efficacy on long-horizon household activities in robot learning, using the BEHAVIOR benchmark environment. 👇
Qineng Wang (@qineng_wang)
Most VLM benchmarks watch the world; few ask how actions change it from a robot's eye.
Embodied cognition tells us that intelligence isn't just watching – it's enacted through interaction.
👉We introduce ENACT: A benchmark that tests if VLMs can track the evolution of a home-scale environment from a robot's egocentric view.
enact-embodied-cognition.github.io
enact-embodied-cognition.github.io/enact.pdf
1/N