Can an Embodied Agent Find Your Cat-shaped Mug? LLM-Based Zero-Shot Object Navigation


Overview

We present LGX, a novel algorithm for Object Goal Navigation in a language-driven, zero-shot manner, where an embodied agent navigates to an arbitrarily described target object in a previously unexplored environment. Our approach leverages the capabilities of Large Language Models (LLMs) to make navigational decisions, mapping the LLM's implicit knowledge about the semantic context of the environment into sequential inputs for robot motion planning. We conduct experiments in both simulated and real-world environments, and highlight factors that influence the decision-making capabilities of LLMs for zero-shot navigation.
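To make the idea concrete, below is a minimal sketch (not the paper's implementation) of this LLM-driven decision step. The names `build_prompt`, `choose_direction`, and the stubbed `fake_llm` are illustrative assumptions: the agent's local semantic context (objects it can see in each direction) is serialized into a prompt, the LLM is asked which direction is most promising for the target, and the validated answer becomes the next input to the motion planner.

```python
from typing import Callable, Dict, List

def build_prompt(target: str, objects_by_direction: Dict[str, List[str]]) -> str:
    """Serialize the agent's local semantic context into a question for the LLM."""
    lines = [f"I am a robot looking for a {target}."]
    for direction, objects in objects_by_direction.items():
        seen = ", ".join(objects) if objects else "nothing"
        lines.append(f"To my {direction} I can see: {seen}.")
    lines.append("Which single direction should I explore next to find the target?")
    return "\n".join(lines)

def choose_direction(
    llm: Callable[[str], str],
    target: str,
    objects_by_direction: Dict[str, List[str]],
) -> str:
    """Ask the LLM for the most semantically promising direction, then
    validate the free-form answer against the legal choices before
    handing it to the motion planner."""
    answer = llm(build_prompt(target, objects_by_direction)).strip().lower()
    for direction in objects_by_direction:
        if direction in answer:
            return direction
    # Fall back to the first available direction if the answer is unparseable.
    return next(iter(objects_by_direction))

if __name__ == "__main__":
    # Stub LLM for illustration; a real system would query a hosted model here.
    fake_llm = lambda prompt: "You should go left, toward the kitchen counter."
    context = {
        "left": ["coffee machine", "counter"],
        "right": ["sofa", "television"],
        "front": ["doorway"],
    }
    print(choose_direction(fake_llm, "cat-shaped mug", context))  # -> "left"
```

Repeating this choose-then-move loop at each step yields the sequential inputs for motion planning described above; the validation step matters because LLM outputs are free-form text rather than guaranteed action labels.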

Video

Paper

Can an Embodied Agent Find Your Cat-shaped Mug? LLM-Based Zero-Shot Object Navigation.
Vishnu Sashank Dorbala, James F. Mullen Jr., Dinesh Manocha

Code

Code can be found here.