SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
Hang Yin*, Xiuwei Xu*$^\dagger$, Zhenyu Wu, Jie Zhou, Jiwen Lu$^\ddagger$
* Equal contribution
We propose a zero-shot object-goal navigation framework that constructs an online 3D scene graph to prompt LLMs. The method can be applied directly to different kinds of scenes and object categories without training. A Chinese-language explanation is also available.
- [2024/12/30]: We updated the code and simplified the installation.
- [2024/09/26]: SG-Nav was accepted to NeurIPS 2024!
The demos are large and may take a moment to load. See the project home page for more complete demos and a detailed introduction.
Step 1 (Dataset)
Download the Matterport3D scene dataset and the object-goal navigation episode dataset from here.
Set your scene dataset path SCENES_DIR and episode dataset path DATA_PATH in the config file configs/challenge_objectnav2021.local.rgbd.yaml.
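If you prefer to set the two paths from a script, the following is a minimal sketch. It assumes each key sits on its own `KEY: value` line in the YAML file and rewrites that line as plain text; the `DATASET:` nesting in the sample and the example path are hypothetical placeholders, not values taken from this repository.

```python
import re


def set_yaml_scalar(text: str, key: str, value: str) -> str:
    """Rewrite every `key: ...` line in `text`, keeping its indentation.

    This is a plain-text edit, not a YAML parser: anything after the colon
    on a matching line (including trailing comments) is replaced.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*:).*$", flags=re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: f'{m.group(1)} "{value}"', text)
    if count == 0:
        raise KeyError(f"{key} not found in config text")
    return new_text


# Hypothetical config fragment, for illustration only.
sample = "DATASET:\n  SCENES_DIR: old/scenes\n  DATA_PATH: old/ep.json.gz\n"
updated = set_yaml_scalar(sample, "SCENES_DIR", "/data/MatterPort3D")
```

To apply it to the real file, read configs/challenge_objectnav2021.local.rgbd.yaml, call the helper once for SCENES_DIR and once for DATA_PATH, and write the result back.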
The structure of the dataset is outlined as follows:
MatterPort3D/
├── mp3d/
│   ├── 2azQ1b91cZZ/
│   │   └── 2azQ1b91cZZ.glb
│   ├── 8194nk5LbLH/
│   │   └── 8194nk5LbLH.glb
│   └── ...
└── objectnav/
    └── mp3d/
        └── v1/
            └── val/
                ├── content/
                │   ├── 2azQ1b91cZZ.json.gz
                │   ├── 8194nk5LbLH.json.gz
                │   └── ...
                └── val.json.gz
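Before launching a run, it can help to confirm that the download matches this layout. The sketch below is one way to do that; the function name `missing_dataset_paths` and the idea of passing a list of scene IDs are our own illustration, not part of the SG-Nav code.

```python
from pathlib import Path


def missing_dataset_paths(root, scene_ids):
    """Return the expected files that are absent under `root`.

    For each scene ID this checks the scene mesh (mp3d/<id>/<id>.glb) and
    the per-scene episode file (objectnav/mp3d/v1/val/content/<id>.json.gz),
    plus the top-level val.json.gz. Paths are returned relative to `root`.
    """
    root = Path(root)
    val_dir = root / "objectnav" / "mp3d" / "v1" / "val"
    expected = [val_dir / "val.json.gz"]
    for sid in scene_ids:
        expected.append(root / "mp3d" / sid / f"{sid}.glb")
        expected.append(val_dir / "content" / f"{sid}.json.gz")
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]
```

For example, `missing_dataset_paths("MatterPort3D", ["2azQ1b91cZZ", "8194nk5LbLH"])` returns an empty list when both scenes and their episode files are in place.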
Step 2 (Environment)
Create a conda environment with Python 3.9.
conda create -n SG_Nav python==3.9
Step 3 (Simulator)
Install habitat-sim==0.2.4 and habitat-lab.
conda install habitat-sim==0.2.4 -c conda-forge -c aihabitat
pip install -e habitat-lab
Then replace agent/agent.py in the installed habitat-sim package with tools/agent.py from our repository.
HABITAT_SIM_PATH=$(pip show habitat_sim | grep 'Location:' | awk '{print $2}')
cp tools/agent.py ${HABITAT_SIM_PATH}/habitat_sim/agent/
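The same replacement can be done from Python, which avoids parsing `pip show` output and keeps a copy of the original file. This is a sketch under our own assumptions: the helper names `find_package_dir` and `replace_with_backup` are illustrative, and the `.bak` backup is an addition the shell commands above do not make.

```python
import importlib.util
import shutil
from pathlib import Path


def find_package_dir(name):
    """Return the directory of an installed package, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def replace_with_backup(src, dst):
    """Copy `src` over `dst`; save the original as `<dst>.bak` the first time."""
    src, dst = Path(src), Path(dst)
    backup = dst.with_name(dst.name + ".bak")
    if dst.exists() and not backup.exists():
        shutil.copy2(dst, backup)
    shutil.copy2(src, dst)
    return backup


# Intended use, run from the repository root inside the conda environment:
#   pkg = find_package_dir("habitat_sim")
#   if pkg is not None:
#       replace_with_backup("tools/agent.py", pkg / "agent" / "agent.py")
```

Keeping the backup makes it easy to restore the stock agent.py if you later reuse the environment for another project.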
Step 4 (Package)
Install PyTorch (<=1.9), PyTorch3D, and Faiss, then install the remaining packages.
conda install -c pytorch faiss-gpu=1.8.0
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
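After installation, a quick version check can catch an accidental upgrade of PyTorch pulled in by another package. The sketch below reads "<=1.9" as "the 1.9.x series or older", which is our interpretation based on the command above installing 1.9.1; the helper names are illustrative.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Parse the leading numeric part: '1.9.1+cu111' -> (1, 9, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def within_series(version, max_major_minor):
    """True if the major.minor of `version` does not exceed `max_major_minor`."""
    return release_tuple(version)[:2] <= tuple(max_major_minor)


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Intended use inside the SG_Nav environment:
#   v = installed_version("torch")
#   print(v, within_series(v, (1, 9)) if v else "torch is not installed")
```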
Install Grounded SAM and download its checkpoints.
pip install -e segment_anything
pip install --no-build-isolation -e GroundingDINO
wget -O segment_anything/sam_vit_h_4b8939.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget -O GroundingDINO/groundingdino_swint_ogc.pth https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
Install the GLIP model and download the GLIP checkpoint.
cd GLIP
python setup.py build develop --user
mkdir MODEL
cd MODEL
wget https://huggingface.co/GLIPModel/GLIP/resolve/main/glip_large_model.pth
cd ../../
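The three checkpoints are large, and `wget -O` can leave an empty file behind when a download fails. A small check such as the sketch below, run from the repository root, can confirm they are all present; the paths come from the commands above, while the function name `missing_checkpoints` is our own.

```python
from pathlib import Path

# Checkpoint locations implied by the wget commands in this step.
CHECKPOINTS = (
    "segment_anything/sam_vit_h_4b8939.pth",
    "GroundingDINO/groundingdino_swint_ogc.pth",
    "GLIP/MODEL/glip_large_model.pth",
)


def missing_checkpoints(repo_root="."):
    """Return checkpoint paths that are absent or zero bytes under `repo_root`."""
    root = Path(repo_root)
    missing = []
    for rel in CHECKPOINTS:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing
```

An empty return value means all three files exist and are non-empty; it does not verify that the downloads are complete or uncorrupted.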
Install Ollama and pull the llama3.2-vision model.
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2-vision
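To confirm that the model was pulled and the Ollama server is reachable, you can query the server's model list. The sketch below assumes Ollama is running locally on its default address (http://localhost:11434) and uses its `/api/tags` endpoint; the helper names are illustrative and not part of SG-Nav.

```python
import json
import urllib.request


def fetch_local_models(base_url="http://localhost:11434", timeout=5.0):
    """Return the parsed JSON listing of models known to the local server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def has_model(tags, name):
    """True if `name` is listed, matching with or without a ':tag' suffix."""
    for entry in tags.get("models", []):
        listed = entry.get("name", "")
        if listed == name or listed.split(":", 1)[0] == name:
            return True
    return False


# Intended use, with the Ollama server running:
#   print(has_model(fetch_local_models(), "llama3.2-vision"))
```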
Run SG-Nav:
python SG_Nav.py --visualize
@article{yin2024sgnav,
title={SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation},
author={Hang Yin and Xiuwei Xu and Zhenyu Wu and Jie Zhou and Jiwen Lu},
journal={arXiv preprint arXiv:2410.08189},
year={2024}
}