Abstract: Natural language-conditioned reinforcement learning (NLC-RL) empowers embodied agents to complete various tasks by following human instructions. However, unbounded natural language instructions introduce considerable complexity for an agent solving concrete RL tasks, which can distract policy learning from completing the task. Consequently, extracting an effective task representation from human instructions emerges as the critical component of NLC-RL. While previous methods have attempted to address this issue by learning task-related representations with large language models (LLMs), they rely heavily on pre-collected task data and require extra training procedures. In this study, we uncover the inherent capability of LLMs to generate task representations and present a novel method, in-context learning embedding as task representation (InCLET). InCLET is grounded on the finding that LLM in-context learning over trajectories greatly helps represent tasks. We therefore first employ the LLM to imagine task trajectories following the natural language instruction, then use the LLM's in-context learning to generate task representations, and finally aggregate and project them into a compact, low-dimensional task representation. This representation is then used to train an instruction-following agent. We conduct experiments on various embodied control environments, and the results show that InCLET produces effective task representations. Furthermore, these representations significantly improve RL training efficiency compared with baseline methods.
- Please complete the following steps to install the conda environment and related Python packages.
- Package install:

  pip install -r requirements.txt
- The environments used in this work require MuJoCo, the CLEVR-Robot Environment, and an LLM as dependencies. Please set them up following these instructions (a minimal setup sketch follows this list):
  - Instructions for MuJoCo: https://mujoco.org/
  - Instructions for CLEVR-Robot Environment: https://github.com/google-research/clevr_robot_env
  - Llama3-8B: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
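The linked pages are authoritative; as a rough sketch of the non-LLM dependencies (the package name, clone location, and PYTHONPATH handling below are assumptions, not taken from this repo):

```bash
# Rough sketch only -- follow the linked instructions for the authoritative steps.
# MuJoCo: recent releases ship official Python bindings on PyPI
# (older setups may use mujoco-py instead).
pip install mujoco

# CLEVR-Robot Environment: clone the repo and make it importable.
git clone https://github.com/google-research/clevr_robot_env.git
export PYTHONPATH="$PWD:$PYTHONPATH"
```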
Before training the Goal-Conditioned Policy (GCP), we need to train a TL translator using the process described in step 4. When training of the TL translator model is complete, please place the model in the designated location (see the sketch below):
<project_path>/models/
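A minimal sketch, assuming the translator training step produces a single checkpoint file (the filename below is hypothetical):

```bash
# Hypothetical checkpoint name -- use whatever the TL translator training step produces.
mkdir -p <project_path>/models/
cp tl_translator.pt <project_path>/models/
```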
- Before beginning the training process, please ensure that you have downloaded Llama3-8B from https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct and update the model path in the following file (a download sketch follows):
algorithms/translation/llm_encoder.py
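A minimal download sketch using the Hugging Face CLI; the local directory is an assumption, and the exact variable to change inside llm_encoder.py is not specified here, so locate it in the file:

```bash
# Llama 3 is a gated model, so authenticate first with an approved token.
huggingface-cli login
huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct \
  --local-dir models/Meta-Llama-3-8B-Instruct
# Then point the model path in algorithms/translation/llm_encoder.py at that directory;
# the exact variable name in the file may differ, e.g. find candidates with:
grep -n -i "llama" algorithms/translation/llm_encoder.py
```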
- FrankaKitchen
  - Instruction-following policy: run the command below in a shell (a usage example follows this list):

    python kitchen_train.py --seed <SEED>

  - The trained goal-conditioned-policy models will be saved at `kitchen_model`.
  - The TensorBoard logs of the goal-conditioned policy will be saved at `kitchen_train`.
  - The evaluation results of the goal-conditioned policy will be saved at `kitchen_callback`.
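For example, with a concrete seed (the seed value is arbitrary; TensorBoard is only needed for monitoring):

```bash
# Train the FrankaKitchen instruction-following policy with seed 0,
# then inspect the training curves written to kitchen_train.
python kitchen_train.py --seed 0
tensorboard --logdir kitchen_train
```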
- CLEVR-Robot
  - Instruction-following policy: run the command below in a shell (a usage example follows this list):

    python ball_train.py --seed <SEED>

  - The trained goal-conditioned-policy models will be saved at `ball_model`.
  - The TensorBoard logs of the goal-conditioned policy will be saved at `ball_train`.
  - The evaluation results of the goal-conditioned policy will be saved at `ball_callback`.
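For example, to sweep several seeds (the seed values are illustrative):

```bash
# Train the CLEVR-Robot instruction-following policy over a few seeds,
# then inspect the training curves written to ball_train.
for seed in 0 1 2; do
  python ball_train.py --seed "$seed"
done
tensorboard --logdir ball_train
```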