Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning

This repository contains the official implementation of Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning by Kwanyoung Park and Youngwoon Lee.

If you use this code for your research, please consider citing our paper:

@article{park2024tackling,
  title={Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning},
  author={Kwanyoung Park and Youngwoon Lee},
  journal={arXiv preprint arXiv:2407.00699},
  year={2024}
}

How to run the code

Install dependencies

conda create -n LEQ python=3.9
conda activate LEQ

pip install -r requirements.txt

# Install jax (https://github.com/google/jax#pip-installation-gpu-cuda)
pip install "jax[cuda]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

# Install glew & others
conda install -c conda-forge glew
conda install -c conda-forge mesalib
conda install -c menpo glfw3
export CPATH=$CONDA_PREFIX/include
pip install patchelf

# Re-pin package versions
pip install git+https://github.com/Farama-Foundation/d4rl@master#egg=d4rl
pip install numpy==1.23.0
pip install scipy==1.10.1

For other CUDA configurations, see the JAX installation instructions (https://github.com/google/jax#pip-installation-gpu-cuda).
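After installation, it may help to confirm that the pinned versions are the ones actually installed, since later installs can change them. The following is a minimal sketch using only the Python standard library; the helper name check_pins and the PINS table are illustrative and not part of this repository, and the version numbers are copied from the install commands above.

```python
from importlib import metadata

# Pins copied from the install commands in this README.
PINS = {"jax": "0.4.8", "numpy": "1.23.0", "scipy": "1.10.1"}


def check_pins(pins):
    """Return {package: (installed_or_None, expected)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (installed, expected)
    return mismatches


# Report every package whose installed version differs from its pin.
for name, (installed, expected) in check_pins(PINS).items():
    print(f"{name}: installed={installed}, expected={expected}")
```

If nothing is printed, the three pinned packages match. This only checks version strings; it does not verify that JAX can see a GPU.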

Pretrain world model

We train the world model with the training script from OfflineRL-Kit.

For convenience, we provide run_dynamics.py, which trains the model using OfflineRL-Kit.

cd ..
git clone https://github.com/yihaosun1124/OfflineRL-Kit
cd OfflineRL-Kit
python setup.py install
cp ../LEQ/dynamics/run_dynamics.py run_example/run_dynamics.py
cp -r ../LEQ/d4rl_ext .

You can now train the model with run_dynamics.py. For example:

python run_example/run_dynamics.py --task antmaze-medium-replay-v2 --seed 3
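To train world models for several seeds, the command above can be generated in a loop. This is a minimal sketch: the helper build_commands and the seed values 0, 1, 2 are assumptions for illustration, and only the --task and --seed flags shown above are used. It prints the commands for review instead of running them.

```python
import shlex


def build_commands(task, seeds):
    """Return one argv list per seed for run_example/run_dynamics.py."""
    return [
        ["python", "run_example/run_dynamics.py", "--task", task, "--seed", str(seed)]
        for seed in seeds
    ]


# Seeds here are illustrative; choose the ones you need.
for cmd in build_commands("antmaze-medium-replay-v2", seeds=[0, 1, 2]):
    print(shlex.join(cmd))
```

To execute the commands instead of printing them, each list can be passed to subprocess.run(cmd, check=True) from the OfflineRL-Kit directory.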

Run training

LEQ

PYTHONPATH='.' python train/train_LEQ.py --env_name=walker2d-medium-replay-v2 --expectile 0.5
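The --expectile flag is not documented in this README. Assuming it sets the level tau of an asymmetric squared (expectile) loss, as used in IQL, on which this implementation is based, the following pure-Python sketch shows how such a loss behaves. The function expectile_loss is hypothetical and is not the repository's JAX code; see the paper for the actual objective.

```python
def expectile_loss(prediction, targets, tau):
    """Mean asymmetric squared error between one prediction and a list of targets.

    Targets above the prediction are weighted by tau and targets below it
    by (1 - tau), so tau = 0.5 reduces to half the mean squared error.
    """
    total = 0.0
    for target in targets:
        diff = target - prediction
        weight = tau if diff > 0 else 1.0 - tau
        total += weight * diff * diff
    return total / len(targets)


targets = [0.0, 1.0]
# With tau < 0.5, a low prediction is penalized less than an equally distant high one.
print(expectile_loss(0.25, targets, tau=0.1))
print(expectile_loss(0.75, targets, tau=0.1))
```

Under this assumption, values of tau below 0.5 pull the estimate toward the lower end of the targets, and tau = 0.5 weights both sides equally.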

MOBILEQ (see the ablation study section of the paper for details)

PYTHONPATH='.' python train/train_MOBILEQ.py --env_name=Hopper-v3-medium --beta 1.0

MOBILE (Jax implementation of Sun et al.)

PYTHONPATH='.' python train/train_MOBILE.py --env_name=antmaze-large-play-v2 --beta 1.0

References

The implementation is based on IQL.
The MOBILE implementation is from OfflineRL-Kit.
