
vime-pytorch

This repo contains the PyTorch implementation of two Reinforcement Learning algorithms:

  • PPO (Proximal Policy Optimization) (paper)
  • VIME-PPO (Variational Information Maximizing Exploration) (paper)

The code makes use of openai/baselines.

The PPO implementation is mainly taken from ikostrikov/pytorch-a2c-ppo-acktr-gail.

The main novelty of this repository is the implementation of VIME's exploration strategy on top of the PPO algorithm.
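At its core, VIME rewards the agent for transitions that are informative about the environment's dynamics: the intrinsic bonus is the KL divergence between the Bayesian dynamics model's weight posterior after and before updating on the latest transition. For a fully-factorized Gaussian posterior (as in the VIME paper), that KL has a closed form. The sketch below is illustrative only and not taken from this repo; the function name and argument layout are assumptions.

```python
import math

def diag_gaussian_kl(mu_new, sig_new, mu_old, sig_old):
    """KL( N(mu_new, sig_new^2) || N(mu_old, sig_old^2) ) for diagonal
    Gaussians -- the form of information gain VIME computes over a
    Bayesian dynamics network's factorized weight posterior.
    Names and signature are illustrative, not this repo's API."""
    kl = 0.0
    for mn, sn, mo, so in zip(mu_new, sig_new, mu_old, sig_old):
        # Per-dimension closed-form KL between two univariate Gaussians.
        kl += math.log(so / sn) + (sn ** 2 + (mn - mo) ** 2) / (2 * so ** 2) - 0.5
    return kl

# Unchanged posterior => zero information gain, hence zero intrinsic bonus.
print(diag_gaussian_kl([0.0], [1.0], [0.0], [1.0]))  # 0.0
```

The total reward the policy optimizes is then the extrinsic reward plus `eta` times this (normalized) information gain.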

Requirements

To install the requirements, run:

pip install -r requirements.txt

If you don't have MuJoCo installed, follow the instructions here.

If you run into issues with OpenAI baselines, try installing it from source:

# Baselines for Atari preprocessing
git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

Instructions

To run InvertedDoublePendulum-v2 with VIME, use the following command:

python main.py --env-name InvertedDoublePendulum-v2 --algo vime-ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 --num-env-steps 1000000 --use-linear-lr-decay --no-cuda --log-dir /tmp/doublependulum/vimeppo/vimeppo-0 --seed 0 --use-proper-time-limits --eta 0.01

To run experiments with plain PPO instead, just replace vime-ppo with ppo.

Results

For standard gym environments, I used --eta 0.01.

MountainCar-v0

InvertedDoublePendulum-v2

For sparse gym environments, I used --eta 0.0001.

MountainCar-v0-Sparse

HalfCheetah-v3-Sparse

[the number in parentheses represents how many experiments were run]
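The two eta values above reflect a trade-off: eta scales the intrinsic bonus against the extrinsic reward, and in sparse environments the extrinsic signal is mostly zero, so a large eta would let novelty-seeking dominate. A toy illustration of the scaling (the function and the numbers are hypothetical, not this repo's code):

```python
# Illustrative only: how eta weights the intrinsic bonus against the
# extrinsic reward. `augmented` is a hypothetical helper, not repo code.
def augmented(r_ext, info_gain, eta):
    return r_ext + eta * info_gain

# Dense setting (e.g. a pendulum giving ~9 reward per step):
# the bonus is a small nudge on top of a strong extrinsic signal.
dense = augmented(9.0, 0.5, eta=0.01)     # 9.005

# Sparse setting: extrinsic reward is 0 until the goal, so a smaller eta
# keeps the bonus from overwhelming the rare goal reward.
sparse = augmented(0.0, 0.5, eta=0.0001)  # 5e-05
```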

Note:

Any gym-compatible environment can be run, but the hyperparameters have not been tested for all of them.

However, the parameters used with the InvertedDoublePendulum-v2 example in the Instructions are generally good enough for other MuJoCo environments.

TODO:

  • Integrate more args into the command line
