Meta-Reinforcement Learning Algorithms

A PyTorch implementation of meta-reinforcement learning algorithms: RL^2 PPO, MGRL, SNAIL, and VariBAD.

Setup

Install the packages using the requirements.txt file.

# Using conda
conda create --name meta_rl --file requirements.txt
# Or using pip
pip install -r requirements.txt

Usage

Run experiments by using the following example command:

python main.py --name experiment_name -c configs/rl2_ppo.yml

Algorithms

RL^2 Proximal Policy Optimization (PPO)
Meta-Gradient Reinforcement Learning (A2C)
VariBAD
SNAIL

Ideas for:

Proximal Policy Optimization with Episodic Planning Networks (EPNs)

Results

Initial results show the convergence of meta-gradient reinforcement learning with A2C. The inner loop optimizes the agent on the CartPole environment; the outer loop cross-validates the discount factor gamma and updates it by gradient descent. In the current setting, performance is similar to regular A2C, and the implementation might benefit from conditioning the value and policy networks on gamma embeddings.
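
The inner/outer loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the linear actor and critic, the synthetic trajectories standing in for CartPole rollouts, and the names discounted_returns, a2c_loss, meta_gradient_step, and fake_trajectory are all assumptions made for the example. It also reuses the full A2C loss as the outer objective for brevity, which is a simplification.

```python
import torch

torch.manual_seed(0)


def discounted_returns(rewards, gamma):
    """G_t = r_t + gamma * G_{t+1}, kept differentiable w.r.t. gamma."""
    returns, running = [], torch.zeros(())
    for r in rewards.flip(0):  # iterate from the last step backwards
        running = r + gamma * running
        returns.append(running)
    return torch.stack(returns[::-1])


def a2c_loss(params, obs, actions, rewards, gamma):
    """A2C loss for a linear actor and critic on a single trajectory."""
    w_pi, w_v = params
    log_probs = torch.log_softmax(obs @ w_pi, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    values = (obs @ w_v).squeeze(1)
    returns = discounted_returns(rewards, gamma)
    advantages = returns - values.detach()
    policy_loss = -(chosen * advantages).mean()
    value_loss = (returns - values).pow(2).mean()
    return policy_loss + 0.5 * value_loss


def meta_gradient_step(params, gamma, batch, val_batch,
                       lr=0.1, meta_lr=0.01, gamma_ref=0.99):
    """One inner update of the parameters, then one outer update of gamma."""
    # Inner loop: differentiable gradient step, so new_params depends on gamma.
    inner = a2c_loss(params, *batch, gamma)
    grads = torch.autograd.grad(inner, params, create_graph=True)
    new_params = [p - lr * g for p, g in zip(params, grads)]

    # Outer loop: evaluate the updated parameters on a second trajectory
    # with a fixed reference discount (assumed value), then backpropagate to gamma.
    outer = a2c_loss(new_params, *val_batch, torch.tensor(gamma_ref))
    (gamma_grad,) = torch.autograd.grad(outer, gamma)
    with torch.no_grad():
        new_gamma = (gamma - meta_lr * gamma_grad).clamp(0.0, 1.0)
    new_params = [p.detach().requires_grad_() for p in new_params]
    return new_params, new_gamma.requires_grad_()


def fake_trajectory(steps=20, obs_dim=4, n_actions=2):
    """Synthetic stand-in for a CartPole rollout (reward of +1 per step)."""
    obs = torch.randn(steps, obs_dim)
    actions = torch.randint(0, n_actions, (steps,))
    rewards = torch.ones(steps)
    return obs, actions, rewards


params = [(0.1 * torch.randn(4, 2)).requires_grad_(),
          (0.1 * torch.randn(4, 1)).requires_grad_()]
gamma = torch.tensor(0.9, requires_grad=True)
for _ in range(3):
    params, gamma = meta_gradient_step(
        params, gamma, fake_trajectory(), fake_trajectory())
print(float(gamma))
```

The key design point is create_graph=True in the inner step: it keeps the updated parameters as a function of gamma, so the outer loss can be differentiated with respect to gamma. A real run would replace fake_trajectory with environment rollouts and use separate samples for the inner and outer losses.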

References

Achiam, J. (2018). Spinning Up in Deep Reinforcement Learning. https://spinningup.openai.com/en/latest/index.html
Wang, J. X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., Blundell, C., Kumaran, D., & Botvinick, M. (2017). Learning to reinforcement learn (arXiv:1611.05763). arXiv. http://arxiv.org/abs/1611.05763
Duan, Y., Schulman, J., Chen, X., Bartlett, P. L., Sutskever, I., & Abbeel, P. (2016). RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning (arXiv:1611.02779). arXiv. https://doi.org/10.48550/arXiv.1611.02779
Zintgraf, L., Shiarlis, K., Igl, M., Schulze, S., Gal, Y., Hofmann, K., & Whiteson, S. (2020). VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning (arXiv:1910.08348). arXiv. https://doi.org/10.48550/arXiv.1910.08348
Mishra, N., Rohaninejad, M., Chen, X., & Abbeel, P. (2018). A Simple Neural Attentive Meta-Learner (arXiv:1707.03141). arXiv. http://arxiv.org/abs/1707.03141
Xu, Z., van Hasselt, H., & Silver, D. (2018). Meta-Gradient Reinforcement Learning (arXiv:1805.09801). arXiv. http://arxiv.org/abs/1805.09801
