Know an awesome toolkit in reinforcement learning? Tell us through the chat button on bottom right!



ChainerRL is a deep reinforcement learning library that implements various state-of-the-art deep reinforcement algorithms in Python using Chainer, a flexible deep learning framework.



ELF is an Extensive, Lightweight, and Flexible platform for game research. We have used it to build our Go playing bot, ELF OpenGo, which achieved a 14-0 record versus four global top-30 players in April 2018. The final score is 20-0 (each professional Go players play 5 games).


keras-rl implements some state-of-the art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras. Furthermore, keras-rl works with OpenAI Gym out of the box. This means that evaluating and playing around with different algorithms is easy.

OpenAI Baselines


OpenAI Baselines is a set of high-quality implementations of reinforcement learning algorithms. These algorithms will make it easier for the research community to replicate, refine, and identify new ideas, and will create good baselines to build research on top of. Our DQN implementation and its variants are roughly on par with the scores in published papers. We expect they will be used as a base around which new ideas can be added, and as a tool for comparing a new approach against existing ones.


Modularized implementation of popular deep RL algorithms by PyTorch. Easy switch between toy tasks and challenging games.


TensorForce is an open source reinforcement learning library focused on providing clear APIs, readability and modularisation to deploy reinforcement learning solutions both in research and practice. TensorForce is built on top of TensorFlow and compatible with Python 2.7 and >3.5 and supports multiple state inputs and multi-dimensional actions to be compatible with any type of simulation or application environment.



TRFL (pronounced "truffle") is a library built on top of TensorFlow that exposes several useful building blocks for implementing Reinforcement Learning agents. We formulate common RL update rules for these neural networks as differentiable loss functions, as is common in (un-)supervised learning. We find that loss functions are more modular and composable than traditional RL updates, and more natural when combining with supervised or unsupervised objectives.



Reinforcement learning framework and algorithms implemented in PyTorch.