Addition of Rainbow and Soft Actor-Critic to RL codebase
Hi! Maintaining mlpack’s tradition, this page contains weekly updates of my contributions to mlpack during Google Summer of Code (2020).
Here’s the output of the agents that I implemented this summer, solving some of the classical reinforcement learning environments.
I feel that these are among the most in-demand recent algorithms, so implementing them in mlpack was crucial.
Here’s the summary of what I’ve accomplished at the end of this summer.
- Improved the current QLearning implementation.
- Implemented Rainbow as an improvement on DQN. This includes adding the following as extensions:
  - Dueling DQN
  - Noisy DQN
  - Categorical DQN
  - N-step DQN
- Wrote test cases for each of the implementations, after tuning hyperparameters and testing each for several runs.
- Implemented Soft Actor-Critic (SAC) for continuous action space, along with its tests.
- Created detailed documentation for all the above implementations.
- Created documented Jupyter notebooks with solved examples of agents tackling classical reinforcement learning problems, using a TCP API to communicate with an OpenAI Gym instance.
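
For readers unfamiliar with the extensions listed above, here is a minimal sketch of two of the ideas: the N-step return and the dueling aggregation of value and advantage. It is plain Python written for illustration only, not the mlpack C++ code, and the function names `n_step_return` and `dueling_q_values` are hypothetical.

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """N-step target: r_0 + gamma*r_1 + ... + gamma^n * bootstrap_value.

    `rewards` holds the n rewards collected along the trajectory and
    `bootstrap_value` is the value estimate at the state reached after them.
    """
    ret = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


if __name__ == "__main__":
    # Three rewards of 1.0, then bootstrap from a value estimate of 10.0.
    print(n_step_return([1.0, 1.0, 1.0], 10.0, 0.5))
    # One state value and two per-action advantages.
    print(dueling_q_values(2.0, [1.0, 3.0]))
```

Subtracting the mean advantage is the usual way to keep the value and advantage streams identifiable; the choice of the mean here is an assumption about the variant being illustrated.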
The original project proposal can be found on the GSoC website here.
- Week 1 - Layout for Dueling and Noisy DQNs
- Week 2 and 3 - Finishing Dueling and Noisy DQNs
- Week 4 and 5 - Completed Multi-step DQN, C51 almost ready
- Week 6 and 7 - Training on gym_tcp_api, Layout for Soft Actor-Critic
- Week 8 and 9 - Soft Actor-Critic basic implementation complete, making solved example notebooks
- Week 10 and 11 - C51 merged, Soft Actor-Critic almost complete, three new solved notebooks added, bug fixes 🐛🐛
- Week 12 - Wrapping up
Links to open and merged pull requests can be found here.
There has been a lot of coding, experimentation, thousands of builds, and never-ending debugging, so much that it hardly feels real that it is all coming to an end. Special thanks to Marcus Edel, one of my mentors this year, who had the answers to almost all the problems I faced. With his and Rahul’s support, I had a very smooth and enjoyable experience, and I hardly ever got stuck anywhere.
I would also like to acknowledge the help from other members of the community, who were always available whenever needed, just a small chat away. I strongly intend to continue contributing to mlpack in all the ways I can, because it’s just too much fun. :)
Also, thanks to Google for this amazing opportunity and the generous funding.
This has been a summer worth remembering!