Himanshu Sahni

September 18, 2018

OpenAI Five and the limits of self-play

A few weeks ago, OpenAI attempted a major new milestone in AI development: a (nearly) full game of Dota 2 against some of the best human players. Although OpenAI Five was defeated by both of its professional opponents, the level of play was high, and at times the matches looked fairly even. This is remarkable, as the full game of Dota 2 is very complex. Even more incredibly, the agent was trained using a relatively simple and very general reinforcement learning algorithm, PPO (Proximal Policy Optimization).