OpenAI Gym Gives Reinforcement Learning A Work Out

Written by Mike James

Friday, 29 April 2016

When OpenAI, an open source AI initiative backed by Elon Musk, Sam Altman and Ilya Sutskever, was announced earlier in the year, I doubt anyone expected anything to be produced so quickly and certainly not something connected with reinforcement learning. OpenAI Gym is what it sounds like - an exercise facility for reinforcement learning.


Since the success of DeepMind's Deep Q-learning at playing arcade games, and of AlphaGo at Go in particular, the subject of reinforcement learning (RL) has gone from an academic backwater to the front line of AI.

The big problem is that reinforcement learning is a difficult technique to characterise. Put simply, an RL system learns not by being told how close it is to the desired result, but by receiving rewards based on its behaviour. Of course, this is largely how we learn, and if it can be made to work efficiently it promises us not just effective AI but new knowledge. For example, AlphaGo taught itself to play Go and in the process discovered for itself approaches to Go that humans had ignored.
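To make the idea of learning from rewards alone a little more concrete, here is a minimal sketch of an epsilon-greedy agent on a two-armed bandit. It is an illustration only and has nothing to do with how AlphaGo or OpenAI Gym work internally; the function name run_bandit, the reward probabilities and the other parameters are all assumptions made up for the example. The agent is never told which arm is correct, it only sees a reward of 1 or 0 after each choice.

```python
import random


def run_bandit(reward_probs=(0.2, 0.8), steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy bandit (hypothetical example).

    reward_probs[a] is the chance that pulling arm a pays a reward of 1.
    The agent never sees these probabilities, only the rewards.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_probs)  # running average reward per arm
    counts = [0] * len(reward_probs)       # number of pulls per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an arm at random.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the arm that currently looks best.
            action = max(range(len(reward_probs)), key=lambda a: estimates[a])

        # The only feedback is a reward, not the "right answer".
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the average reward for the chosen arm.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates


if __name__ == "__main__":
    print(run_bandit())
```

With enough steps the estimate for the better arm should end up higher than the other, even though the agent was only ever given rewards.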

OpenAI claims that two things are holding RL back:

The need for better benchmarks. In supervised learning, progress has been driven by large labeled datasets like ImageNet. In RL, the closest equivalent would be a large and diverse collection of environments. However, the existing open-source collections of RL environments don't have enough variety, and they are often difficult to even set up and use.
Lack of standardization of environments used in publications. Subtle differences in the problem definition, such as the reward function or the set of actions, can drastically alter a task's difficulty. This issue makes it difficult to reproduce published research and compare results from different papers.

The motivation behind OpenAI Gym is to provide a set of environments that different RL programs can be tested in. These are:

Classic control and toy text: complete small-scale tasks, mostly from the RL literature. These are the ones you read about in the literature - pole balancing and similar.
Algorithmic: perform computations such as adding multi-digit numbers and reversing sequences.
Atari: play classic Atari games.
Board games: play Go on 9x9 and 19x19 boards. In this release there is a fixed opponent based on a good algorithmic method.
2D and 3D robots: control a robot simulation using accurate physics.


At the moment you can connect your RL system to the gym using Python. Of course, it is up to you to map the RL system onto the environment - as the documentation says:

"We provide the environment; you provide the algorithm.

You can write your agent using your existing numerical computation library, such as TensorFlow or Theano."
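The division of labour described in that quote can be sketched as a simple loop between an environment and an agent. The code below does not use Gym itself: ToyCorridor and run_episode are hypothetical names invented for this illustration, and the reset/step interface is only loosely modelled on the style Gym uses, so check the Gym documentation for the real signatures before relying on it.

```python
import random


class ToyCorridor:
    """Hypothetical stand-in environment, not part of OpenAI Gym.

    The agent starts at cell 0 and is rewarded for reaching the last cell.
    Actions: 0 = move left, 1 = move right.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.position = 0
        return self.position

    def step(self, action):
        # Apply the action, keeping the agent inside the corridor.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done, {}


def run_episode(env, policy, max_steps=100):
    """Run one episode; the environment is given, the policy is yours."""
    observation = env.reset()
    total_reward = 0.0
    for step_count in range(1, max_steps + 1):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward, step_count


if __name__ == "__main__":
    rng = random.Random(0)
    # A random policy stands in for "your algorithm".
    print(run_episode(ToyCorridor(), lambda obs: rng.randrange(2)))
```

The point of the design is that run_episode knows nothing about the corridor and the corridor knows nothing about the policy, so either side can be swapped out, which is what makes a shared set of environments useful for comparing algorithms.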

The idea is to collect and curate a set of results that indicate how well the different approaches are doing at generalizing their results.

It is good to see that an open source initiative is doing something other than simply reproducing what is being done in the closed software world. It would be very easy for OpenAI to simply build its own TensorFlow or an alternative, but OpenAI Gym is novel and needed.


More Information

OpenAI Gym

https://github.com/openai/gym



Last Updated ( Wednesday, 12 July 2023 )
