A new way to get started with Deep Reinforcement Learning

Every time I read an OpenAI blog post I'm inspired.

I'm inspired by the artwork, I'm inspired by the quality of work, I'm inspired by the communication skills.

They work on hard problems. Get great results. And share those results with the world in a way that's easily accessible.

So when their post about Spinning Up, a free resource for learning Deep Reinforcement Learning (RL), came out, I wasn't surprised. Not surprised that I loved the artwork, that the blog post communicated simply and effectively, or that the learning materials were world class.

Seriously. This is something I'd put on my wall (stay tuned, this might happen).

I’d love whoever designs OpenAI’s graphics to come and paint my room. Source: OpenAI Blog.

More to the point, if you're interested in Deep RL, you should leave here and check out Spinning Up.

If you're not interested, it's worth knowing a little about it anyway.

Deep RL involves training an agent in an environment to learn something about the world.

This is best explained with an example.

Deep RL was the technique DeepMind used to build Alpha Zero, the best Go player in the world.

In Alpha Zero's case, it (the agent) learned to play Go by continually playing itself. It got to a superhuman level exceptionally fast because it was able to play many simulated games of Go (the environment) against itself at once.

The deep part comes from the deep neural networks Alpha Zero used to evaluate positions and choose moves. And because it learned through self-play, there were really multiple agents (different versions of Alpha Zero, converging into one) and multiple environments (many different games of Go).
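To make the agent and environment idea concrete, here's a minimal sketch of the standard interaction loop using OpenAI's Gym library (the environment toolkit Spinning Up's examples use). The environment and the random-action "agent" below are just placeholders for illustration, not anything Alpha Zero actually used.

```python
import gym

# The environment: a classic control task that ships with Gym.
env = gym.make("CartPole-v1")

obs = env.reset()
for step in range(1000):
    # The "agent" here is as simple as it gets: pick a random action.
    # A Deep RL algorithm would replace this with a neural network policy.
    action = env.action_space.sample()

    # The environment responds with a new observation and a reward.
    obs, reward, done, info = env.step(action)

    # When an episode ends, start a new one.
    if done:
        obs = env.reset()

env.close()
```

Every Deep RL algorithm in Spinning Up is, at its core, a smarter way of choosing that `action` so the rewards add up over time.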

How about another example?

Let's say you're a doctor who wants to try a new treatment on your patients. But you're hesitant because you're not sure how it will go.

So you decide to wait for more trials to take place.

But trials are lengthy, expensive and potentially harmful to those involved.

What if you could simulate a treatment (the agent) and test it on many different simulated patients with characteristics similar to your own patients (the environment) and see what the outcomes were?

Using the knowledge you gather, you could potentially find an ideal treatment for each individual patient.

This example simplifies the process dramatically but the principles of Deep RL are there.

The beautiful thing is, Deep RL could potentially be harnessed for any problem where many different scenarios need to be accounted for.

Say you wanted to improve traffic light schedules. Deep RL could be used to test many different patterns of car arrivals (the environment) and many different light configurations (the agents), with the goal of maximising efficiency.
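As a rough illustration of how a scenario like that might be framed as an RL problem, here's a hypothetical, heavily simplified traffic light environment written against the Gym interface. The class name, state, and reward below are my own assumptions for the sketch, not an existing benchmark or a real traffic simulator.

```python
import gym
import numpy as np
from gym import spaces


class TrafficLightEnv(gym.Env):
    """Hypothetical single-intersection environment (illustration only)."""

    def __init__(self):
        # Observation: number of cars queued in each of the 4 approaches.
        self.observation_space = spaces.Box(low=0, high=50, shape=(4,), dtype=np.float32)
        # Action: which pair of approaches gets the green light
        # (0 = north-south, 1 = east-west).
        self.action_space = spaces.Discrete(2)
        self.queues = np.zeros(4, dtype=np.float32)

    def reset(self):
        self.queues = np.random.randint(0, 10, size=4).astype(np.float32)
        return self.queues.copy()

    def step(self, action):
        # Cars clear from the approaches with a green light...
        green = [0, 1] if action == 0 else [2, 3]
        self.queues[green] = np.maximum(self.queues[green] - 3, 0)
        # ...while new cars arrive at every approach.
        arrivals = np.random.poisson(1, size=4)
        self.queues = np.minimum(self.queues + arrivals, 50).astype(np.float32)
        # Reward: fewer cars left waiting means a better light schedule.
        reward = -float(self.queues.sum())
        return self.queues.copy(), reward, False, {}
```

An agent trained against an environment like this (with a far more realistic simulator behind it) would learn a light schedule by maximising that reward.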

The scenarios are endless.

I know there's a way to use Deep RL for healthcare somehow. But I don't know how (yet).

To help figure this out, I'm incorporating the Spinning Up materials into my curriculum for 2019. I can't wait.

And of course, I'll be sharing what I learn.

If you're interested in getting started with Deep RL, there are a few resources you might want to check out:

There's also a workshop OpenAI are hosting on February 2 at their San Francisco HQ for those who have tinkered with machine learning and want to learn more about Deep RL. I applied and you can too (applications close December 8).