Using Deep Reinforcement Learning to Play Sonic the Hedgehog

In between projects over the past couple of weeks, the Max Kelsen team and I attempted to replicate the epic World Models architecture to play Sonic the Hedgehog for the OpenAI Retro Contest.

Although our algorithm wasn't fully ready by the submission date, we still learned a great deal in the process.

The full writeup of our approach is available on Medium.

Twitch vs Facebook for livestreaming, building and launching an app from scratch and conversations with robots

Sometimes you pour your heart and soul into something and then no one cares about it. It's a shitty feeling.

It takes courage to make something. To put it out into the world. To know someone might not like it but to do it anyway.

Creation is freedom. But too often we're snowed under by the amount of information we have access to. A quick browse of the internet and it seems everyone is after your time and attention.

I can't talk, I make heaps of things. I want people to read my words. I want people to watch my videos. I'm grateful for the people who do but it doesn't bother me if they don't. I'm addicted to the feeling of tying ideas in my head together and making something out of nothing.

I've been trying to instil this creative mindset into my brothers. I want them to experience the same feeling I get when I make something. Everyone should feel this feeling.

One of my brothers recently started livestreaming himself playing Fortnite. We had to explain what livestreaming was to our parents. They had no idea it was even possible, let alone understand what it meant. We told our Dad it was the same thing as watching a rugby game live but others are watching him play video games. He got it. Sort of.

Will started livestreaming on Twitch for a month before moving to Facebook. Facebook is really pushing videos so his numbers went through the roof. Facebook is still the king when it comes to content distribution.

My other brother Josh launched his first app last month. When I was his age I had hopes of making my own app. He beat me to it. I failed to execute. No excuses. I failed. I asked him, what made you start building an app? He replied, "I wanted to see how hard it was to build an app, I thought of something simple and then said let's execute."

I want to live my life like that. Thinking of something and then executing on it. Friction annoys me. If rocks fall down and block my path, I clear them out or climb over them.

It's natural to want to remove obstacles. We subconsciously seek out comfort. Our energy is precious. We're still in the mindset of needing to conserve it or else we'll die.

My brother's streaming numbers went up on Facebook because the friction of leaving the platform for Twitch was taken away. Instead of clicking a link and watching, people can watch him play right in their newsfeed.
If something takes more than one step, we avoid it. I do this.

You can have the best product in the world but if you don't have shelf space, it's useless. Facebook's newsfeed is the shelf space for my brother's stream. It's like the stack of specials you see as you walk into the supermarket. Trying to get people to look elsewhere (linking to Twitch) is like getting them to walk to the back of the store. If they need milk, they'll walk to the back. If not, they won’t.

The internet is a beautiful place. We can entertain ourselves with almost any form of content we like. And the platforms we use are getting better at continually serving us similar kinds of content. I haven’t searched for a YouTube video in a long time. But it’s not just a consumption paradise, we can also use it to learn almost anything we want.

Josh built his app after doing half an iOS course on Udemy. And I’ve been learning artificial intelligence and machine learning through only online resources.

There's no shortage of education, only a shortage of willingness to learn.

There's even more of a shortage of sharing what you learn. Everyone can be a teacher. Everyone should be a teacher. We could all learn something off each other. You learn something, you teach it to someone else, it cements your knowledge and helps them learn too. It creates a circle. Everyone wins.

We finished up the episode talking about Google's latest breakthrough, Google Duplex. It's an AI-powered personal assistant who's able to make phone calls on your behalf. It sounds like a person. It even makes very human-ish sounds during the conversation. Whilst talking to a hairdresser, it casually dropped a 'mhmmmm'. It's the beginning of the movie Her. Perhaps I'll be able to have a conversation with my Alexa soon.

Robots holding real conversations, learning to code and launching an app after $7.50 worth of an online course, broadcasting yourself playing video games to hundreds of people live online: we live in a crazy world. I love it.

See you next week. Hopefully, we'll have half the technical difficulties.

PS good luck getting past 77 on Josh’s app. Our little brother, Sam, currently has the high score. Send me a message if you manage to beat it and I’ll give you a shoutout.

How I'm Learning Deep Learning - Part IV

Is a wealth of data the final frontier?


This article is part of the How I’m Learning Deep Learning Series on Medium:

Part I: A new beginning.
Part II: Learning Python on the fly.
Part III: Too much breadth, not enough depth.
Part IV: AI(ntuition) versus AI(ntelligence). (You’re currently reading this)
Extra: My Self-Created AI Master’s Degree

A few of the resources helping me break into the world of AI. — Hinton Image Source

A lot has happened since Part III. While the last couple of articles went in-depth into what exactly I was learning, this one will be a little different. Rather than break it down week by week, I’ll cover the major milestones.

I graduated from the Udacity Deep Learning Nanodegree (DLND) in August last year. Thinking about how I emailed the support team asking what the refund policy was before starting the course makes me laugh. It was easily one of the best learning programs I’ve ever taken. If you’re after more details, I recently published an in-depth review video on the DLND.

Making videos about my journey has led to some great conversations with others on the same path. I met someone in Canada who was doing almost the exact same courses as me. Even more interesting is that we have the same poster of Arnold in our room. Small world.

More recently, I had the opportunity to have a conversation with Shaik Asad, a 14-year-old AI developer from India. He teaches himself AI after completing his homework. Since then we’ve been actively chatting about life and our other interests. Seeing how passionate Shaik is about AI and hearing what his goals are, to say I was inspired is an understatement.

AI Master’s Degree

After graduating from the DLND, I was a deer in the headlights. I’d learned all about the amazing power of deep learning (DL) but still didn’t fully understand what really made deep neural nets tick. I was also left wondering whether DL is the be all and end all of Artificial Intelligence (more on this later).

I needed to know more. Curiosity led me to create my own AI Master’s Degree. Having a rough outline of a curriculum to follow allowed me to narrow down how I would spend my time. My mission is to use AI to help people move more and eat better. I already have skin in the game in the world of fitness and nutrition; now I’m working on the AI side.

In the past few months, I’ve completed 80% of the Coursera Deep Learning specialisation (course 5 was just released as of writing) by Andrew Ng and his team, as well as Term 1 of the Udacity Artificial Intelligence Nanodegree (AIND).

For those who learn from a ground-up approach, the specialisation is the best place to start learning about DL. If you’re more into diving into project building, or want to progress with one of Udacity’s advanced Nanodegrees, start with the DLND.

Term 1 of the AIND covered classical AI approaches. It lost me at times due to my lack of programming ability and my recent focus on DL. However, learning about how far the field has come since its inception was fascinating.

I’m into Term 2 now, which includes building projects in computer vision, natural language processing and speech recognition using DL. Back in a familiar fishbowl.

I’ll release a full in-depth review of the specialisation and the AIND once I’ve completed them both.

Future of AI

After learning more about how DL works, I started to become suspicious of its longevity. Many DL models need a ridiculous amount of data to produce a useful output.

This is well and good if you’re one of the two companies in the world with enough data to keep Titanic afloat, but not so good if you’re a young AI hopeful. Deep Learning has brought about many incredible insights, but many of these are in the realm of supervised learning, which still requires a lot of human input.

Although our ability to gather and produce data is increasing exponentially, I’m not convinced more data is the key to solving all of our AI problems.

Are we really just data-processing machines? Last century, people thought our internal processes could be modelled using the concept of steam engines. The man with a hammer problem comes to mind.

Back-propagation (an algorithm that helps neural networks improve themselves) does not work very well on unlabelled data, which is what most of the data in the universe consists of.
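To make that dependence concrete, here's a minimal sketch of my own (an illustration, not from any of the courses above) of a single-weight-vector model trained with gradient descent. Notice that the error signal flowing backwards is computed against a human-provided label `y`; remove the label and there is no gradient to follow.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)   # input features
y = 2.0                  # the human-provided label
w = np.zeros(3)          # weights to learn
lr = 0.1                 # learning rate

for _ in range(100):
    pred = w @ x                 # forward pass
    loss = (pred - y) ** 2       # squared error: only computable because y exists
    grad = 2 * (pred - y) * x    # backward pass: d(loss)/d(w)
    w -= lr * grad               # gradient descent update

print(float(w @ x))  # prediction has moved close to the label
```

A four-year-old exploring a room gets no such `y` for each thing they see, which is the gap the next paragraph points at.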

Consider a four-year-old walking into a room they’ve never been in. The young child doesn’t require 10,000 labelled images of a room to know how to navigate it. They don’t even require one labelled image of a room, they simply start interacting with it.

Even the godfather of Deep Learning seems to be thinking along the same lines. In an interview late last year, Geoffrey Hinton was asked his opinion on the current state of AI techniques.

“My view is throw it all away and start again.” 
 “The future depends on some graduate student who is deeply suspicious of everything I have said.”

Listening to the lectures and talks of Monica Anderson (especially the one on dual-process theory) and discovering her work on Artificial Intuition as an approach to Artificial Intelligence raises more questions on the matter.

I will delve deeper into these topics in a future post.

Next Steps

Over the next few months, I will be completing the curriculum I have set out for myself.

I just submitted the first major project of Term 2 of the AIND, a computer vision model to detect facial keypoints.

For each of the upcoming major projects, I will be posting an article detailing my understanding of the work as well as a step-by-step guide for those looking to build an equivalent.

I’m also thoroughly enjoying the free courses on offer from MIT on Artificial General Intelligence and Deep Learning for Self-Driving Cars.

After completing the AIND, seems likely to be my next port of call.

By the time I finish up my curriculum, I will be looking to move to the US to join a startup in the world of health and AI (or create my own). If you know anyone or think I should be following anyone currently playing at the crossroads of health and AI, please let me know.

For those considering embarking on their own self-led learning journey or finding out more about AI, the words of Naval Ravikant sum it up perfectly.

The current education system is a path-dependent outcome. We have the internet now; if you actually have a desire to learn, everything is on the internet. The ability, means and tools to learn are abundant and infinite, it’s the desire to learn that’s incredibly scarce.

See you in Part V.


Two New Videos Today!

Today I posted two new videos. 

The first of which briefly goes through my plans and goals with my self-created Artificial Intelligence Master's Degree.

I've started a weekly VLOG series documenting my learning journey. If you'd like to see anything specific within the videos, be sure to leave a comment or send me an email.

The second of which is a few clips I recorded whilst working out with two of my close friends at Raw Training Australia.

Over the past few months, I've been experimenting with new styles of movement. I've diversified my daily exercise from being all about lifting weights to incorporating many different styles of movement. In the video above, we had some fun practising the L-sit. It's a surprisingly hard exercise!

I'll be posting more videos like these on a weekly basis. If you think I can improve them in any way, I'd love your advice!