Eat when you need to.
Drink when you’re thirsty.
Work when you need to work.
Make the thing you’ve been wanting to make.
Play when you can.
Love who needs it.
Ask the question you’re thinking of.
If only life was so simple.
Every man should make something every day.
It can be as small as making his bed or as large as making the hall of fame for his chosen craft.
Creation is freedom. Making things is freedom. The freedom of the ideas in your head.
It won’t all happen at once. You have to start it off. And the easiest way to start is with something small.
Make a small change today.
And make another tomorrow.
Eventually, they'll add up and you'll look back at where you started and wonder why you didn't begin sooner.
Still not sure where to start?
Make someone's day. That's always a good place to start.
So you’ve got some data and you’re wondering what can be learned from it. Is it numerical or categorical? Does it have high dimensionality or cardinality?
It’s no secret that data is everywhere. But it’s important to recognise not all data is the same. You might have heard the term data cleaning before. And if you haven’t, it’s not too different to regular cleaning.
When you decide it’s time to tidy your house, you put the clothes on the floor away, and move the stuff from the table back to where it should go. You’re bringing order back to a chaotic environment.
The same thing happens with data. When a machine learning engineer starts looking at a dataset, they ask themselves, ‘where should this go?’, ‘what was this supposed to be?’ Just like putting clothes back in the closet, they start moving things around, changing the values of one column and normalising the values of another.
But wait. How do you know what to do to each piece of data?
Back to the house cleaning analogy. If you have a messy kitchen table, how do you know where each of the items goes?
The spices go in the pantry because they need to stay dry. The milk goes back in the fridge because it has to stay cold. And the pile of envelopes you haven’t opened yet can probably go into the study.
Now say you have a messy table of data. One column has numbers in it, the other column has words in it. What could you do with each of these?
A convenient way to break this down is into numerical and categorical data.
Before we go further, let’s meet some friends to help unpack these two types of values.
Harold the pig loves numbers. He counts his grains of food every day.
Klipklop the horse watches all the cars go past the field and knows every type there is.
And Sandy the fish loves both. She knows there’s safety in numbers and loves all the different types of marine life under the sea.
Like Harold, computers love numbers.
With any dataset, the goal is often to transform it in a way so all the values are in some kind of numerical state. This way, computers can work out patterns in the numbers by performing large-scale calculations.
In Harold’s case, his data is already in a numerical state. He remembers how many grains of food he’s had every day for the past three years.
He knows on Saturdays he gets a little extra. So he saves some for Mondays when the supply is less.
You don’t necessarily need a computer to figure out this kind of pattern. But what if you were dealing with something more complex?
Like predicting what Company X’s stock price would be tomorrow, based on the value of other similar companies and recent news headlines about Company X?
Ok – so you know the stock prices of Company X and four other similar companies. These values are all numbers. Now you can use a computer to model these pretty easily.
But what if you wanted to incorporate the headline ‘Company X breaks new records, an all-time high!’ into the mix?
Harold is great at counting. But he doesn’t know anything about the different types of grains he has been eating. What if the type of grain influenced how many pieces of grain he received? Just like how a news headline may influence the price of a stock.
The kind of data that doesn’t come in a straightforward numerical form is called categorical data.
Categorical data is any kind of data which isn’t immediately available in numerical form. And it’s typically where you will hear the terms dimensionality and cardinality thrown around.
This is where Klipklop the horse comes in. He watches the cars go past every day and knows the make and model of each one.
But say you wanted to use this information to predict the price of a car.
You know the make and model contribute something to the value. But what exactly?
How do you get a computer to understand that a BMW is different from a Toyota?
This is where the concept of feature encoding comes in. Or in other words, turning a category into a number so that a computer learns how each of the numbers relates.
Let’s say it’s been a quiet day and Klipklop has only seen 3 cars.
A BMW X5, a Toyota Camry and a Toyota Corolla. How could you turn these cars into numbers a machine could understand while still keeping their inherent differences?
There are many techniques, but we’ll look at two of the most popular – one-hot-encoding and ordinal encoding.
Ordinal encoding is where the car and its make are assigned a number in the order they appeared.
Say the BMW went by first, followed by the Camry, then the Corolla.
But does this make sense?
By this logic, a BMW + Toyota should equal a Toyota (1 + 2 = 3). Not really.
Ordinal encodings can be used for some situations like time intervals but it’s probably not the best choice for this case.
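To make the ordinal idea concrete, here is a minimal sketch in plain Python. It assumes the numbering starts at 1 and follows the order the cars went by, which is what the 1 + 2 = 3 example implies. The variable names are made up for illustration.

```python
# Cars in the order Klipklop saw them (from the example above).
sightings = ["BMW X5", "Toyota Camry", "Toyota Corolla"]

# Give each distinct car the next integer, starting at 1,
# in order of first appearance.
ordinal_codes = {}
for car in sightings:
    if car not in ordinal_codes:
        ordinal_codes[car] = len(ordinal_codes) + 1

encoded = [ordinal_codes[car] for car in sightings]
print(encoded)  # [1, 2, 3]

# The odd side effect: the numbers imply BMW X5 + Toyota Camry = Toyota Corolla.
print(ordinal_codes["BMW X5"] + ordinal_codes["Toyota Camry"]
      == ordinal_codes["Toyota Corolla"])  # True
```

The codes are valid numbers, but the arithmetic they allow has no meaning for cars, which is the problem described above.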
One-hot encoding assigns a 1 to every value that applies to each individual car, and 0 to every value that does not apply.
Now our two Toyotas are similar to each other because they both have 1’s for Toyota but differ on their model.
One-hot-encoding works well to encode category values into numbers but has a downside. Notice how the number of values used to describe a car increased from 2 to 5.
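Here is a rough sketch of one-hot encoding for Klipklop's three cars, written in plain Python so nothing needs installing. The column order (first appearance) and the names `cars`, `columns` and `one_hot` are assumptions made for illustration, not a fixed convention.

```python
# Each car starts with 2 values: a make and a model.
cars = [
    {"make": "BMW", "model": "X5"},
    {"make": "Toyota", "model": "Camry"},
    {"make": "Toyota", "model": "Corolla"},
]

# One column per distinct value, kept in order of first appearance.
columns = []
for car in cars:
    for value in (car["make"], car["model"]):
        if value not in columns:
            columns.append(value)


def one_hot(car):
    """Return 1 for every column that applies to the car and 0 otherwise."""
    values = {car["make"], car["model"]}
    return [1 if column in values else 0 for column in columns]


encoded = [one_hot(car) for car in cars]

print(columns)  # ['BMW', 'X5', 'Toyota', 'Camry', 'Corolla']
for row in encoded:
    print(row)
# [1, 1, 0, 0, 0]
# [0, 0, 1, 1, 0]
# [0, 0, 1, 0, 1]
```

The two Toyota rows share a 1 in the Toyota column, and each car is now described by 5 values instead of 2.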
This is where the term high dimensionality gets used. There are now more parameters describing each car than there are cars.
For a computer to learn meaningful results, you want the ratio to lean the other way: many more examples than parameters.
In other words, you’d prefer to have 6,000 examples of cars and only 6 ways of describing them rather than the other way round.
But of course, it doesn’t always work out this way. You may end up with 6,000 cars and 1,000 different ways of describing them because Klipklop has seen 500 different types of makes and models.
This is the issue of high cardinality – when you have many different ways of describing something but not many examples of each.
For an ideal price prediction system, you’d want something like 1,000 Toyota Corollas, 1,000 BMW X5s and 1,000 Toyota Camrys.
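One quick way to get a feel for cardinality is to count how many distinct categories there are and how many examples each one has. The sketch below uses a tiny made-up list of sightings (not real data) and Python's built-in `Counter`.

```python
from collections import Counter

# Made-up sightings, for illustration only.
sightings = [
    "Toyota Corolla", "BMW X5", "Toyota Corolla",
    "Toyota Camry", "Toyota Corolla", "BMW X5",
]

counts = Counter(sightings)

# Cardinality: the number of distinct categories.
cardinality = len(counts)

# Average number of examples per category.
examples_per_category = len(sightings) / cardinality

print(cardinality)            # 3
print(examples_per_category)  # 2.0
print(counts["Toyota Camry"])  # 1
```

Few categories with many examples each is the comfortable case. Many categories with only one or two examples each is the high cardinality problem described above.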
Ok, enough about cars.
What about our stock price problem? How could you incorporate a news headline into a model?
Again, you could do this a number of ways. But we’ll start with a binary representation.
You were born before the year 2000, true or false?
Let’s say you answered true. You get a 1. Everyone born after the year 2000 gets a 0. This is binary encoding in a nutshell.
For our stock price prediction, let’s break our news headlines into two categories – good and bad. Good headlines get a 1 and bad headlines get a 0.
With this information, we could scan the web, collecting headlines as they come in and feeding these into our model. Eventually, with enough examples, it would start to get a feel of the stock price changes based on the value it received for the headline.
And with the model, you start to notice a trend – every time a bad headline comes out, the stock price goes down. No surprises.
We’ve used a simple example here and binary encodings don’t exactly capture the intensity of a good or bad headline. What about neutral, very good or very bad? This is where the previously discussed ordinal encoding could come in.
-2 for very bad headlines, -1 for bad, 0 for neutral, 1 for good and 2 for very good. Now it makes sense that very bad + very good = neutral.
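As a small sketch, the five-point scale can be written as a lookup table. The label strings and the `encode_headline` helper are hypothetical. How a headline gets its label in the first place (by hand, or by another model) is outside this example.

```python
# The scale described above: very bad through very good.
SENTIMENT_SCALE = {
    "very bad": -2,
    "bad": -1,
    "neutral": 0,
    "good": 1,
    "very good": 2,
}


def encode_headline(label):
    """Turn a headline's sentiment label into its number on the scale."""
    return SENTIMENT_SCALE[label]


labels = ["very bad", "good", "neutral", "very good"]
encoded = [encode_headline(label) for label in labels]
print(encoded)  # [-2, 1, 0, 2]

# Here the arithmetic is meaningful: very bad + very good = neutral.
print(encode_headline("very bad") + encode_headline("very good")
      == encode_headline("neutral"))  # True
```

Unlike the car example, the order of these numbers carries real meaning, which is why ordinal encoding fits here.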
There are more complex ways to bring words into a machine learning model but we’ll leave those for a future article.
The important thing to note is that there are many different ways seemingly non-numerical information can be converted into something a computer can understand.
Machine learning engineers and data scientists spend much of their time trying to think like Sandy the fish.
Sandy knows she’ll be safe staying with the rest of the school of fish but she also knows there’s plenty to learn from exploring the unknown.
It’s easy to lean on only numerical information to draw insights from. But there’s so much more information hidden in diverse ways.
By using a combination of numerical and categorical information, more realistic and helpful models of the world can be built.
It’s one thing to model the stock market using price information, but it’s a whole other game when you add news headlines to the mix.
If you’re looking to start harnessing the power of your data with techniques like machine learning and data science, there are a few things you can do to get the most out of it.
If you’re collecting data, what format is it stored in?
The format itself isn’t necessarily as important as the uniformity. Collect it but make sure it’s all stored in the same way.
This applies for numerical and categorical data, but especially for categorical data.
The ideal dataset has a good balance between cardinality and dimensionality.
In other words, plenty of examples of each particular sample.
Machines aren’t quite as good as humans when it comes to learning (yet). We can see Harold the pig once and remember what a pig looks like, whereas a computer needs thousands of example pictures of a pig to remember what a pig looks like.
A general rule of thumb for machine learning is that more (quality) data equals better models.
Document what each piece of information relates to
As more and more data is collected, it’s important to be able to understand what each piece of information relates to.
At Max Kelsen, before any kind of machine learning model is run, the engineers spend plenty of time liaising with subject matter experts who are familiar with the data set.
Because a machine learning engineer may be able to build a model which is 99% accurate but it’s useless if it’s predicting the wrong thing. Or worse, 99% accurate on the wrong data.
Documenting your data well can help prevent these kinds of misfires.
It doesn’t matter whether you’ve got numerical data, categorical data or a combination of both – if you’re looking to get more out of it, Max Kelsen can help.
I’ve been a Machine Learning Engineer for the past seven months. And of the few projects I’ve worked on, there are a few things which have come up every time.
Of course, it’s never the same because the data is different. But the principles of one problem can often be used for another.
Here are a few things I have to remind myself of every time I start to go to work with a new dataset.
When you first get a dataset, the first thing you should do is go through it and formulate a series of questions.
Don’t look for answers straight away. Doing this too early could put a roadblock in your exploratory process.
‘What does this column relate to?’
‘Would this affect that?’
‘Should I find out more about this variable?’
‘Why are these numbers like this?’
‘Do these samples always appear that way?’
You can start to answer them on your second time going over the data. And if they’re questions you can’t quite answer yourself, turn to the experts.
Data is data. It doesn’t lie. It is what it is. But that doesn’t mean some of the conclusions you draw won’t be biased by your own intuition.
Say you’re trying to predict prices of houses. There may be some things you already know. House prices took a dip in 2008, houses with white fences earn more, etc.
But it’s important not to treat these as hard assumptions. Before you start to build the world’s best housing model, you may want to ask some questions of people with experience.
Because it will save you time. After you’ve formulated your question list in Part A, asking a subject matter expert may save you hours of preprocessing.
‘Oh, we don’t use that metric anymore, you can disregard it.’
‘That number you’re looking at actually relates to something else.’
When you start building a model, make sure you have the problem you’re trying to solve mapped out in your head.
This should be discussed with the client, with the project manager and any other major contributors.
What is the ideal outcome of the model?
Of course, the goal may change as you iterate through different options but having something to aim towards is always a good start.
There's nothing worse than spending two weeks building a 99% accurate model and then showing the client your work only to realise you were modelling the wrong thing.
Measure twice. Cut once. Actually, this saying doesn't really work for machine learning because you'll be making plenty of models. But you get the point.
What kind of data is there?
Is it only numerical?
Are there categorical features which could be incorporated into the model?
Heads up, categorical features can be considered any type of data which isn't immediately available in numerical form.
In the problem of trying to predict housing prices, you might have number of bathrooms as a numerical feature and the suburb of the house as a categorical (a non-number category) feature of the data.
There are different ways to deal with both of these.
For numerical data, the main thing is to make sure it's all in the same format. For example, imagine the year a car was manufactured.
Is '99 (the year 1999) more than five times greater than '18 (the year 2018)?
You might want to change these to 1999 and 2018 to make sure the model captures how close these two numbers actually are.
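A minimal sketch of that fix in Python is below. The `expand_year` helper and its `pivot` cutoff are assumptions for illustration: any two-digit year below the pivot is treated as 20xx and anything else as 19xx. The right cutoff depends on your data.

```python
def expand_year(two_digit_year, pivot=30):
    """Expand a two-digit year to four digits.

    Assumption: years below `pivot` belong to the 2000s and
    the rest belong to the 1900s.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    century = 2000 if two_digit_year < pivot else 1900
    return century + two_digit_year


raw_years = [99, 18]
clean_years = [expand_year(year) for year in raw_years]
print(clean_years)  # [1999, 2018]

# The two cars are now 19 years apart instead of 81 "years" apart.
print(clean_years[1] - clean_years[0])  # 19
```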
The goal of categorical features is to turn them into numbers. How could you turn house suburbs into numbers?
Say you had Park Ridge, Greenville and Ascot.
Could you say Park Ridge = 1, Greenville = 2 and Ascot = 3?
But doesn't that mean Park Ridge + Greenville = Ascot?
That doesn't make sense.
A better option would be to one-hot-encode them. This means giving a value a 1 for what it is and 0's for what it isn't.
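For the three suburbs, a bare-bones version in plain Python could look like this. The column order simply follows the order the suburbs were listed, and the `one_hot` helper is a made-up name for illustration.

```python
suburbs = ["Park Ridge", "Greenville", "Ascot"]


def one_hot(suburb):
    """Return a 1 in the suburb's own column and 0's everywhere else."""
    if suburb not in suburbs:
        raise ValueError(f"unknown suburb: {suburb}")
    return [1 if suburb == candidate else 0 for candidate in suburbs]


print(one_hot("Park Ridge"))  # [1, 0, 0]
print(one_hot("Greenville"))  # [0, 1, 0]
print(one_hot("Ascot"))       # [0, 0, 1]
```

No suburb is "bigger" than another here, so the model can't mistake Park Ridge + Greenville for Ascot.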
There are many other options to turn categorical variables into numbers and figuring out the best one is the fun part.
Can you create a simpler metric to measure in the beginning?
There might be an ideal scenario you're working towards but is there a simpler model you can put together to test your thinking?
Start with the simplest model possible and gradually add complexity.
Don't be afraid to be wrong, again, again, again and again. Better to be wrong in testing than in production.
If in doubt, run the code. Just like data, code doesn't lie. It'll do exactly as you tell it.
The quicker you figure out what doesn't work, the quicker you can find what does.
The main problems which arise from machine learning projects are often not the data or the model but the communication between the parties involved.
Communication is always the key.
Working through a problem can end up in you being stuck down a rabbit hole. You wanted to try one thing, which led to another and another and now you're not even sure what problem you're working on.
This isn't necessarily a bad thing, some of the best solutions are found this way.
But remember not everyone will be able to understand your train of thought. If in doubt, over communicate.
Ask yourself, 'Am I still on the right track?' Because if you can't answer it, how do you think others will go?
My quality of work got an upgrade after I stumbled across these two amazing resources.
A notebook by Daniel Formoso (awesome name) which goes through a data science classification task from start to finish using scikit-learn, TensorFlow and a bunch of other techniques.
CatBoost, a rocket-powered open-source gradient boosting on decision trees library. In other words, an epic algorithm which improved all my results on a recent project by 10%.
You could combine these two and build a pretty robust foundation for your next machine learning project.
It may not seem like it now but those extra hours you put into building your skills all added up.
Taking care of your health was also the right thing to do.
And don’t forget to keep reminding those close to you how much you love them.
Keep learning, keep moving, keep loving.
Your 2048 self.
PS the latest episode of Learning Intelligence is out, I’ve been learning all about the Google Cloud Platform. I’m still a novice when it comes to dev ops but it’s becoming more and more a requirement in my day to day work. So I figured I better start getting on top of it. And make my 2048 self proud.
The first one wasn't as good.
This one isn't much better.
But it's an improvement on the last.
That's all you need.
Positioning, brand image, big idea.
According to Ogilvy, these are three pillars of producing advertising that sells.
The average person is bombarded with advertisements all day, every day. More than they can handle.
How do you get someone to buy your product? Or use your service when there's a sea of options out there?
It used to be if you could be the loudest, you'd win. Now the internet has changed all that. Everyone has the potential for being the loudest.
Your advertisement can be loud and in their face but it won’t necessarily make the cash register ring. Customers are smarter than ever now. They still don't really know what they want but they do know how well the other guys did it.
The quality of goods and services is going up. And no matter what you’re offering, your customers will compare you to the best experience they've ever had.
I was in Austin with my godfather the other day. And there are two electric scooter companies who have taken over the city. We wanted to join in. So we found two scooters, one from Brand 1 and another from Brand 2. I signed up and was rolling down the street before my godfather had even got through the form. User error? Perhaps. But 5 minutes later he still wasn’t signed up, so I tried and I couldn’t get through either. So we ditched Brand 2 and rode Brand 1 for the rest of the trip.
If getting in an Uber takes two taps and your offering takes multiple attempts and a lengthy payment method, who do you think they'll choose?
‘My own definition (of positioning) is what the product does, and who is it for?’
A good advertisement answers these two questions.
Ogilvy gives an example of how he positioned Dove.
‘I could have positioned Dove as a detergent bar for men with dirty hands, but chose instead to position it as a toilet bar for women with dry skin. This is still working 25 years later.’
People don’t buy your product. They buy the story they can tell themselves when they own it.
No one needs the latest iPhone. A phone half the price will do a similar job. But if you own it, you’re one of the people with the latest iPhone. You broadcast to the world, 'Hey, I've got the resources to get the latest iPhone.'
When you position your product or service, what promise are you offering?
‘We save product managers time by...’ Or ‘this software will save you $1000 per week on your stock ordering.’
1. What does your product do?
2. Who is it for?
Remember these two questions when writing your next advertisement.
People love coffee. It’s the second biggest commodity next to oil. 2.2 billion cups are drunk worldwide every day.
Coffee is cheap. My friend runs a coffee business. He says the average mark up on a cup of coffee is 800-1000%.
So why do you pay $5 for a coffee at your favourite place versus the $1 you could spend at the corner store?
It’s because your favourite place has a brand behind it. You know the owner, Sarah, the beans are sourced from somewhere, ‘there’s a slight hint of caramel in this month’s blend.’
‘The brand image is 90 per cent of what the distiller has to sell.’ — Ogilvy on the three main whiskey distillers in the US.
There isn’t much difference between one cup of coffee or one bottle of whiskey apart from the image behind it.
Brand image is saying, ‘people like us, do things like this.’
Coca-Cola doesn’t tell you how many coca leaves go into their soda, instead they show you pictures of attractive people drinking Coca-Cola and having the time of their lives.
‘Researchers at the Department of Psychology at the University of California gave distilled water to students. They told some of them that it was distilled water, and asked them to describe its taste. Most said it had no taste of any kind. They told the other students that the distilled water came out of the tap. Most of them said it tasted horrible. The mere mention of tap conjured up an image of chlorine.’
What will people think of when they think of your brand?
Tap water or distilled water?
‘It takes a big idea to attract the attention of customers and get them to buy your product. Unless your advertising contains a big idea, it will pass like a ship in the night.’
But where do you get a big idea?
‘Big ideas come from the unconscious. This is true in art, in science and in advertising. But your unconscious has to be well informed, or your idea will be irrelevant.’
There are no guidelines to getting the muse to show up. But one thing is for sure. For it to arrive, you have to do your homework.
Ogilvy was researching for a Rolls Royce campaign when he came across the line, ‘At 60 miles per hour, the loudest thing in the Rolls Royce is the electric clock.’
Then it became one of the greatest headlines of all time.
Who’s more likely to think of a big idea?
John: ‘I never do research about the product, I rely on intuition alone.’
Rachel: ‘I spend the first week of any campaign learning everything I can about what I’m creating a campaign for, who uses it, who doesn’t use it, why they use it.’
Ogilvy was considered one of the best creatives of his time but even he felt the difficulties of identifying good ideas.
'It is horribly difficult to recognize a good idea. I shudder to think how many I have rejected. Research can’t help you much, because it cannot predict the cumulative value of an idea, and no idea is big unless it will work for thirty years.'
He then goes on to talk about five questions he uses to help recognise big ideas.
Did it make me gasp when I first saw it?
Do I wish I had thought of it myself?
Is it unique?
Does it fit the strategy to perfection?
Could it be used for 30 years?
Big ideas are hard to come by but by doing your research and asking the right questions, you can give yourself the best chance of finding them.
Life is about selling. Selling your skill set to a job interviewer, selling yourself to a future partner, selling your product or service to customers.
The internet has exposed us to all kinds of offerings and it's no longer enough to build a great product and hope they will come. Even the greatest products won't sell if you've got no shelf space. And the internet has a lot of shelf space. But how do you use yours best?
Position your offering well. What does it do and who is it for? What can you do and who will benefit from your service?
Create a brand image people will remember. Your personal brand is what you repeatedly do. The same goes for your product or service. Remember, quality is always a favourable trait.
To dream up big ideas, do your research. Become an expert in what you're offering. Amateurs rarely have big ideas.
The next time you want to sell anything, remind yourself of these pillars.
A true professional shows up no matter what.
Imagine if your heart surgeon decided she didn’t really feel like replacing hearts today right in the middle of your surgery.
’Well, I’m kind of over all this blood, I’m going to a beach.’
But she doesn’t go to the beach. Because she’s a professional.
A professional is not immune to thoughts of doing other things. Nobody is. But a professional deals with them accordingly.
A professional knows there’s a job to do, a mission to complete.
There will be times where you don’t want to do the thing you know you have to do.
And in that moment you have a choice. A choice whether to take on the role of the professional.
It’s rare to write a viral article the first time you post.
It’s rare for a video to break the internet.
It’s rare for a product to become a multi-million dollar success.
But these things aren’t seen as rare. Because they’re the things we see all the time.
Your first piece of work may not hit number 1. But it’s better than not being on the scoreboard at all.
The trenches are where the real battles happen, not in the offices of those watching over.
Everyone is capable of making something. Including you.
And if the first one doesn’t work, try again. And again. Keep doing good work. Keep making good art. Better art. You’ve got time.
Someone came up to me today and said they wanted to get healthier.
‘I’ve got a pact with my friend to sign up to the police force,’ he said, ‘and we’ve both let ourselves go a bit.’
‘And you look like the type of guy who knows what he’s doing, and I’ve decided it’s time to do something.’
‘Yeah, of course, I can help,’ I said, ‘ask me anything.’
What he did takes guts.
It’s hard to look at the world and say, ‘I’m going to make a change.’
It’s sometimes even harder to look at yourself and do the same thing.
Changing your mind is free but that doesn’t mean it isn’t hard.
On another note, these kinds of situations are why I stay fit.
My body is my product. It’s skin in the game. If I’m going to write, talk about and spread the message of being healthy, I have to first be healthy myself.
You are what you repeatedly do. That’s your personal brand.
The same principle applies. If your personal brand is what you repeatedly do, you can change it like your mind. It’ll be free but no one said it’ll be easy.
Everyone is familiar with the concept of DNA.
One generation passes theirs onto the next.
Evolution slowly but surely worked out the best way to transfer information across generations.
Now we’ve got different methods: books, video, photos. But when it comes to replicating the population, DNA is still King (and Queen).
Chances are if you’ve heard of DNA, you’ve heard of genes.
“How does she look so good?”
“She must have got some good genes!”
But what are ‘good genes’?
From the sounds of things, most people would think you get your genes, they’re good or bad and that’s that.
Well, that’s partly true. You are born with specific genes but they won’t stay 100% the same throughout your life.
Much like how your bank account fluctuates depending on your spending habits and earnings, your genes will fluctuate with your health.
Let’s say you want your kids to get a big inheritance. You work hard and control your spending.
Eventually, little Johnny gets a good deal of cash after you pass.
Whether this is good or bad is up for debate.
But the other side of the coin to wealth inheritance is health inheritance.
Just like years of poor spending habits will put a dent in anyone’s bank account, years of poor health habits will damage your genes.
Now you may not notice the effects immediately. Once formed, the human body is a resilient beast.
But your offspring may not be as lucky.
You know you shouldn’t smoke or drink during pregnancy as it can lead to a deformed baby.
But what about eating a diet lacking Vitamin K2, which is crucial to jaw development?
Vitamin K2 is a fat soluble vitamin which means it’s found in fatty foods (especially eggs).
During the past few decades there has been a trend to go against fatty foods.
Which may explain why so many dentists are driving around in BMWs. The braces business is booming.
Causation or correlation?
More work has to be done but this is just one example of how food can influence future generations.
Eating well and taking care of your health won’t only mean you’ll look good, it’ll give your future offspring the best chance of growing up attractive and healthy.
Inheriting health is far more important than inheriting wealth.
PS If you’re looking to learn more about nutrition and health, I’ve been loving the book Deep Nutrition by Catherine Shanahan, 11/10.
Someone emailed me the other day saying they were having trouble completing the projects they started.
'I have 17 unfinished projects, how do I keep going to the end with them?'
The truth is, I'm the same. I have a list of unfinished projects. A book, a couple of apps, an AI curriculum.
So when I thought about how I should respond, the advice was to myself as much as it was the person on the other end.
If you keep missing your goals, bring the target closer.
Make your goals smaller. Have the big picture in your mind but break it down into manageable steps.
If you want to write a book, write 500 words per day.
If you want to improve your data science skills, practice 1-4 hours per day.
If you want to build an app, strive to write one line of good code per day.
Some days you’ll do more. But aim for at least a little each day.
Get a feeling for what it's like to complete something. Something small.
Then as you keep achieving smaller goals, you'll start moving toward your bigger goal.
Lay one perfect brick per day and eventually you'll have a beautiful wall.
Whatever your goals are.
Whatever your ambitions are.
It doesn’t matter.
You can have all the drive in the world but you still need gas.
Energy is the most valuable resource you have.
And energy comes from taking care of your health. Mental, physical, spiritual.
Recent advancements in modern medicine have been exciting but they’re still no match for the most established health technology of all time.
Don’t overcomplicate it.
Eat food, real food. If it comes in a packet or through a window, you probably don’t need it.
Sleep long and well. 17+ hours without sleep and your cognitive abilities are equivalent to those of a drunk driver.
Move. Often. Ever noticed what happens to a body of water when it stays still? It becomes stagnant.
Mental and spiritual health often come second to physical health. But they’re just as important. Check in with yourself. Check in with others.
We’re all looking to answer the same thing. ‘Where do I fit in this world?’
There’s no right or wrong answer. But that doesn’t mean it isn’t hard to find. You’ll need energy to keep looking.
Take care of your health. It’s your most important asset.
Find a mirror.
Look into it.
And say, ‘I love you, let’s work together.’
It’s unlikely anyone is as hard on you as you are on yourself.
And if there is someone who’s being that harsh on you, you’ve got two options.
A) Remove them from your life.
B) Say, ‘I love you, let’s work together.’