Author's note: This is a high-level, introductory piece aimed at introducing the core tenets of Machine Learning. For more technical pieces, please visit our blog.
Machine Learning is a hot topic, but it covers a huge range of applications, and with so many people talking about it, it can be overwhelming to navigate, especially when you’re just starting out. So we put together this article to introduce you to three of the big schools of Machine Learning (ML): Supervised Learning, Unsupervised Learning, and Reinforcement Learning. These three are responsible for many of the innovations in Machine Learning in recent years, many of which you’ve probably experienced without realising.
So, what’s the difference?
Supervised Learning, Unsupervised Learning and Reinforcement Learning
The “Learning” component of these three applications (and machine learning more generally) refers to how a computer ‘learns’ to do a task using an algorithm. The key difference between these learning types is how the algorithm achieves the task or arrives at the right answer.
Before we continue, a few comments on language used within this piece: “Learning” for an algorithm refers to training the algorithm on a data set until it achieves an acceptable degree of accuracy. What is acceptable will depend on various factors including the end application of the algorithm. For example, if one were training an algorithm to detect cancer, one would want the highest degree of confidence possible, to avoid needlessly stressing patients. On the other hand, if one were to train an algorithm to detect whether a picture contained a jar of jam or a lemon meringue pie, the stakes are much lower.
When you use Supervised Learning (SL) algorithms, the algorithm is trained on labelled data. This is usually done with the help of training sets — data sets that are already labelled with the right answer — which the algorithm uses to figure out how to get to that answer. Then, when the algorithm encounters new, unlabelled data, it can use what it’s learnt to label the new data correctly. Or, if it doesn’t get it right, the training set ‘corrects’ it and the algorithm keeps learning. For example, in the case of the jam jar/meringue quandary posed earlier, we would have a data set of 10,000 photos. Half would contain jam, half would contain meringue, and they would be tagged as such. We would take 9,000 of the photos for training. The algorithm would “look” at each picture, and slowly it would be trained to recognise each image. Once trained, if it were shown an unlabelled picture (from the 1,000 we thoughtfully didn’t train it on), the algorithm would be able to output whether it thought the photo contained a jar of jam or a lemon meringue pie, along with a degree of confidence, expressed as a percentage.
Is it jam or meringue? Perhaps not the most revelatory application of Machine Learning.
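To make this concrete, here is a toy sketch (not from the piece above) of supervised learning in action: a tiny 1-nearest-neighbour classifier that labels a new point with the label of its closest training example. The two features and all the numbers are invented purely for illustration.

```python
# A toy illustration of supervised learning: a 1-nearest-neighbour
# classifier trained on labelled examples. The features and values
# here are invented for illustration.

def predict(training_set, new_point):
    """Label a new point with the label of its closest training example."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    closest = min(training_set, key=lambda example: distance(example[0], new_point))
    return closest[1]

# Labelled training data: (features, label) pairs.
# Imagine the features are, say, average colour and roundness.
training_set = [
    ((0.9, 0.2), "jam jar"),
    ((0.8, 0.3), "jam jar"),
    ((0.2, 0.9), "meringue pie"),
    ((0.1, 0.8), "meringue pie"),
]

print(predict(training_set, (0.85, 0.25)))  # prints "jam jar"
```

A real classifier would use thousands of examples and far richer features, but the principle is the same: the labels in the training set are what make the learning “supervised”.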
Unlike in Supervised Learning, algorithms that use Unsupervised Learning (UL) do not start with labelled data. They are trained on unlabelled data (it is in this sense that they are “unsupervised”). In the example of jam jars and meringues, they would be shown the same dataset, this time without the labels. During training, the algorithm instead has to look for relationships in the data. By doing this, the algorithm can either sort the data into groups based on common features or try to find rules that describe as much of the data as it can. Then it’s up to the user to decide whether the algorithm’s answer is useful. If it is, then that’s great! If not, then the user can run a different algorithm to see what kind of answer it comes up with. We know what that means for jam that comes in a jar and pies of the lemon meringue ilk. In a business sense, you could train an algorithm on your customer data and find relationships that you didn’t know existed, which could lead to new opportunities. A famous example of this was Netflix realising that geography and age were not accurate indicators of taste from person to person. Instead, if they clustered their audience by what they had watched, then recommended other shows that viewers in the same cluster had watched, they could greatly improve the accuracy of their recommendation system.
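As an illustrative sketch (again, not from the piece itself), here is what clustering unlabelled data can look like: a bare-bones k-means on made-up one-dimensional numbers, where the algorithm is never told which group anything belongs to and discovers the two groups on its own.

```python
# A minimal sketch of unsupervised clustering: k-means on unlabelled
# 1-D data. No labels are ever provided; the algorithm finds the
# groups itself. All numbers are invented for illustration.

def kmeans(points, centroids, steps=10):
    """Repeatedly assign points to the nearest centroid, then move
    each centroid to the mean of its assigned points."""
    for _ in range(steps):
        clusters = {c: [] for c in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        centroids = [sum(members) / len(members) if members else centroids[c]
                     for c, members in clusters.items()]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]  # no labels attached
centroids, clusters = kmeans(points, [0.0, 10.0])
print(sorted(round(c, 2) for c in centroids))  # two group centres emerge
```

Whether those two groups mean anything useful — jam versus meringue, or two distinct customer segments — is, as noted above, a judgement left to the user.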
If you’ve followed me thus far, I’m excited to tell you that Reinforcement Learning is completely different to Supervised and Unsupervised Learning.
When it comes to Reinforcement Learning (RL), RL algorithms (also called agents) act upon an environment. They take actions that influence the environment they’re in and are rewarded or penalised based on these decisions. The ultimate goal is for the algorithm to achieve a particular task while maximising the reward it receives.
Training takes longer, and often takes place in simulation instead of the real world, but the results tend to be better, and much closer to how we usually think of intelligence than the previous schools. Much of the self-driving car technology you have seen is trained in this way.
Want more information and applications?
In Supervised Learning, the algorithm is able to map an input (some data) onto an output (a label). These algorithms fall into two categories, depending on what kind of output they produce: classification and regression. As you might’ve guessed, classification algorithms produce outputs that belong to a particular category (which would’ve been given in the training set). On the other hand, the output you get from a regression algorithm is a real value, such as dollars or weight. For this reason, you’ll find that Supervised Learning is used in classification problems — think everything from image recognition to fake news detection — and regression tasks, such as creating forecasts or predictions.
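To show the regression side of that split (a sketch with invented numbers, not taken from the piece), here is an ordinary least-squares line fit: unlike the classifier earlier, its output is a real value rather than a category.

```python
# A sketch contrasting the two output types: a classifier returns a
# category, while a regressor returns a real value. Here a simple
# least-squares line fit predicts a number from invented (x, y) data.

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

xs = [1, 2, 3, 4]          # e.g. advertising spend
ys = [2.1, 3.9, 6.1, 8.0]  # e.g. sales
slope, intercept = fit_line(xs, ys)
print(round(slope * 5 + intercept, 1))  # a real-valued forecast for x = 5
```

The "dollars or weight" outputs mentioned above come from exactly this kind of fitted relationship, just with many more inputs.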
With Unsupervised Learning, we already know that the algorithm looks for relationships in the data to produce an output. And, just like SL, there are two ways the algorithm can do this: clustering or association. In clustering, the data is separated into groups based on similar features, whereas association rules sort data by looking at how different data points relate to and depend on each other (or, how they associate). As mentioned above, because it looks at the features of the data, Unsupervised Learning is often used in customer analysis — where customers that behave in similar ways will be grouped together — as well as fraud detection and recommendation systems.
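And for the association side, here is a toy sketch (with made-up shopping baskets) of the idea behind association rules: counting which items appear together most often, the seed of "customers who buy X also buy Y" recommendations.

```python
# A toy sketch of association: counting how often items are bought
# together in invented shopping baskets. Real association-rule mining
# adds measures like support and confidence on top of these counts.
from collections import Counter
from itertools import combinations

baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"bread", "butter", "eggs"},
    {"lemon", "meringue", "eggs"},
]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pair suggests the strongest association.
print(pair_counts.most_common(1))  # [(('bread', 'butter'), 3)]
```

Scaled up to millions of baskets, the same counting idea is what surfaces the unexpected relationships in customer data mentioned above.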
Turning now to RL, the way this application functions is quite different from SL and UL. In order for RL to work, the user needs to set up an environment and a system of rewards and penalties — as well as some rules — that the algorithm needs to work within. The more complicated the problem, the more complicated the environment/simulation. Once this is in place, the algorithm can start exploring the environment — which is made up of a series of states, one for each point in time — and interact with it by making decisions or taking actions.
These choices are guided by different strategies (or policies) which enable the algorithm to generate an output with a maximised reward. And, like SL and UL, there are two ways that RL can create an output: through positive reinforcement or negative reinforcement. For this reason, RL is often used in complex environments — think everything from helping robots learn new tasks and optimising chemical reactions to running simulations to reduce congestion or work out the kinks in self-driving cars. Plus, you can tackle even more complex problems by introducing multiple agents into your environment (an approach called Multi-Agent Reinforcement Learning).
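Putting those pieces together, here is a minimal, illustrative sketch of tabular Q-learning, one classic RL algorithm. The environment, rewards, and parameters are all invented for this example: an agent in a tiny five-state corridor learns, through reward and penalty alone, that stepping right is the best policy.

```python
# A minimal sketch of reinforcement learning: tabular Q-learning in a
# tiny corridor of 5 states. The agent starts at state 0 and earns +1
# only on reaching state 4; every other move costs a small penalty.
import random

random.seed(0)
n_states, actions = 5, [-1, +1]        # move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for _ in range(200):                   # 200 training episodes
    state = 0
    while state != n_states - 1:
        # Explore occasionally; otherwise act greedily on current Q values.
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: Q[(state, a)])
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else -0.01
        # Q-learning update: nudge the estimate toward
        # reward + discounted best future value.
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# After training, the greedy policy should step right from every state.
policy = [max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states - 1)]
print(policy)
```

Notice there are no labels anywhere: the agent learns purely from the rewards and penalties the environment hands back, which is exactly the setup described above.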
With all of this information in mind — which is quite a lot — it’s important to consider the pros and cons that come with each of these different applications.
The ups and downs
Supervised Learning is often praised for its (relative) simplicity and the fact that it can achieve high accuracy and reliability. SL is also useful for finding links between input and output data and for optimising performance criteria. But, it does require a lot of work from the user, particularly in the creation of the training set (which can also be laborious and/or expensive to get your hands on).
Because Unsupervised Learning (and Reinforcement Learning) doesn’t need labelled data, in some ways it can be a lot easier to undertake. Plus, you can use UL for more complex tasks than SL, and it can learn in real time. However, with independent learning comes a decrease in the accuracy and reliability of your outputs when compared to SL. And, because you won’t know how the data is going to be categorised until you get your output, you’ll need to figure out whether that particular arrangement of the data is useful or whether it needs to be reworked.
Finally, while RL has some of the same benefits as UL — like easily obtainable data and its usefulness for complex environments — it’s also preferred for getting long-term results that can be hard to obtain otherwise. The effort and knowledge required to start are much higher, but the results can be significantly better. One of the trickier elements of RL, though, is creating a simulation environment that’s realistic for the real-world application you are seeking to solve. This is particularly important where RL needs to transition to real-world applications — such as self-driving cars.
While there’s a lot to understand here if it’s your first exposure to these concepts, Supervised, Unsupervised and Reinforcement Learning are the cornerstones of machine learning and power all kinds of tasks — many of which you aren’t even aware of. Whether it’s telling the difference between pictures of cats and dogs, recommending products, or reducing road congestion, ML is having a huge impact on the world.
Who are we?
Remi AI is an Artificial Intelligence Research Firm with offices in Sydney and San Francisco. We have delivered inventory and supply chain projects across FMCG, automotive, industrial and corporate supply and more.
Want to know more? Sign up to our newsletter for the latest information from us and other knowledgeable folk in the market.