0:00

In this video, I'm gonna talk about three different types of machine learning: supervised learning, reinforcement learning, and unsupervised learning. Broadly speaking, the first half of the course will be about supervised learning. The second half of the course will be mainly about unsupervised learning, and reinforcement learning will not be covered in the course, because we can't cover everything.

Learning can be divided into three broad groups of algorithms. In supervised learning, you're trying to predict an output when given an input vector, so it's very clear what the point of supervised learning is. In reinforcement learning, you're trying to select actions or sequences of actions to maximize the rewards you get, and the rewards may only occur occasionally. In unsupervised learning, you're trying to discover a good internal representation of the input, and we'll come later to what that might mean.

Supervised learning itself comes in two

different flavors. In regression, the target output is a real number or a whole vector of real numbers, such as the price of a stock in six months' time, or the temperature at noon tomorrow. The aim is to get as close as you can to the correct real number. In classification, the target output is a class label. The simplest case is a choice between one and zero, between positive and negative cases. But obviously, we can have multiple alternative labels, as when we're classifying handwritten digits.

Supervised learning works by initially selecting a model class, that is, a whole set of models that we're prepared to consider as candidates. You can think of a model class as a function that takes an input vector and some parameters and gives you an output y. So a model class is simply a way of mapping an input to an output using some numerical parameters W, and then we adjust these numerical parameters to make the mapping fit the supervised training data.

What we mean by fit is minimizing the discrepancy between the target output on each training case and the actual output produced by the machine learning system. An obvious measure of that discrepancy, if we're using real values as outputs, is half the squared difference between the output from our system, y, and the correct output, t; we put in that one-half so it cancels the two when we differentiate. For classification you could use that measure, but there are other, more sensible measures which we'll come to later, and these more sensible measures typically work better as well.
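As a concrete sketch of this setup (illustrative only: a linear model class fitted to made-up toy data by gradient descent on the half squared error):

```python
import numpy as np

# Toy supervised training data: input vectors X and real-valued targets t.
# (Illustrative: the targets come from a noisy linear rule.)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
true_w = np.array([2.0, -3.0])
t = X @ true_w + 0.1 * rng.standard_normal(100)

# Model class: linear mappings y = f(x; W) = x . W.
w = np.zeros(2)

# Adjust W to minimize E = 1/2 * sum_n (y_n - t_n)^2.
# The one-half cancels the two when we differentiate:
# dE/dW = sum_n (y_n - t_n) * x_n.
for _ in range(500):
    y = X @ w
    w -= 0.1 * X.T @ (y - t) / len(X)

print(w)  # approximately recovers true_w
```

The model class, learning rate, and data here are all invented for illustration; the point is only the shape of the recipe: pick a model class, then adjust its parameters to reduce the half squared discrepancy on the training cases.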

In reinforcement learning, the output is an actual sequence of actions, and you have to decide on those actions based on occasional rewards. The goal in selecting each action is to maximize the expected sum of the future rewards, and we typically use a discount factor so that you don't have to look too far into the future. We say that rewards far in the future don't count for as much as rewards that you get fairly quickly.
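A minimal sketch of that discounting (the reward sequence and discount factor here are made up for illustration):

```python
# Discounted return: G = r_0 + gamma*r_1 + gamma^2*r_2 + ...
# Rewards far in the future count for less than rewards you get soon.
def discounted_return(rewards, gamma=0.9):
    g = 0.0
    for r in reversed(rewards):  # accumulate from the last reward backwards
        g = r + gamma * g
    return g

# Occasional reward: nothing until a single reward of 1.0 at step 3,
# so the return seen from step 0 is gamma**3, about 0.729.
print(discounted_return([0.0, 0.0, 0.0, 1.0]))
```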

Reinforcement learning is difficult. It's difficult because the rewards are typically delayed, so it's hard to know exactly which action was the wrong one in a long sequence of actions. It's also difficult because a scalar reward, especially one that only occurs occasionally, does not supply much information on which to base the changes in parameters. So typically, you can't learn millions of parameters using reinforcement learning, whereas in supervised learning and unsupervised learning, you can. Typically, in reinforcement learning, you're trying to learn dozens of parameters, or maybe a thousand parameters, but not millions.

In this course, we can't cover everything, and so we're not going to cover reinforcement learning, even though it's an important topic. Unsupervised learning is going to be covered in the second half of the course.

For about 40 years, the machine learning community basically ignored unsupervised learning, except for one very limited form called clustering. In fact, they used definitions of machine learning that excluded it. So they defined machine learning, in some textbooks, as mapping from inputs to outputs. And many researchers thought that clustering was the only form of unsupervised learning.

One reason for this is that it's hard to say what the aim of unsupervised learning is. One major aim is to get an internal representation of the input that is useful for subsequent supervised or reinforcement learning. And the reason we might want to do that in two stages is that we don't want to use, for example, the payoffs from reinforcement learning in order to set the parameters for our visual system. So you can compute the distance to a surface by using the disparity between the images you get in your two eyes. But you don't want to learn to do that computation of distance by repeatedly stubbing your toe and adjusting the parameters in your visual system every time you stub your toe. That would involve stubbing your toe a very large number of times, and there are much better ways to learn to fuse two images, based purely on the information in the inputs.

Other goals for unsupervised learning are

to provide compact, low-dimensional representations of the input. High-dimensional inputs like images typically live on or near a low-dimensional manifold, or several such manifolds in the case of handwritten digits. What that means is that even if you have a million pixels, there aren't really a million degrees of freedom in what can happen; there may only be a few hundred. So what we want to do is move from a million pixels to a representation of those few hundred degrees of freedom, which amounts to saying where we are on a manifold. Also, we need to know which manifold we're on.

A very limited form of this is principal components analysis, which is linear.
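As an illustrative sketch of that linear case (the data are made up: 3-D points lying near a 2-D plane, with the principal directions found via the SVD):

```python
import numpy as np

# Toy data: 200 points with only 2 real degrees of freedom,
# embedded as a plane in 3-D, plus a little noise.
rng = np.random.default_rng(1)
coords = rng.standard_normal((200, 2))       # the true low-dim coordinates
basis = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, -1.0]])         # spans the 2-D plane in 3-D
X = coords @ basis + 0.01 * rng.standard_normal((200, 3))

# PCA via the singular value decomposition of the centred data.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Almost all the variance lies along the first two principal directions:
# the third singular value is tiny compared with the other two.
print(s)
```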

PCA assumes that there's one manifold, and the manifold is a plane in the high-dimensional space.

Another definition of unsupervised learning, or another goal for unsupervised learning, is to provide an economical representation for the input in terms of learned features. If, for example, we can represent the input in terms of binary features, that's typically economical, because then it takes only one bit to give the state of a binary feature. Alternatively, we could use a large number of real-valued features but insist that, for each input, almost all of those features are exactly zero. In that case, for each input, we only need to represent a few real numbers, and that's economical.

As I mentioned before, another definition of unsupervised learning, or another goal of unsupervised learning, is to find clusters in the input. Clustering can be viewed as a very sparse code: we have one feature per cluster, and we insist that all the features except one are zero and that one feature has a value of one. So clustering is really just an extreme case of finding sparse features.
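A small sketch of that view (the cluster centres and the nearest-centre assignment rule here are made up for illustration):

```python
import numpy as np

# Clustering as an extreme sparse code: one feature per cluster,
# exactly one feature is 1 and all the others are 0.
def one_hot_cluster_code(x, centers):
    # Hard assignment to the nearest cluster centre (illustrative rule).
    k = np.argmin(np.linalg.norm(centers - x, axis=1))
    code = np.zeros(len(centers))
    code[k] = 1.0
    return code

centers = np.array([[0.0, 0.0],
                    [5.0, 5.0],
                    [0.0, 5.0]])
# The point (4.8, 5.2) is nearest the second centre,
# so its code is [0. 1. 0.].
print(one_hot_cluster_code(np.array([4.8, 5.2]), centers))
```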
