0:00

In the previous video, you saw how looking at training error and dev error can help you diagnose whether your algorithm has a bias or a variance problem, or maybe both. It turns out that this information lets you much more systematically improve your algorithm's performance, using what I call a basic recipe for machine learning. Let's take a look.

When training a neural network, here's the basic recipe I use. After having trained an initial model, I first ask: does the algorithm have high bias? To evaluate whether there is high bias, you should look at the training set performance.

If it does have high bias, meaning it does not even fit the training set that well, some things you could try are a bigger network, such as one with more hidden layers or more hidden units; training it longer; or more advanced optimization algorithms, which we'll talk about later in this course. You could also try a different neural network architecture. We'll see later that there are a lot of different architectures, and maybe you can find one that's better suited for this problem. I put this in parentheses because it's one of those things you just have to try: maybe you can make it work, maybe not. Whereas getting a bigger network almost always helps, and training longer doesn't always help, but it certainly never hurts.

So when training a learning algorithm, I would try these things until I can at least get rid of the bias problem: keep going back and trying again until I can fit the training set pretty well. Usually, if you have a big enough network, you should be able to fit the training data well, so long as it's a problem that is possible for someone to solve. If the images are very blurry, it may be impossible to fit them. But if at least a human can do well on the task, that is, if you think Bayes error is not too high, then by training a big enough network you should hopefully be able to do well, at least on the training set: to at least fit, or even overfit, the training set.
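As a concrete illustration (my own sketch, not the lecture's), the bias check amounts to comparing training error against an estimate of Bayes error, such as human-level error; the function name and the tolerance value here are assumptions for the example:

```python
def diagnose_bias(train_error, bayes_error_estimate, tolerance=0.02):
    """Flag a high-bias (underfitting) problem when training error is
    well above the best achievable (Bayes) error.

    All errors are fractions in [0, 1]; `tolerance` is an illustrative
    threshold, not a value from the lecture.
    """
    avoidable_bias = train_error - bayes_error_estimate
    return avoidable_bias > tolerance

# Example: 15% training error vs ~1% human-level error -> high bias.
print(diagnose_bias(0.15, 0.01))  # True
print(diagnose_bias(0.02, 0.01))  # False: training error close to Bayes
```

The gap between training error and estimated Bayes error is what the actions in this step (bigger network, longer training) are meant to shrink.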

Once you have reduced bias to acceptable levels, then ask: do you have a variance problem? To evaluate that, I would look at dev set performance: are you able to generalize from a pretty good training set performance to a pretty good dev set performance? If you have high variance, the best way to solve it is to get more data, if you can get it, since that can only help. But sometimes you can't get more data. Alternatively, you could try regularization, which we'll talk about in the next video, to try to reduce overfitting.

And again, sometimes you just have to try it, but finding a more appropriate neural network architecture can sometimes reduce your variance problem as well as your bias problem. It's harder to be totally systematic about how you do that. So I try these things, and I keep going back, until hopefully I find something with both low bias and low variance, whereupon I'm done.
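The loop just described can be sketched in code. Everything here is an illustrative assumption: the error functions, the fix functions, the thresholds, and the toy error model used to exercise it are placeholders, not part of the lecture:

```python
def basic_recipe(train_error_fn, dev_error_fn, grow_model, add_data,
                 state, bayes_error=0.01, gap_tol=0.02, max_iters=20):
    """Sketch of the basic recipe: drive down bias first (training-set
    performance), then variance (dev-set performance). All helpers are
    hypothetical callables supplied by the caller."""
    for _ in range(max_iters):
        if train_error_fn(state) - bayes_error > gap_tol:
            state = grow_model(state)   # high bias: bigger network, train longer
        elif dev_error_fn(state) - train_error_fn(state) > gap_tol:
            state = add_data(state)     # high variance: more data, regularization
        else:
            return state                # low bias and low variance: done
    return state

# Toy simulation: state = (model capacity, number of examples), with an
# assumed error model where capacity shrinks bias and data shrinks variance.
def train_err(s): return max(0.01, 0.30 / s[0])
def dev_err(s):   return train_err(s) + min(0.25, 5.0 / s[1])
grow = lambda s: (s[0] * 2, s[1])
more = lambda s: (s[0], s[1] * 2)

final = basic_recipe(train_err, dev_err, grow, more, (1, 20))
print(final)  # (16, 320): capacity grew until bias was low, then data grew
```

Note the ordering matches the lecture: the bias branch is checked first, because getting more data (the variance fix) does nothing for a model that cannot even fit the training set.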

So, a couple of points to notice. First, depending on whether you have high bias or high variance, the set of things you should try could be quite different. I usually use the training set and the dev set to diagnose whether I have a bias or a variance problem, and then use that to select the appropriate subset of things to try. For example, if you actually have a high bias problem, getting more training data is not going to help, or at least it's not the most efficient thing to do. So being clear on how much of a bias problem or a variance problem you have, or both, helps you focus on selecting the most useful things to try.

Second, in the earlier era of machine learning, there used to be a lot of discussion about what is called the bias-variance tradeoff. The reason was that, for a lot of the things you could try, you would increase bias and reduce variance, or reduce bias and increase variance. Back in the pre-deep-learning era, we didn't have as many tools that just reduce bias, or just reduce variance, without hurting the other. But in the modern deep learning, big data era, so long as you can keep training a bigger network, and so long as you can keep getting more data (which isn't always the case for either of these), getting a bigger network almost always just reduces your bias without necessarily hurting your variance, so long as you regularize appropriately. And getting more data pretty much always reduces your variance and doesn't hurt your bias much.

So what's really happened is that, with these two tools, training a bigger network and getting more data, we can now drive down bias, and just bias, or drive down variance, and just variance, without really hurting the other that much. I think this has been one of the big reasons deep learning has been so useful for supervised learning: there's much less of this tradeoff where you have to carefully balance bias and variance, and you have more options for reducing one without necessarily increasing the other. And in fact, so long as you have a well-regularized network (we'll talk about regularization starting from the next video), training a bigger network almost never hurts. The main cost of training a neural network that's too big is just computational time, so long as you're regularizing.
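Regularization itself is the subject of the next video, but as a preview-style sketch (my own illustration, not the lecture's code), the kind of regularization that makes "bigger almost never hurts" true just adds a weight-penalty term to the cost; the function name and toy values are assumptions:

```python
import numpy as np

def l2_regularized_cost(base_cost, weights, lam, m):
    """Illustrative sketch: add the L2 penalty (lambda / (2m)) * sum ||W||^2
    to an unregularized cost, so that extra capacity is penalized rather
    than free to overfit."""
    penalty = (lam / (2 * m)) * sum(float(np.sum(W ** 2)) for W in weights)
    return base_cost + penalty

W1 = np.ones((3, 2))   # toy weight matrices for a two-layer network
W2 = np.ones((1, 3))
cost = l2_regularized_cost(0.5, [W1, W2], lam=0.1, m=10)
print(round(cost, 3))  # 0.5 + (0.1 / 20) * 9 = 0.545
```

With a penalty like this in the cost, adding layers or units increases capacity but also increases the penalty on large weights, which is what keeps the bigger network from hurting dev set performance.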

So I hope this gives you a sense of the basic structure of how to organize your machine learning problem: diagnose bias and variance, and then select the right operation to make progress. One thing I mentioned several times in this video is regularization, which is a very useful technique for reducing variance. There is a little bit of a bias-variance tradeoff when you use regularization: it might increase bias a little, although often not by much if you have a big enough network. Let's dive into the details in the next video so you can better understand how to apply regularization to your neural network.
