A Beginner’s Guide to the Bayesian Neural Network

Written by Coursera Staff

Learn about neural networks, an exciting topic area within machine learning. Plus, explore what makes Bayesian neural networks different from traditional models and which situations require this approach.


Neural networks are a concept within machine learning centered around helping machines make decisions in a human-like way. However, just as people don’t rely on a single way of thinking through a decision, neural networks come in many designs, each with its advantages and limitations. One notable decision-making design is the Bayesian neural network, built to output informed predictions based on historical data.

In this article, we will explore what neural networks are in the context of machine learning, what the Bayesian neural network is, and when you might benefit from using this model. 

What are neural networks in machine learning?

Neural networks in machine learning are a set of algorithms designed to function similarly to neural pathways within the human brain. The networks recognize patterns, “perceiving” and interpreting sensory data. These algorithms are a key foundation of machine learning, enabling computers to learn how to perform tasks like image processing, behavior pattern recognition, and prediction generation.

Read more: 3 Types of Machine Learning You Should Know

How do neural networks work?

A neural network is a network of layered nodes or “neurons.” Each node represents a mathematical operation. These networks contain three types of layers: the input layer, which first receives the information; the hidden layers, where the network completes its computations; and the output layer, which produces the final result of the algorithm.

When you feed data into a neural network, the input layer receives and categorizes the information before passing it to the hidden layers. The hidden layers progressively analyze the information, and the output layer produces the result in the specified format or formats. The weights and biases applied at each node determine the output.
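To make that flow concrete, here is a minimal sketch in Python (using NumPy) of data passing through an input layer, one hidden layer, and an output layer. The layer sizes, activation choice, and random starting weights are illustrative assumptions, not code from any particular framework or course:

```python
# A minimal sketch of a forward pass through a tiny neural network
# with one hidden layer. All sizes and values are illustrative.
import numpy as np

def relu(z):
    return np.maximum(0, z)          # a common hidden-layer activation

x = np.array([0.5, -1.2, 3.0])       # input layer: 3 features

W1 = np.random.randn(4, 3) * 0.1     # hidden layer: 4 nodes
b1 = np.zeros(4)
h = relu(W1 @ x + b1)                # hidden-layer computation

W2 = np.random.randn(1, 4) * 0.1     # output layer: 1 node
b2 = np.zeros(1)
y = W2 @ h + b2                      # final output of the network

print(y)
```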

To learn how to produce accurate outputs, neural networks go through a training process. The network makes predictions on the data, compares them against the actual results, and adjusts its weights and biases to minimize errors. The model repeats this process many times, and the network gets progressively better at producing accurate results. One common training algorithm for neural networks is backpropagation: based on how the output compares to the actual values, the algorithm retraces its steps through the network and adjusts weights as needed.
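The loop below is a minimal sketch of that training cycle for a single linear neuron: predict, compare against the true values, and adjust the weight and bias along the error gradient, the core step that backpropagation generalizes to many layers. The data, learning rate, and step count are illustrative assumptions:

```python
# A minimal sketch of the training loop: predict, measure the error,
# and nudge the weight and bias to reduce it. Values are illustrative.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])   # inputs
t = 2.0 * X + 1.0                    # targets the neuron should learn

w, b, lr = 0.0, 0.0, 0.01            # arbitrary starting weight and bias

for _ in range(5000):
    y = w * X + b                    # forward pass: make predictions
    err = y - t                      # compare against actual results
    # gradients of the mean squared error with respect to w and b
    grad_w = 2 * np.mean(err * X)
    grad_b = 2 * np.mean(err)
    w -= lr * grad_w                 # adjust weights to minimize error
    b -= lr * grad_b

print(w, b)                          # approaches w ≈ 2.0, b ≈ 1.0
```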

Read more: What Is a Feedforward Neural Network?

What is a Bayesian neural network?

The Bayesian neural network (BNN) model is an extension of a traditional neural network model. Each weight is a probability distribution rather than a single number, meaning the network considers a range of possible values for each weight, adding a layer of probabilistic reasoning.

The major difference lies in how BNNs handle weights and biases. Instead of fixed values, BNNs use probability distributions, allowing them to express uncertainty and continuously update their beliefs. This allows the algorithm to assess the probability of each prediction and provide greater insight into the accuracy of the output.

The probability distributions in BNNs allow them to learn from the data and understand the confidence in what they have learned. This makes BNNs more robust and flexible, especially when you don’t have a high volume of data or the data has noise.
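As a rough illustration, the sketch below treats each weight as an independent Gaussian distribution (an assumption chosen for simplicity) and averages predictions over many sampled weights, yielding both an estimate and a measure of confidence in it:

```python
# A minimal sketch of the core BNN idea: each weight is a probability
# distribution rather than a fixed number. All values are illustrative.
import numpy as np

rng = np.random.default_rng(0)

x = np.array([0.5, -1.2, 3.0])       # a single input example

mu = rng.normal(size=3) * 0.1        # mean of each weight's distribution
sigma = np.full(3, 0.3)              # spread: the model's uncertainty

preds = []
for _ in range(1000):
    w = rng.normal(mu, sigma)        # draw one set of weights
    preds.append(w @ x)              # predict with this sample

preds = np.array(preds)
print("prediction:", preds.mean())   # most likely outcome
print("uncertainty:", preds.std())   # confidence around that outcome
```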

Read more: Deep Learning vs. Machine Learning: A Beginner’s Guide

How is the probability calculated?

In Bayesian statistics, you start with a prior belief or assumption about a parameter, known as the prior distribution. It is your initial belief about the parameter before seeing the data. When new data comes in, you use the likelihood of observing this new data to form the “posterior distribution.” This is the updated belief about the parameter after assessing the new data, and it combines the prior and the likelihood.
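In symbols, this update is Bayes’ theorem, where θ stands for the parameter (for a BNN, the network’s weights) and D for the observed data:

```latex
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}
```

The posterior is proportional to the likelihood times the prior; the denominator p(D), known as the evidence, normalizes the result.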

Computing the exact posterior distribution can be very complex or sometimes impossible, especially with large datasets or complicated models. Approximation provides a practical solution to this problem.

You can choose between several methods for approximations. Two popular methods include the following:

  • Markov Chain Monte Carlo (MCMC): This technique draws random samples from the posterior distribution. With enough draws, the collected samples represent the distribution as a whole (see the sketch after this list).

  • Variational inference: This method approximates the posterior by finding a simpler distribution that is close to the true posterior. This is typically faster than MCMC.
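As a sketch of the first method, the Metropolis-Hastings algorithm below samples the posterior of a single parameter. The prior, likelihood, and data are illustrative assumptions; a real BNN runs the same idea over all of its weights at once:

```python
# A minimal sketch of MCMC sampling (Metropolis-Hastings) for one
# parameter. The prior, likelihood, and data are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(1.5, 1.0, size=20)       # "observed" data

def log_posterior(theta):
    log_prior = -0.5 * theta**2                      # N(0, 1) prior
    log_lik = -0.5 * np.sum((data - theta) ** 2)     # N(theta, 1) likelihood
    return log_prior + log_lik

samples = []
theta = 0.0                                # arbitrary starting point
for _ in range(5000):
    proposal = theta + rng.normal(0, 0.5)  # random step from current value
    # accept the step with probability min(1, posterior ratio)
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    samples.append(theta)

# the collected samples approximate the posterior distribution
print(np.mean(samples), np.std(samples))
```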

When should you use a BNN?

Bayesian models assess the likelihood of different values, and their predictions can shift as beliefs update over time, unlike traditional algorithms, which produce the same results each time. Because of this, certain applications may benefit from using this model over other types of neural networks. Some scenarios where you may find that BNNs are the right choice for you include:

  • If your task requires making predictions and understanding how confident those predictions are: BNNs provide not only the most likely outcome but also a measure of certainty around that outcome.

  • If you have a small or noisy data set: In cases where the available data is limited or noisy, traditional neural networks might overfit and make unreliable predictions. BNNs, with their probabilistic approach, can better handle such data by effectively capturing the underlying uncertainty. While BNNs are often too computationally intensive for use with big data sets, they work well with smaller data sets that have more noise or lower fidelity.

  • If you are dealing with safety concerns: In fields like health care, population health, finance, or autonomous vehicle technology, where incorrect predictions can have serious consequences, BNNs are valuable. You have information about each prediction's likelihood, allowing for a more cautious approach. For example, you may want to know what the likelihood of a patient outcome or health concern is rather than just a binary prediction.

Advantages and limitations of BNNs

BNNs come with unique advantages, particularly in scenarios where traditional neural networks might fall short. Some benefits of BNNs include the following:

  • Can train with less training data than traditional models

  • Provide a measure of model uncertainty so stakeholders can make more informed decisions

  • Recognize the limits of their own prediction accuracy, allowing for more informed decision-making

However, BNNs have limitations, which may lead to challenges depending on your needs. One limitation is that BNNs require you to choose prior distributions over the weights, and translating beliefs about the function you want the network to learn into priors over individual weights can be difficult. BNNs can also be hard to scale and are more complex and computationally intensive than traditional models, and it may be unclear why certain predictions are more accurate than others.

Learn more with Coursera

With the Deep Learning Specialization, you can keep exploring the changing world of deep learning on the Coursera learning platform. This self-guided Specialization takes around three months to complete. It covers compelling topics within deep learning, including neural networks, sequence models, and structuring techniques for machine learning models.


This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.