This course covers the design, acquisition, and analysis of Functional Magnetic Resonance Imaging (fMRI) data. A book related to the class can be found here: https://leanpub.com/principlesoffmri

From the course by Johns Hopkins University

Principles of fMRI 1

From the lesson

Week 3

This week we will discuss the General Linear Model (GLM).

- Martin Lindquist, PhD, MSc, Professor, Biostatistics, Bloomberg School of Public Health | Johns Hopkins University

- Tor Wager, PhD, Department of Psychology and Neuroscience, The Institute of Cognitive Science | University of Colorado at Boulder

Hi.

In this module we're going to be talking about noise models that can be included

in the GLM.

So, again to recap, a standard GLM is written in the following format.

We have Y, which is the fMRI data, X, which is the design matrix, beta,

which is the regression coefficients, and epsilon, which is the noise.
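
Written out, the model just described can be sketched as follows. The slide itself is not part of the transcript, so the notation is an assumption: n time points, p regressors, and normally distributed noise whose variance-covariance matrix is the V that the lecture turns to next.

```latex
Y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, V),
\qquad Y \in \mathbb{R}^{n},\; X \in \mathbb{R}^{n \times p},\; \beta \in \mathbb{R}^{p}
```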

So far in the past couple of modules, we've been talking about model building

and how to get the best possible design matrix X.

So now what we're going to do is we're going to focus more

on the variance-covariance matrix V.

So in the last module, we kind of assumed that V was an identity matrix,

and that we had IID noise, but in practice in fMRI data analyses,

that's typically not true, and we have a sort of autocorrelation between

adjacent time points, and so we have to take that into consideration.

And so, again, this autocorrelation is typically caused by physiological

noise and low frequency drift which hasn't been appropriately modeled.

And so typically in fMRI, we use either an AR(p) model, an autoregressive

process of order p, or an ARMA(1,1) model, an

autoregressive moving average model of order (1,1), to model the noise.

And so single-subject statistics are not going to be valid without

an accurate model of the noise, so it's quite important.

So let's focus on one particular type of noise model and

how it's included in the GLM analysis.

We're focusing here mainly on the AR(1) model, so autoregressive model of order 1.

So serial correlation is often modeled using such a first-order autoregressive

model, and here we have epsilon t is equal to phi times epsilon t-1 plus u of t,

where u of t is simply normal with mean 0 and variance sigma squared.

So here, the error term epsilon t depends on the previous error term epsilon t-1 and

a new disturbance term, which is given by this u of t.

So this kind of looks like a regression model,

and that's kind of where it comes from, autoregressive, it's self regressing.

So basically it's regressing on its lagged value here.
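
A minimal simulation sketch of this recursion, assuming NumPy. The function name `simulate_ar1`, the seed, and the choice to draw the first value from the stationary distribution are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def simulate_ar1(n, phi, sigma=1.0, seed=0):
    """Simulate eps_t = phi * eps_{t-1} + u_t with u_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sigma, size=n)
    eps = np.empty(n)
    # Start from the stationary distribution (assumes |phi| < 1).
    eps[0] = u[0] / np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        # Each error is the previous error scaled by phi plus a new disturbance.
        eps[t] = phi * eps[t - 1] + u[t]
    return eps

eps = simulate_ar1(n=5000, phi=0.7)
# Sample correlation between each point and its predecessor; should be near phi.
lag1 = np.corrcoef(eps[:-1], eps[1:])[0, 1]
print(f"lag-1 sample correlation: {lag1:.3f}")
```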

So basically, if we look at the autocorrelation function for

an AR(1) process, we see that the autocorrelation between two time

points depends on how close those time points are to one another.

So basically, the autocorrelation is equal to 1 if the lag is 0.

So it's of course perfectly correlated with itself.

However, if we have a lag of one time point,

the auto-correlation is now equal to phi, which is a constant of the model.

And then it decays according to phi to the power of h, where h is the lag.

So if phi is equal to 0.7, we'll see that adjacent time points,

directly adjacent time points, will have an autocorrelation of 0.7.

If the time points are two points away from each other,

the autocorrelation will be 0.7 squared, so 0.49, and

the autocorrelation decays as we move farther and farther away from each other.
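
The decay can be tabulated directly. This is a small sketch assuming NumPy; the helper name `ar1_acf` is hypothetical.

```python
import numpy as np

def ar1_acf(phi, max_lag):
    """Theoretical AR(1) autocorrelation phi**h for lags h = 0..max_lag."""
    lags = np.arange(max_lag + 1)
    return phi ** lags

acf = ar1_acf(0.7, 4)
for h, rho in enumerate(acf):
    print(f"lag {h}: {rho:.4f}")
# Lags 0 through 4 give approximately 1, 0.7, 0.49, 0.343, 0.2401.
```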

So again, the format of V which we include and which is important in the estimation

of the GLM model, it will depend on the noise model that we use.

So in the IID case, V is just equal to identity and

everything is nice and simple.

However, in the AR(1) case we have to incorporate this autocorrelation between time points,

which depends on how close the time points are to each other.

So here V will look like the following case, so it will look a little bit more

complicated and it will also be a bit more complicated to estimate because

now we also have to estimate phi, which is this autocorrelation component.
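
One way to picture this V is as a matrix whose (i, j) entry is phi to the power |i - j|. The sketch below assumes NumPy and builds the correlation structure only; the overall variance scaling and the exact layout shown on the slide are not in the transcript, and `ar1_correlation_matrix` is a hypothetical name.

```python
import numpy as np

def ar1_correlation_matrix(n, phi):
    """n x n matrix with entry (i, j) equal to phi ** |i - j|."""
    idx = np.arange(n)
    return phi ** np.abs(idx[:, None] - idx[None, :])

V = ar1_correlation_matrix(4, 0.7)
print(np.round(V, 3))

# With phi = 0 the structure collapses to the IID case, V = identity.
V_iid = ar1_correlation_matrix(4, 0.0)
```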

So, how does this fit into the GLM estimation that we

talked about in the last module?

Well, again, so this is the GLM model.

And this is the estimate.

So, basically we need to incorporate V, this autocorrelation,

this variance-covariance matrix of this format, into the estimate here.
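
The estimate on the slide is not reproduced in the transcript. Assuming it is the standard generalized least-squares form, beta hat = (X' V^-1 X)^-1 X' V^-1 Y, a minimal NumPy sketch looks like this; the function name `gls_estimate` and the toy data are illustrative.

```python
import numpy as np

def gls_estimate(X, y, V):
    """Generalized least squares: beta_hat = (X' V^-1 X)^-1 X' V^-1 y."""
    # Solve linear systems instead of forming V^-1 explicitly.
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    beta_hat = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)
    residuals = y - X @ beta_hat
    return beta_hat, residuals

# Toy data: an intercept plus one regressor, with a little white noise.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.1, size=n)

# With V = identity, GLS reduces to ordinary least squares.
beta_ols, resid = gls_estimate(X, y, np.eye(n))
```

The residuals returned here are what would feed the estimate of V.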

And then using that, we can get the residuals, and

use the residuals to estimate things about the variance-covariance matrix.

So this is a tricky thing.

So this is what we would do if we knew the form.

If we knew what V was exactly, if we knew what all the components of V were,

we could just estimate beta as in the previous slide.

But in general, the form of the variance-covariance matrix is unknown,

which means that it has to be estimated.

This creates sort of a chicken and egg problem,

because estimating V depends on the betas, and estimating the betas depends on V.

So in order to kind of solve this chicken and egg problem,

we need an iterative procedure.

So we begin by assuming a value of V, estimating beta,

and then updating our estimate of V, and then iterating between the two.

So, we need, therefore, methods for estimating the variance components,

and there are many such methods that we can use in practice.

These include methods of moments, maximum likelihood methods, and

restricted maximum likelihood methods.

So here's how one such iterative procedure might work.

So, we begin by assuming that V, the variance-covariance matrix,

is just equal to identity.

So, we assume that the data are uncorrelated with each other.

And we just simply calculate the ordinary least-squares solution.

Doing that, we can now estimate the parameters of V

using the residuals that we got from the OLS.

And then we can re-estimate the beta value using the estimated covariance matrix

V hat, which we obtained in step 2.

Finally, we would iterate this procedure, estimating V hat and

beta hat and alternating between the two until we find some sort of convergence.

So that's sort of a standard way of doing this type of thing.
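
The steps above can be sketched for an AR(1) noise model as follows. This is one possible implementation under stated assumptions, not necessarily what any fMRI package does: V is taken to be the AR(1) correlation matrix, phi is updated with a lag-1 moment estimate from the residuals, and the names `ar1_corr` and `fit_glm_ar1`, the tolerance, and the synthetic data are all illustrative.

```python
import numpy as np

def ar1_corr(n, phi):
    """AR(1) correlation matrix with entries phi ** |i - j|."""
    idx = np.arange(n)
    return phi ** np.abs(idx[:, None] - idx[None, :])

def fit_glm_ar1(X, y, max_iter=20, tol=1e-6):
    """Alternate between GLS for beta and a lag-1 estimate of phi."""
    n = len(y)
    phi = 0.0  # Step 1: start from V = identity, so the first pass is OLS.
    for _ in range(max_iter):
        V = ar1_corr(n, phi)
        Vinv_X = np.linalg.solve(V, X)
        # Step 3 (OLS on the first pass): beta = (X'V^-1 X)^-1 X'V^-1 y.
        beta = np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)
        # Step 2: re-estimate phi from the current residuals.
        r = y - X @ beta
        r = r - r.mean()
        phi_new = (r[:-1] @ r[1:]) / (r @ r)
        converged = abs(phi_new - phi) < tol
        phi = phi_new
        if converged:  # Step 4: stop once phi no longer changes.
            break
    return beta, phi

# Synthetic example: boxcar design with AR(1) noise (true phi = 0.5).
rng = np.random.default_rng(1)
n, true_phi = 600, 0.5
boxcar = (np.arange(n) // 20) % 2  # 20 scans off, 20 scans on
X = np.column_stack([np.ones(n), boxcar])
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = true_phi * noise[t - 1] + rng.normal(scale=0.5)
y = X @ np.array([1.0, 2.0]) + noise

beta_hat, phi_hat = fit_glm_ar1(X, y)
print("beta_hat:", np.round(beta_hat, 3), "phi_hat:", round(float(phi_hat), 3))
```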

So incorporating the autocorrelation makes things a little bit more

complicated than when we assumed the data were IID, like we did in the previous module.

So how do we estimate in an AR model?

Well in the AR model, there's a very simple method-of-moments style estimator,

called the Yule-Walker estimates.

So basically if we have this AR(1) model,

there is a simple closed-form solution for estimating phi and sigma squared,

and they're given by the following which was based on the autocovariance function.

So we just simply have to calculate the autocovariance function at lag 0 and

1 and plug this in in order to estimate the components of the AR model.
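
The formulas on the slide are not in the transcript. Assuming the usual AR(1) Yule-Walker form, phi hat = gamma(1) / gamma(0) and sigma squared hat = gamma(0) * (1 - phi hat squared), where gamma(h) is the sample autocovariance at lag h, a sketch in NumPy is shown below. The function name `yule_walker_ar1` and the simulated series are illustrative.

```python
import numpy as np

def yule_walker_ar1(series):
    """Method-of-moments AR(1) estimates from lag-0 and lag-1 autocovariances."""
    r = np.asarray(series, dtype=float)
    r = r - r.mean()
    n = r.size
    gamma0 = (r @ r) / n            # autocovariance at lag 0
    gamma1 = (r[:-1] @ r[1:]) / n   # autocovariance at lag 1
    phi_hat = gamma1 / gamma0
    sigma2_hat = gamma0 * (1.0 - phi_hat**2)
    return phi_hat, sigma2_hat

# Check on a simulated AR(1) series with phi = 0.7 and sigma^2 = 1.
rng = np.random.default_rng(0)
n = 20000
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.7 * eps[t - 1] + rng.normal()

phi_hat, sigma2_hat = yule_walker_ar1(eps)
print(f"phi_hat = {phi_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}")
```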

So this is a sort of simple way to do it.

It's fast and it's very useful in fMRI data analysis.

The maximum likelihood type methods are a little bit more

computationally cumbersome.

And they are obtained by maximizing the log-likelihood, and

this is an example of the log-likelihood.

And so we just simply have to estimate the parameters associated with V by maximizing

this function, and there's many different types of maximization methods you can

use to estimate this.
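
The slide's expression is not in the transcript. For normally distributed noise, the Gaussian log-likelihood has the standard form below, where theta is assumed to denote the parameters of V (for AR(1), phi and sigma squared) and n is the number of time points.

```latex
\ell(\beta, \theta \mid Y) =
  -\frac{n}{2}\log(2\pi)
  - \frac{1}{2}\log\lvert V(\theta)\rvert
  - \frac{1}{2}\,(Y - X\beta)^{\top} V(\theta)^{-1} (Y - X\beta)
```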

Restricted maximum likelihood is similar to maximum likelihood,

it's just that we simply have an additional ReML term, or

restricted maximum likelihood term, in the function.

So basically, it's the same sort of procedure, we need to maximize this.

We need to find the value of V that makes this term as big as possible.
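
As a sketch of what that extra term looks like, the standard restricted log-likelihood for a Gaussian linear model is given below. This is the textbook form, assumed rather than taken from the slide; theta denotes the parameters of V, and beta hat is the generalized least-squares estimate for that V.

```latex
\ell_{R}(\theta \mid Y) =
  -\frac{1}{2}\log\lvert V(\theta)\rvert
  - \frac{1}{2}\log\lvert X^{\top} V(\theta)^{-1} X\rvert
  - \frac{1}{2}\,(Y - X\hat{\beta})^{\top} V(\theta)^{-1} (Y - X\hat{\beta})
  + \text{const},
\qquad
\hat{\beta} = (X^{\top} V^{-1} X)^{-1} X^{\top} V^{-1} Y
```

Compared with the ordinary log-likelihood, the additional piece is the term involving the log determinant of X' V^-1 X.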

So what's the difference between maximum likelihood and

restricted maximum likelihood?

Well, maximum likelihood maximizes the likelihood of the data, Y, and

is typically used to estimate mean parameters, such as beta.

However, it can produce biased estimates of the variance.

So in the ANOVA setting the variance estimate

would be 1 over n times the sum of squared differences from the mean.

Restricted maximum likelihood, on the other hand, maximizes the likelihood

of the residuals, so, this can be used to estimate variance parameters.

And they're very useful because they provide unbiased

estimates of the variance parameters.

So, if you see, for example, in the ANOVA case, what we would have is,

instead of multiplying by 1 over n, we would multiply by 1 over n-1.

And this provides an unbiased estimate of the variance components.
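
The 1/n versus 1/(n-1) difference is easy to see by simulation in the simplest setting: many small samples from one normal distribution. A sketch assuming NumPy, with the sample size, true variance, and number of repetitions chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, true_var = 5, 4.0
# 200,000 independent samples of size n from N(0, true_var).
samples = rng.normal(0.0, np.sqrt(true_var), size=(200_000, n))

# ddof=0 divides by n (the ML-style estimate); ddof=1 divides by n - 1.
ml_var = samples.var(axis=1, ddof=0).mean()
reml_var = samples.var(axis=1, ddof=1).mean()

print(f"average 1/n estimate:     {ml_var:.3f}")
print(f"average 1/(n-1) estimate: {reml_var:.3f}")
```

On average the 1/n estimate comes out near (n-1)/n times the true variance, here about 3.2, while the 1/(n-1) estimate comes out near the true value of 4.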

So, that's one of the benefits of restricted maximum likelihood.

So, many people tend to like to use restricted maximum likelihood because

it tends to give unbiased estimates of the variance components.

So further, if we were to fit such an AR model across the whole brain,

in this case I'm fitting an AR(2) model, so I have two phis.

Phi 1 is for the point that's one lag removed and phi 2 is for two lags removed.

But otherwise it's similar to the AR(1) model that we talked about.

But the reason I'm showing this slide is that the parameters of the model,

phi 1 and phi 2 and sigma, actually vary across the brain.

So this implies that the same autocorrelation model is not going to

hold true across the entire brain.

So for example, you'll see that the autocorrelation seems to vary

depending on, say, tissue type and where in the brain you're located.

So therefore we have to kind of fit a separate autocorrelation function for

every voxel of the brain,

and this can become quite computationally cumbersome in the long run.
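
For the simple moment-based estimator, at least, the per-voxel fit can be vectorized. The sketch below assumes NumPy and an AR(1) model rather than the AR(2) model on the slide; the residual array layout (time points by voxels), the name `voxelwise_ar1`, and the three synthetic "voxels" are illustrative, and the series is much longer than a typical fMRI run so that the estimates are stable.

```python
import numpy as np

def voxelwise_ar1(resid):
    """Lag-1 autocorrelation per voxel; resid has shape (n_timepoints, n_voxels)."""
    r = resid - resid.mean(axis=0, keepdims=True)
    gamma0 = np.sum(r * r, axis=0)
    gamma1 = np.sum(r[:-1] * r[1:], axis=0)
    return gamma1 / gamma0

# Three synthetic voxels with different amounts of autocorrelation.
rng = np.random.default_rng(2)
true_phi = np.array([0.2, 0.5, 0.8])
n = 5000
resid = np.zeros((n, true_phi.size))
for t in range(1, n):
    resid[t] = true_phi * resid[t - 1] + rng.normal(size=true_phi.size)

phi_map = voxelwise_ar1(resid)
print(np.round(phi_map, 3))
```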

And it also complicates analysis somewhat.

Okay, so this is the end of this module.

We've talked about different noise models and

how they can be incorporated into our GLM estimate.

Okay, in the next module we'll talk a little bit about performing inference

with the GLM model.

I'll see you then, bye.
