Skip to content

Finding an optimal mixed effects model

parkjsh edited this page Feb 16, 2024 · 12 revisions

After processing your MRI data you may be wondering to yourself, what is the most appropriate statistical model to explain the variance in my data?

Here I will go through the steps necessary to compare models based on best fit of the data. For example, if you have longitudinal data and are unsure how to fit the effect of time, you may want to compare a linear to a quadratic fit (old example). Alternatively, you could compare two models with varying complexity, as in the first example. The preferred method implements the AIC as described first, but the older method is included at the bottom as well.

AIC

Here we compare one model that interacts timepoint (age) and treatment including sex as a covariate with a second model that interacts age, sex, and treatment. While this example is written for mincLmer, it applies mostly the same for vertexLmer or anatLmer.

You will need to load the following libraries in R:

> library(RMINC)
> library(lme4)

First define a function to calculate AIC:

AIC_summary <-
  function(mmod)
    c(RMINC:::fixef_summary(mmod), AIC = extractAIC(mmod)[2])

Then, run both of the models you are interested in comparing.

mlm <-
  mincLmer(jacobians ~ Treatment * age + Sex + (1|ID)
          , data = data
          , REML = F
          , summary_type = AIC_summary)
mlm2 <-
  mincLmer(jacobians ~ Treatment * age * Sex + (1|ID)
          , data = data
          , REML = F
          , summary_type = AIC_summary)

When they have run completely, write the AIC columns to minc files thus:

mincWriteVolume(mlm, “model1_AIC.mnc”, column =AIC’)
mincWriteVolume(mlm2, “model2_AIC.mnc”, column =AIC’)

Then, open a terminal and load minc-toolkit

In your terminal, type:

minccalc -expression 'A[0]<A[1]?1:2' model1_AIC.mnc model2_AIC.mnc winnermap.mnc

In this expression, A[0] is the first .mnc file you enter (model1_AIC.mnc). A[1] is the second (model2_AIC.mnc). This expression will compare A[0] to A[1] at each voxel and create an output mnc (winnermap.mnc) showing at each voxel which model is better. For AIC, a smaller value indicates a better model. If A[0] is less (better) than A[1], that voxel receives a value of 1. If not, that voxel receives a value of 2. Then, we can Display winnermap.

Display winnermap.mnc

winnermap

Areas that are black indicate model1 is a better fit. Areas that are white indicate model2 is a better fit.

In some cases it may be obvious which model is better, but in this example both maps account for a fairly equal amount of the image. Two things can help you decide which model to end up using. First: pay attention to areas that may be subject to motion artefacts, for example dorsal, posterior areas. Second: check if one model is more symmetrical. In this case, I chose model 1.

Running AIC on linear models

If you are simply working with a linear model, and not a linear mixed effects model, you can do the following: First run your two models and store them as model1 and model 2:

model1 = mincLm(rel_jac_overall ~ new_groups*sex,filtered.data, mask=votedmask)
model2 = mincLm(rel_jac_overall ~ new_groups+sex,filtered.data, mask=votedmask)

Then you can run compare_models to compare the two models using AIC:

model_comparison = compare_models(model1, model2, metric=AICc) The output looks as follows, where the first column is the AIC for the first model (for each voxel), and the second column corresponds to the second model:

model_comparison
model_comparison matrix, showing  6  of  24749075  rows, and  2  of  2 columns
print more by supplying n and width to the print function
showing models:
showing models:
1: rel_jac_overall ~ new_groups * sex
2: rel_jac_overall ~ new_groups + sex
         [,1]     [,2]
[1,] 6.189573 8.285714
[2,] 6.189573 8.285714
[3,] 6.189573 8.285714
[4,] 6.189573 8.285714
[5,] 6.189573 8.285714
[6,] 6.189573 8.285714

If you want the AIC map for each model you can store model_comparison[,1] or model_comparison[,2] as minc files: mincWriteVolume(buffer=model_comparison[,1], output.filename = "AIC_model1.mnc", like.filename = "your_nlin3_average.mnc")

Alternatively, if you want a map as the one described for the lmer method, you can run the following line:

winner = apply(compare_models(model1,model2), 1, which.min)

As above, areas in black will be best explained by model 1, and areas in white are best explained by model 2.

Displaying AIC results for a region of interest

If your AIC winnermap looks pretty evenly split between your tested models, it might be challenging to see where exactly your brain region of interest is to decide on your winning model. In this example, I'm interested in choosing a model for the hippocampus only.

Step 1: Select your area of interest using the DSURQE label csv file

Open DSURQE_40micron_R_mapping.csv and make note of the right and left label numbers in the region. For example, I took the labels highlighted in the image below: image

Step 2: make a lookup table with the chosen labels on one column

Label each row as 1 on a separate column to note that these are the regions that are significant in your mask.

image

Step 3: Run minclookup

This command performs a lookup table operation on each voxel of a minc file. Here, I set my input lookup table as the csv file we just created above. This will point the command to look for the voxels specified in the table within the input file. For example:

minclookup -discrete -lookup_table hc_lookup.csv labels_on_average.mnc hc_label_on_average.mnc

I used the option -discrete to specify that my lookup table is discrete. This will make sure that each input value is treated as an integer, and if it is found in the lookup table, then the corresponding value is written to the output file. If it is not found, then a null value is written out (zero by default). For example:

mincmath -mult hc_label_on_average.mnc winnermap_linearvspoly2.mnc hc_winnermap_linearvspoly2.mnc

Final hippocampus label file will look something like this:

image

Step 4: run mincmath

Use this command to multiply each value of each voxel in the lable file we just created with the values in your winnermap. This way, you're basically croppping out all the AIC labels that aren't part of the hippocampus.

mincmath -mult hc_label_on_average.mnc winnermap_linearvspoly2.mnc hc_winnermap_linearvspoly2.mnc

Final output winnermap will look like this, where gray regions indicate model 1 and white regions indicate model 2.

image

Option 2: Running AIC with a different mask

What you can also do instead of making a histogram is just re-running your AIC with a mask that only covers your area of interest, not the whole brain. To do this, you would still need to use minclookup to create your region mask. Once you have your region mask, run AIC and make the winnermap as described above under "AIC". The final winnermap would look like this for the hippocampus:

image

To make your life easier, you can display your winnermap with the region label file so that you can easily tell whether the white regions are those outside of the hippocampus or regions within the hippocampus where the second model won.

image

Old Comparison method (still valid)

You will need to load the following libraries in R:

> library(RMINC)
> library(lme4)

Once you've set up your data frames, you will run your first, most simple model. Here I am modelling a linear age term using a polynomial. When comparing models with different fixed effects, as we will in this example, you should set REML=FALSE. You would use REML=TRUE if you are comparing models with different random effects, or when performing your actual statistical analysis.

mod_lin = mincLmer(abs_jac ~ poly(age,1) + (1|mouse_id), mask=mask, REML=FALSE)

Next, you will estimate your degrees of freedom, and perform a False Discovery Rate (FDR) correction on your model: fdr_mod_lin = mincFDR(mincLmerEstimateDF(mod_lin))

Next, you will run your more complex model. In this case, it I am fitting a quadratic age term followed by the degrees of freedom estimation and FDR correction:

mod_quad = mincLmer(abs_jac ~ poly(age,2) + (1|mouse_id), mask=mask, REML=FALSE) fdr_mod_quad = mincFDR(mincLmerEstimateDF(mod_quad))

Now you are ready to compare models. We will use the log likelihood ratio test. The simpler model goes first, the more complex model goes second:

linvsquad = mincFDR(mincLogLikRatio(mod_lin,mod_quad))

Here is an example result from this comparison. In this case, the quadratic model is significantly better at explaining the variance in my data at q=0.01:

Degrees of Freedom: 3 
FDR Thresholds:
       mod_quad
0.01 11.955936
0.05  8.221689
0.1   6.567381
0.15  5.577953
0.2   4.862967

Next, you have to visualize where in the brain your new model best fits your data. To do so, you will output your model comparison results as a minc file: mincWriteVolume(buffer = linvsquad, column = "mod_quad", output.filename = "model_comp_lin_vs_quad.mnc", like.filename = "second_level-nlin-3.mnc")

Finally, you will use Display to overlay your model_comp_lin_vs_quad.mnc file over your nlin-3.mnc average.

cohort4

Here I have set my q-value threshold to 0.05. Areas in white/color are areas in which the more complex model (i.e. quadratic) better fits the data.

Clone this wiki locally