Fitting a mini-meta-analysis-like three-level model

Dear experts,

I am trying to use brms to fit a three-level model to combine results from different experiments. Essentially, I hope to model the data like a mini-meta-analysis, i.e., a meta-analysis of my own experiments.

I had three experiments, with sample sizes varying from 20 to around 50. All three experiments shared the same 2 × 2 × 3 within-subject design; the dependent variables are reaction times and d′ from signal detection theory.

I tried to model the reaction time data with a shifted lognormal distribution, as this blogpost suggested. However, because that blogpost only provided an example script for two-level data, I also read this tutorial and learnt from it.

My code is below:

m1 <- df_test %>%
  dplyr::mutate(RT_sec = RT / 1000) %>%  # convert RT from ms to seconds
  dplyr::filter(ACC == 1) %>%            # keep only correct trials
  brms::brm(RT_sec ~ ismatch * Identity * Valence +
              (ismatch * Identity * Valence | ExpID) +
              (ismatch * Identity * Valence | ExpID:Subject),
            family = shifted_lognormal(),
            data = .,
            chains = 4,
            control = list(adapt_delta = .95),
            iter = 4000,
            thin = 2,
            cores = parallel::detectCores(),
            backend = 'cmdstanr')

Here, ExpID is the ID of my experiments and Subject is the ID of participants; ismatch, Identity, and Valence are the columns for the three independent variables.

Here is the test dataset: df_test.csv (1.3 MB)

After fitting the model (which took a long time), there were warnings:

Warning messages:
1: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors. 
2: There were 59 divergent transitions after warmup. Increasing adapt_delta above 0.95 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup

Also, the posteriors of the population-level parameters are more dispersed than those of the individual experiments (see below). This is odd because, if the three-level model worked like a mini-meta-analysis, the population-level parameters would be estimated from the overall sample size and should therefore be more precise than those of the individual experiments.

I suspect I used a wrong model, which causes both the divergence problem and the dispersed posteriors of the population-level parameters.

One potential reason is that the distribution for the third level (experiment level) should be different from that for the second level (individual level), but I do not know how to specify different distributions for different levels.

  • Operating System: Ubuntu 20.04.2 LTS (x86_64, linux-gnu); R version 4.0.4 (2021-02-15); RStudio; language: EN

  • brms Version: 2.14.4

Looking forward to suggestions.

Thanks in advance.


Disclaimer: I never got a 3-level model to sample properly on my data, so I am far from an expert.

I just think that 3-level models are hard for the MCMC algorithm. Since your chains didn’t converge, something definitely went wrong. I think the first step should be to introduce some (narrower) priors and see if things improve.
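For example, something along these lines (an untested sketch; the exact values are placeholders that you’d want to justify, e.g. via prior predictive checks):

priors <- c(
  brms::set_prior("normal(0, 0.5)", class = "b"),   # population-level effects
  brms::set_prior("normal(0, 0.3)", class = "sd"),  # group-level SDs
  brms::set_prior("lkj(2)", class = "cor")          # group-level correlations
)
# then pass prior = priors to brms::brm()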

I don’t think so. Your model assumes that the experiment-level parameters are sampled from a population. I can be very sure about the single instance that I looked at, but very unsure about what it tells me about the distribution it was sampled from. Since you only sampled from that distribution 3 times (3 experiments), high uncertainty about the underlying distribution is to be expected.
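As a toy illustration with made-up numbers: even if each experiment’s effect were known exactly, three values pin down the population distribution only loosely:

exp_effects <- c(0.12, 0.25, 0.18)           # three hypothetical experiment-level effects
mean(exp_effects)                            # estimate of the population mean
sd(exp_effects) / sqrt(length(exp_effects))  # its standard error, based on only n = 3 draws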

I don’t know what you mean here, can you explain?


Hi, @BrittaL

Thanks a lot for your response!

Good point, I agree.

It’s a bit tricky for me, but I will try my best to explain. Maybe I am not using the standard Bayesian terms, so please bear with me.

Let me be clear about the levels I am referring to here:

  • Level 1: trials (each response subjects made)
  • Level 2: subjects
  • Level 3: Experiments
  • Population

If we only have one subject’s data and assume the shifted lognormal is the right distribution for the RTs, then we can estimate the three parameters of the shifted_lognormal: ndt, mu, and sigma.

Usually, we model data from multiple subjects in one experiment, so we extend the model to a two-level model and add subject as a varying/random effect (both intercepts and slopes in my case).
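In brms syntax, that two-level model would look roughly like this (a sketch; df_one_exp is a hypothetical subset of my data containing a single experiment):

# Two-level sketch: trials (level 1) nested in subjects (level 2)
m_two_level <- brms::brm(
  RT_sec ~ ismatch * Identity * Valence +
    (ismatch * Identity * Valence | Subject),  # varying intercepts and slopes
  family = shifted_lognormal(),
  data = df_one_exp
)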

When I hope to model multiple experiments together, we have an additional level. The population-level parameters are then the parameters governing the two-level models’ parameters (i.e., ndt, mu, and sigma). In this case, it seems better to use a distribution other than the shifted lognormal, because it is not reasonable to assume that ndt, mu, and sigma are themselves lognormally distributed.

Does it make sense?


Ah, I understand what you mean now. In brms, the group-level (“random”) effects are assumed to be normally distributed around the common mean, no matter what distribution you assume for your data (see e.g. this paper, page 3: brms: An R Package for Bayesian Multilevel Models Using Stan | Journal of Statistical Software).
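Concretely, for an experiment-level effect this means something like (a sketch of the standard formulation, with $j$ indexing experiments):

$$\theta_j = \beta + u_j, \qquad u_j \sim \mathcal{N}(0, \sigma_{\mathrm{ExpID}}^2)$$

i.e., the deviations $u_j$ are normal on the scale of the linear predictor, even though the likelihood for the RTs is shifted lognormal.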

Your model is very complex (lots of interactions, and random effects for ALL of them), so I am not surprised that the MCMC sampling algorithm has problems. I think reducing your model and/or specifying more restrictive priors is probably the way to go.


Thanks for pointing this out!

Will try this out and see how it goes. Will update this post when I get new results.


Hi, @BrittaL

I tried a simpler model and longer chains (see below), but I still have divergence issues. I think the model cannot be made any simpler, because I also wish to get the posterior for each experiment.

Do you have any suggestions?

Thanks a lot.

m2 <- df_test %>%
  dplyr::mutate(RT_sec = RT / 1000) %>%  # convert RT from ms to seconds
  dplyr::filter(ACC == 1) %>%            # keep only correct trials
  brms::brm(RT_sec ~ ismatch * Identity * Valence +
              (ismatch * Identity * Valence | ExpID) +
              (1 | ExpID:Subject),
            family = shifted_lognormal(),
            data = .,
            chains = 4,
            control = list(adapt_delta = .98),
            iter = 8000,
            thin = 2,
            cores = parallel::detectCores(),
            threads = threading(2),
            backend = 'cmdstanr')

My suggestion is this:


Hi both,

I agree with @BrittaL that you may try some narrower or weakly informative priors first. If I understand correctly, the shifted_lognormal distribution plays a role similar to the ex-Gaussian distribution often used for RTs (though the two are not the same distribution). I have not used it in a Bayesian setting before, but I think you may run some prior predictive checks to get reasonable priors.
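Something like this might work (an untested sketch; the priors are placeholders, and I assume RT_sec and the correct-trial filter have been prepared as in your earlier code):

# Prior predictive check: sample from the priors only, then inspect simulated RTs
m_prior <- brms::brm(
  RT_sec ~ ismatch * Identity * Valence +
    (ismatch * Identity * Valence | ExpID) +
    (1 | ExpID:Subject),
  family = shifted_lognormal(),
  data = df_test,
  prior = c(
    brms::set_prior("normal(0, 0.5)", class = "b"),
    brms::set_prior("normal(0, 0.3)", class = "sd")
  ),
  sample_prior = "only",  # ignore the likelihood, draw from the priors
  chains = 2, iter = 1000
)
brms::pp_check(m_prior)   # do the simulated RTs look plausible?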

If the narrower priors fail to help, I suggest including ExpID as a control variable instead of an additional level; see the sketch below. This is another way to estimate the effects shared by the three experiments. I guess it is more reasonable to treat ExpID as an additional level when there are more experiments (say, more than 5). Hope it makes sense.
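For example (a sketch based on your m2 above, assuming the same preprocessing):

# ExpID as a population-level covariate rather than a grouping factor
m3 <- brms::brm(
  RT_sec ~ ismatch * Identity * Valence + ExpID +
    (1 | ExpID:Subject),  # subjects still nested within experiments
  family = shifted_lognormal(),
  data = df_test
)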
