Brms meta-analysis - how to include standard errors of zero?

Hi everyone,

I am building an additive model that incorporates standard errors (SE) and regression weights, which are the sample sizes. The data I am using come from the literature, so the approach is meta-analytic. Some authors published their raw data, in other words, individual observations with sample size = 1 and standard error = 0. Other authors reported an average +/- SE with sample size > 1.

The problem is that the family of distributions I am using is gaussian with a log link, and an error arises when I input SEs of 0. Here is the structure of my model and the output when I try to run it:

fit1 <- brm(response|weights(Weight) + se(SE) ~ X1 + X2 + X3..., family = gaussian(link="log"), data = Data)

Output:

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Scale parameter is 0, but must be > 0!  (in 'model125839f2ad0_c2a2bf402f16cab2ec43c0c861aee5d3' at line 40)

I am wondering if there is some way to combine averages and individual samples in brms, or if anyone has suggestions on how to move forward?

Thank you for your time!


The data looks something like this, hypothetically:

Response         SE          Weight
10                0             1
6                 0             1
15                7             5
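
For reference, here is a minimal sketch in R of this hypothetical data and the call above (the single predictor X1 is made up purely to illustrate the syntax):

library(brms)

# Hypothetical data: individual observations (SE = 0, Weight = 1) mixed
# with a study-level average (SE > 0, Weight = sample size)
Data <- data.frame(
  Response = c(10, 6, 15),
  SE       = c(0, 0, 7),
  Weight   = c(1, 1, 5),
  X1       = c(0.2, 0.5, 0.8)  # made-up predictor, for illustration only
)

# With se(SE) alone, the rows with SE = 0 give the normal likelihood a scale
# of exactly zero, which Stan rejects - hence the error message above.
fit1 <- brm(
  Response | weights(Weight) + se(SE) ~ X1,
  family = gaussian(link = "log"),
  data   = Data
)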

Unfortunately, I don’t think combining individual and study-level data can be achieved with brms right now - at least not easily. I think you might be able to use the subset addition term to provide a different formula (without se) for the individual observations. However, that would IMHO make the model coefficients differ between the two formulas. You might be able to work around this by rewriting your model via the non-linear syntax (Estimating Non-Linear Models with brms • brms), which lets you have the same parameter in multiple formulas, but that makes the whole thing much more complex. At this point, you are somewhat fighting against brms, and I am not sure that implementing the model directly in Stan wouldn’t be easier.
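
For concreteness, a rough and untested sketch of how the subset idea might look is below. The indicator columns is_individual and is_study and the duplicated response columns resp_ind and resp_agg are hypothetical, and note that the coefficients of the two sub-models are not shared:

library(brms)

# Hypothetical indicators marking which rows belong to which sub-model;
# resp_ind and resp_agg are copies of the response used only in those rows.
f_ind <- bf(resp_ind | subset(is_individual) + weights(Weight) ~ X1 + X2 + X3)
f_agg <- bf(resp_agg | subset(is_study) + weights(Weight) + se(SE) ~ X1 + X2 + X3)

fit <- brm(
  f_ind + f_agg + set_rescor(FALSE),
  family = gaussian(link = "log"),
  data   = Data
)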

I’ll tag @paul.buerkner for a second opinion (as he’s also AFAIK working on meta-analyses in brms).

Best of luck with your project!

Thank you for your input, Mr. Modrak! Unless Mr. Buerkner has other suggestions, we might end up doing it in a roundabout way similar to what you proposed, or averaging the individual observations while keeping the sample sizes of those averages as small as possible.

fit1 <- brm(response|weights(Weight) + se(SE, sigma = TRUE) ~ X1 + X2 + X3..., family = gaussian(link="log"), data = Data)

This could work. That way the scale will not be zero, as you still have a residual standard deviation parameter (which you would probably want anyway). Note that, in classical meta-analysis with one data row (effect size) per study, sigma = TRUE would replace the observation-level random effect, as both measure the residual SD and would not be jointly identified.
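
To illustrate that last point, here is a sketch of the two usual ways of adding between-study heterogeneity when there is one row per study (yi, sei and study are generic column names):

library(brms)

# (a) known SEs plus a study-level (here: observation-level) random effect
f_ranef <- bf(yi | se(sei) ~ 1 + (1 | study))

# (b) known SEs plus an estimated residual SD
f_sigma <- bf(yi | se(sei, sigma = TRUE) ~ 1)

# With one row per study, combining (1 | study) and sigma = TRUE would give
# two parameters describing the same residual variation, so they would not
# be jointly identified.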


Thank you for your response, Dr. Buerkner. I tried the code above and obtained the following output:
“The model does not contain posterior samples.”
How can I resolve this issue?

This does not really tell us much. Can you try with one chain and show all the messages that (r)stan prints out?

Hi Dr. Buerkner, here is the format of the code I ran:

fit1 <- brm(response | weights(Weight) + se(SE, sigma = TRUE) ~
              X1 + X2 + X3 + ...,
            family = gaussian(link = "log"),
            chains = 1, iter = 4000, warmup = 500, data = Data,
            control = list(adapt_delta = 0.99, max_treedepth = 17))

And the output I got was:

Warning messages:
1: There were 500 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 17. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded 
2: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low 
3: Examine the pairs() plot to diagnose sampling problems
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

Seems like it should work if I just increase max_treedepth? This output is different from last time - previously, when I ran with chains = 4, I didn’t get any warning messages at all but also no posterior samples were drawn. I think the model might be okay now!

Good news, Dr. Buerkner: I re-ran with max_treedepth = 18 and obtained no warning messages. The model also sampled successfully and all Rhats were 1.00. My remaining question, then, is why the model did not work with chains = 4.

I tried running using chains = 3 and got this warning message:

Warning messages:
1: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low 
2: Examine the pairs() plot to diagnose sampling problems

But besides this, it seems the model worked. The next step is to use this code on my real data, and I hope it goes well. For now, I think this issue is resolved! Thank you again for your help.

You should still look at Rhat and ESS etc. and perhaps trace plots (via plot()) or similar but it seems the model at least runs now :-)

Hi Dr. Buerkner, I just want to confirm one more thing: setting sigma = TRUE does not replace the SEs we give the model? In other words, it is there to help the model run, but it does not replace the information we give the model? @paul.buerkner

It just estimates a residual standard deviation parameter in addition to the known SEs, which is something you would want in your model anyway.
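
If you want to see exactly how the known SEs and sigma are combined, you can inspect the Stan code that brms generates for your formula, e.g.:

library(brms)

# Prints the generated Stan model; look at how the scale of the normal
# likelihood is built from the known SEs and the estimated sigma.
scode <- make_stancode(
  response | weights(Weight) + se(SE, sigma = TRUE) ~ X1 + X2 + X3,
  family = gaussian(link = "log"),
  data   = Data
)
cat(scode)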

Great, thank you!

I have what I think is a parallel problem, one that is very prevalent in biomedical research.

I want to analyse a multi-level data-set that includes both nested and crossed random effects. The data are from an experiment to measure the intensity of a marker in individual living cells. We observe cells in “subplots” (microscope fields, where each field may show several cells), which are samples of plots (culture dishes where the cells grow, which usually contain hundreds of cells). We measure the intensity of the marker (in individual cells, nested in subplots, within plots) at different times. The cells are from different people (blocks) and we sometimes measure people’s cells at different times - but there is only partial overlap between people and times. The cells that we measure for each dish and person are not the same from time to time.

It’s a complicated and unbalanced design - but it reflects the realities of when cells become available and how many of them grow at different times in different culture dishes and different microscope fields (which we do not control). As I say, complex data of this kind are very common in biomedical research. We ignore differences between microscope fields (subplots) for now.

There are very many cells in the total study (~200k), and some individual microscope fields have no cells, so estimating the full model for all the data with the zero-inflated negative binomial family is practically impossible: it takes several hours to become reasonably stable using variational Bayes, and even when I use the estimates from the VB output as starting values for estimation using NUTS, the warm-up phase ran for 72 hours without reporting any progress at all.

Since I cannot analyse the raw data, I wondered about using a meta-analysis approach, in line with the method suggested earlier in this thread. Specifically, I obtained the mean and standard deviation of the intensity in each culture dish at each time, along with the number of cells for which we measured the intensity. So, after taking the mean intensity per cell and its standard deviation for each dish, the data look something like this:-

Person  culture_dish  time  Total_N_of_cells  mean_intensity_per_cell  sd_of_intensity_per_cell
1       1             1     123               3.5                      1.1
1       2             2     322               4.2                      1.5
2       1             2     126               2.1                      0.6
2       2             1     139               1.4                      0.4
etc. etc. (there are roughly 200 cells per person - so the total data-set has about 1000 observations).

I want to estimate the mean_intensity_per_cell for each person, taking account - as far as possible - of the variation between culture dishes and times. I take it that the problem shows formal similarity to Xiazhu111’s question. That is, it seems reasonable (to me) to conceptualise the design as a meta-analysis, where each person constitutes an ‘experiment’. The main difference from XiaZhu111’s original post is that there are no culture dishes with no cells, so every culture dish has a real/positive standard deviation for the mean_intensity_per_cell. Given that there are no standard deviations of zero, and following Paul Buerkner’s post (above), I wrote the following brmsformula to separate the effects of individual people from the other design factors:-

mean_intensity_per_cell | se(sd_of_intensity_per_cell/sqrt(Total_N_of_cells+1)) + weights(Total_N_of_cells) ~ 1 + (1|Person) + (1|culture_dish) + (1|time), family = gaussian(link = 'log')

(I use family = gaussian(link = 'log') for 3 reasons: (1) se() is not available with family = lognormal; (2) the distribution of mean_intensity_per_cell is actually right-skewed; (3) Paul Buerkner used it in his previous post).
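
To make the intended model concrete, below is a sketch with made-up numbers, where I pre-compute the standard error of each dish mean as a column rather than writing the expression inside se() (the values are only illustrative):

library(brms)

# Made-up values, only to make the data structure and formula concrete
d <- data.frame(
  Person                   = c(1, 1, 2, 2),
  culture_dish             = c(1, 2, 1, 2),
  time                     = c(1, 2, 2, 1),
  Total_N_of_cells         = c(123, 322, 126, 139),
  mean_intensity_per_cell  = c(3.5, 4.2, 2.1, 1.4),
  sd_of_intensity_per_cell = c(1.1, 1.5, 0.6, 0.4)
)

# Standard error of each dish mean, as in the formula above
d$se_of_mean <- d$sd_of_intensity_per_cell / sqrt(d$Total_N_of_cells + 1)

fit <- brm(
  mean_intensity_per_cell | se(se_of_mean) + weights(Total_N_of_cells) ~
    1 + (1 | Person) + (1 | culture_dish) + (1 | time),
  family = gaussian(link = "log"),
  data   = d
)
# se(se_of_mean, sigma = TRUE) would additionally estimate a residual SD,
# as suggested earlier in this thread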

So, now I would like to ask:-

  1. Does this formula seem appropriate to take into account both the variation in the mean_intensity_per_cell and the Total_N_of_cells between culture dishes?
  2. Is it reasonable to use family = gaussian(link = 'log') to account for the skewness of the data (mean_intensity_per_cell), or would it be better to take log(mean_intensity_per_cell) and use family = gaussian, or family = student?
  3. If I use log(mean_intensity_per_cell), then should I also log-transform its standard error and the Total_N_of_cells?

I think it likely that there are many biomedical researchers whose data are similar to those that I describe here. Therefore, figuring out the best way to analyse these complex, unbalanced and numerous data (initial N~200k, or more) could be very helpful to many other researchers.

With many thanks, in anticipation of your help