brms marginal_effects plots not reflecting model estimates

Hi All,

I’m hoping that somebody can help me with an issue. I’m running a lognormal model in brms to measure the population level effects of covariates on my response variable, which is ‘detection delay’. Among the covariates in the model is ‘endemicity’, which refers to one of two datasets that I’m trying to compare through the model estimates (low vs. high).

The estimates of the posterior mean and diagnostics all check out, and I exponentiate the estimates to give the relative differences for each covariate. However, when I generate two- and three-way effect plots using the marginal_effects function, the output doesn’t match the model estimates (see estimates and plots below): the detection delay estimates in the ‘high endemicity’ dataset are lower than in the ‘low endemicity’ dataset, which is the opposite of what I would expect from the model estimates. This is using the default method, posterior_epred, but is also the case with method = posterior_linpred.

Interestingly, I have only experienced this issue since incorporating heterogeneous variances for the two datasets in the model, i.e. including sigma ~ 0 + endemicity. Without specifying sigma, the differences in expected mean response values of the PPD in the effect plots reflect the model estimates, with ‘high endemicity’ showing higher detection delay overall regardless of the other covariates shown in the effect plot.

Could anyone help me explain why this is happening?

@paul.buerkner @LucC

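library(brms)

# Assumes the response column is named detection_delay (R names cannot contain spaces)
mod_brmslognorm_final <- brm(
  bf(detection_delay ~ age + endemicity + sex,
     sigma ~ 0 + endemicity),  # one sigma per endemicity level (modelled on the log scale)
  data = mydata_full,
  family = "lognormal",
  prior = c(set_prior("normal(0,1e+06)", class = "b")),
  iter = 10000,
  warmup = 1000,
  init = 0
)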

Operating System: Windows 10 Pro
brms Version: 2.17.0

Hi @TomH,

Recap of the live discussion we had about this:

The exponentiated coefficients of a log-normal regression model (with the linear predictor on the log scale) represent the expected relative change in the outcome for every unit change in a predictor. For this type of model, the expectation, when based on the model coefficients only, is the geometric mean. In other words, your model indicates that the expected detection delay, in terms of the geometric mean, is higher in highly endemic areas.

We hypothesised that the marginal effects plot is plotting the expectation of the log-normal distribution in terms of the arithmetic mean, which is \exp(\log(\mu_{geom}) + \sigma^2 / 2). Because \sigma^2 varies by value of the endemicity predictor, we thought that the marginal effects plot took this into account when plotting the expectations. But then the plot on the log scale (“method = posterior_linpred”) should presumably not account for this. Yet the latter plot also showed this unexpected opposite pattern compared to the model coefficients.
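
To make the hypothesis concrete, here is a short worked derivation of when the arithmetic means could be ordered the opposite way from the geometric means. The symbols \mu_g and \sigma_g (log-scale location and scale for endemicity group g, with all other covariates held fixed) are notation introduced here for illustration, not something taken from the fitted model.

```latex
\begin{align*}
\log Y \mid g &\sim \mathcal{N}(\mu_g, \sigma_g^2), \qquad g \in \{\text{low}, \text{high}\} \\
\text{geometric mean:}\quad \mu_{geom,g} &= \exp(\mu_g) \\
\text{arithmetic mean:}\quad \mathbb{E}[Y \mid g] &= \exp\!\left(\mu_g + \sigma_g^2 / 2\right) \\
\frac{\mathbb{E}[Y \mid \text{high}]}{\mathbb{E}[Y \mid \text{low}]}
  &= \exp\!\left((\mu_{\text{high}} - \mu_{\text{low}})
     + \frac{\sigma_{\text{high}}^2 - \sigma_{\text{low}}^2}{2}\right)
\end{align*}
```

So even with \mu_{high} > \mu_{low}, the ratio of arithmetic means drops below 1 whenever \sigma_{low}^2 - \sigma_{high}^2 > 2(\mu_{high} - \mu_{low}), i.e. when the low endemicity group is sufficiently more variable on the log scale.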

@paul.buerkner any idea what might be going on? What is the marginal effects function plotting exactly when it comes to expectations of log-normal models?

By default, we will use method = posterior_epred which is indeed the (arithmetic) mean of the response distribution, that also takes into account sigma, as you say. If method = posterior_linpred we should see predictions of the main parameter “mu” by default, which is \log(\mu_{\rm geom}) in your notation. If you think this is not the case, I would need to see a reprex that demonstrates this.
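
As a rough illustration of the difference between the two summaries, here is a minimal base-R sketch. It does not call brms; it only mimics what the two methods summarise for a lognormal model, using simulated "posterior draws". All numbers and variable names (mu_low, sigma_high, etc.) are hypothetical and chosen for illustration, not estimates from the model in this thread.

```r
# Hypothetical posterior draws, for illustration only
set.seed(1)
n_draws <- 4000

# Log-scale location (mu) per endemicity group: high > low
mu_low  <- rnorm(n_draws, mean = 3.0, sd = 0.05)
mu_high <- rnorm(n_draws, mean = 3.2, sd = 0.05)

# Log-scale sigma per group: low endemicity is much more variable
sigma_low  <- abs(rnorm(n_draws, mean = 1.2, sd = 0.03))
sigma_high <- abs(rnorm(n_draws, mean = 0.6, sd = 0.03))

# Linear-predictor scale: mu itself; exp(mu) is the geometric mean
geo_ratio <- mean(exp(mu_high) / exp(mu_low))

# Expected-value scale: arithmetic mean of a lognormal, exp(mu + sigma^2 / 2)
epred_low  <- exp(mu_low  + sigma_low^2  / 2)
epred_high <- exp(mu_high + sigma_high^2 / 2)
arith_ratio <- mean(epred_high / epred_low)

# Ratio > 1 means "high" exceeds "low"; the two scales can disagree
print(c(geometric = geo_ratio, arithmetic = arith_ratio))
```

With these made-up values the geometric-mean ratio comes out above 1 while the arithmetic-mean ratio comes out below 1, which is the kind of reversal a sigma that differs by group can produce on the posterior_epred scale. In brms itself, the log-scale plot would be requested with something like conditional_effects(fit, method = "posterior_linpred"), where fit stands for the fitted model object.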

Thanks very much for clarifying @paul.buerkner, we’ve figured it out now. Using method = posterior_linpred indeed gives us the log of the geometric mean of the response distribution, the exponent of which gives an expected detection delay that is higher in highly endemic areas, as indicated by the model coefficients.