Estimates from hurdle_lognormal() hurdle and positive components are mirror image

I’m fitting a hurdle lognormal model with an interaction like this:

fit1_ann <- brm(bf(biomass ~    depth_cat +  sst_mean_ann + sst_anom_ann + sst_mean_ann:sst_anom_ann,
             hu ~   depth_cat + sst_mean_ann + sst_anom_ann + sst_mean_ann:sst_anom_ann), 
            data = dat, family = hurdle_lognormal())

So both the lognormal and the binomial components have the same covariates.

Odd thing is the marginals for the terms that are interacting (sst mean and sst_anom) appear to be mirror images of each other.
Even stranger is that there is no correlation in the chains when you plot the samples for the terms that look to be reflections.
See below figure. For instance, the b_sst_anom_march (lognormal coefficient) term is pretty much a reflection of the b_hu_sst_anom_march term (coefficient for the logit hurdle). Same goes for the other terms related to sst.

Is this behaviour expected of a lognormal hurdle? I’ve never seen this before in other applications of hurdle models.


Hm, I’m really not sure, just sharing thoughts… If you have high variance in the outcome variable, that could mean that you’d have a lot of 0’s and a lot of very high values, which could maybe generate the “mirror” pattern. The comparable magnitudes are probably due to same scaling of the independent variables so that in sum it looks like a mirror image… idk. Does the model generate sensible predictions (i.e. posterior predictives)?

Thanks Max.
That’s good intuition.
The predictions for the expectation (ie probability of occurrence times expected outcome when present) do make sense to me, so that is reassuring. I’m wondering now if/how I can interpret the coefficients.
I ran the models separately, one model of presence/absences with a binomial logit and another with a lognormal just of the positive values.
When I do this all effects are in the same direction (e.g. positive effect of sst_anom_march for both models). So in the lognormal_hurdle, the coefficients for the binomial component seem to be switching signs.
I also re-ran the model with tighter prior (normal(0,2)) on the binomial slope coefficients, that seems to help somewhat. There a few good papers recommending such priors for logit models (with a flat prior the probability density ends up with high density for the binomial probability of 0 or 1).

I also have two random grouping effect in my fits (I excluded this to simplify the question, but seems pertinent now!). I’ll investigate their influence too. Perhaps the lack of all levels of the random effects in the model for the positive outcome (lognormal) is causing some problems.

Model of presence absence :

Lognormal model of positive values:

So I’ve resolved this.
I was interpreting the hurdle component back to front.
More negative values mean greater probability of positive values of the response. So its misleading to plot coefficients the way I have, I should reverse the sign of the hurdle coefficients so higher values mean greater probability of positive observations.
Its just a coincidence that the effect sizes in the hurdle component and lognormal component are similar magnitudes.

Here’s a reproducible example

x <- -20:50
n <- length(x)
a <- 0.1
y <- a*x + rnorm(n, sd=1)

a2 <- 0.2
intercept <- -2
#logit transform
prob1 <- exp(a2*x-intercept)/(1 + exp(a2*x-intercept))
plot(x, prob1,
     xlab = "x", ylab = "probability y > 1")

#Create lognormal data with stochastic zeros
resp <- exp(y)
resp[rbinom(n, 1, prob = 1-prob1) == 1] <- 0
plot(x, bio)

dattest <- data.frame(x = x, y = y, resp = resp)

fit1 <- brm(bf(resp ~ x,
               hu ~ x),
            iter = 2000,
            data = dattest, family = hurdle_lognormal())

#Expected outcome
gt <- conditional_effects(fit1, effects = "x")

#Expected value if resp > 1
gthu <- conditional_effects(fit1, effects = "x",
                            dpar = "mu")

#Probability resp >1
gtmu <- conditional_effects(fit1, effects = "x",
                            dpar = "hu")

plot(fitted(fit1, dpar = "hu")[,1], dattest$resp==0,
     xlab= "Predicted prob response = 0",
     ylab = "Response")