Estimates from hurdle_lognormal() hurdle and positive components are mirror image

cbrown5 · June 15, 2020, 5:54am

Hi,
I’m fitting a hurdle lognormal model with an interaction like this:

fit1_ann <- brm(bf(biomass ~    depth_cat +  sst_mean_ann + sst_anom_ann + sst_mean_ann:sst_anom_ann,
             hu ~   depth_cat + sst_mean_ann + sst_anom_ann + sst_mean_ann:sst_anom_ann), 
            data = dat, family = hurdle_lognormal())

So both the lognormal and the binomial components have the same covariates.

Odd thing is the marginals for the terms that are interacting (sst mean and sst_anom) appear to be mirror images of each other.
Even stranger is that there is no correlation in the chains when you plot the samples for the terms that look to be reflections.
See below figure. For instance, the b_sst_anom_march (lognormal coefficient) term is pretty much a reflection of the b_hu_sst_anom_march term (coefficient for the logit hurdle). Same goes for the other terms related to sst.

Operating System: Windows 10
brms Version:2.11.1

Rplot01694×497 48 KB

Is this behaviour expected of a lognormal hurdle? I’ve never seen this before in other applications of hurdle models.

Thanks
Chris

Max_Mantei · June 15, 2020, 9:05am

Hm, I’m really not sure, just sharing thoughts… If you have high variance in the outcome variable, that could mean that you’d have a lot of 0’s and a lot of very high values, which could maybe generate the “mirror” pattern. The comparable magnitudes are probably due to same scaling of the independent variables so that in sum it looks like a mirror image… idk. Does the model generate sensible predictions (i.e. posterior predictives)?

cbrown5 · June 16, 2020, 9:05pm

Thanks Max.
That’s good intuition.
The predictions for the expectation (ie probability of occurrence times expected outcome when present) do make sense to me, so that is reassuring. I’m wondering now if/how I can interpret the coefficients.
I ran the models separately, one model of presence/absences with a binomial logit and another with a lognormal just of the positive values.
When I do this all effects are in the same direction (e.g. positive effect of sst_anom_march for both models). So in the lognormal_hurdle, the coefficients for the binomial component seem to be switching signs.
I also re-ran the model with tighter prior (normal(0,2)) on the binomial slope coefficients, that seems to help somewhat. There a few good papers recommending such priors for logit models (with a flat prior the probability density ends up with high density for the binomial probability of 0 or 1).

I also have two random grouping effect in my fits (I excluded this to simplify the question, but seems pertinent now!). I’ll investigate their influence too. Perhaps the lack of all levels of the random effects in the model for the positive outcome (lognormal) is causing some problems.

Model of presence absence :
Rplot_binomial

Lognormal model of positive values:
Rplot_positives

cbrown5 · June 17, 2020, 6:55am

So I’ve resolved this.
I was interpreting the hurdle component back to front.
More negative values mean greater probability of positive values of the response. So its misleading to plot coefficients the way I have, I should reverse the sign of the hurdle coefficients so higher values mean greater probability of positive observations.
Its just a coincidence that the effect sizes in the hurdle component and lognormal component are similar magnitudes.

Here’s a reproducible example

#Positives 
x <- -20:50
n <- length(x)
a <- 0.1
y <- a*x + rnorm(n, sd=1)
plot(x,exp(y))

#Zeros 
a2 <- 0.2
intercept <- -2
#logit transform
prob1 <- exp(a2*x-intercept)/(1 + exp(a2*x-intercept))
plot(x, prob1,
     xlab = "x", ylab = "probability y > 1")

#Create lognormal data with stochastic zeros
resp <- exp(y)
resp[rbinom(n, 1, prob = 1-prob1) == 1] <- 0
plot(x, bio)

dattest <- data.frame(x = x, y = y, resp = resp)

fit1 <- brm(bf(resp ~ x,
               hu ~ x),
            iter = 2000,
            data = dattest, family = hurdle_lognormal())

#Expected outcome
gt <- conditional_effects(fit1, effects = "x")
plot(gt)

#Expected value if resp > 1
gthu <- conditional_effects(fit1, effects = "x",
                            dpar = "mu")
plot(gthu)


#Probability resp >1
gtmu <- conditional_effects(fit1, effects = "x",
                            dpar = "hu")
plot(gtmu)

plot(fitted(fit1, dpar = "hu")[,1], dattest$resp==0,
     xlab= "Predicted prob response = 0",
     ylab = "Response")


summary(fit1)

Topic		Replies	Views
Brms: does the lognormal part of the hurdle_lognormal() regression include zeros into analysis? Modeling	8	2242	September 27, 2020
Interpretation of coefficients in hurdle negative binomial models Modeling brms	3	895	December 15, 2023
Interpreting summary of hurdle_lognormal model brms interpret-results	2	1857	April 24, 2019
Priors for hurdle models in brms Modeling priors , brms	2	755	August 21, 2023
Interpret hurdle coefficients in brms Modeling interpret-results , brms	1	1306	August 11, 2021

Estimates from hurdle_lognormal() hurdle and positive components are mirror image

Related topics