Hurdle model with BYM2 correlation structure on hurdle component

Hi,

If I have a negative binomial hurdle model (positions of stationary schools of fish on a coral reef with the count being the number of fish in the schools and the presence/absence of a school being the binomial component) is it possible using BRMS to apply a bym2 autocorrelation structure to the binomial portion of the model rather than the count portion?

Thanks!
Jason

  • Operating System: Ubuntu
  • brms Version: ‘2.10.0’
1 Like

Hi,
I am not sure what you mean by

could you elaborate? I particularly don’t understand how you would separate the “binomial” and “count” portions of “negative binomial”. Preferably if you have a mathematical formula or a reference to a paper that uses model similar to what you are trying to do.

Best of luck with your model!

1 Like

Hi yes sorry I can certainly try to explain this better.

My understanding is that with a hurdle model the way brms incorporates a BYM2 autocorrelation is by adding the ICAR spatial component like this (spatial component represented by ϕ and the subscript i indexing through the data with subscripts c and z just showing if the parameter is for the count or the hurdle model components:

N_i ~ ZANB(μ_i,π_i,θ)
log(μ_i )=α_c+β_c X_i+ϕ
logit(π_i )=α_z+β_z X_i

And in the stan code created by brms I believe that’s encoded here:

vector[N] mu = Intercept + Xc * b;
// initialize linear predictor term
vector[N] hu = Intercept_hu + Xc_hu * b_hu;

for (n in 1:N) {
    // add more terms to the linear predictor
    mu[n] += rcar[Jloc[n]];
}

target += hurdle_neg_binomial_log_logit_lpmf(Y[n] | mu[n], shape, hu[n]);

What I think I’d like to do is move the spatial component (ϕ) from influencing μ to influencing π. My reasoning for this is that in the system I’m looking at if a spot doesn’t have a school of fish then the neighboring spots seem very likely to also not have a school. However, the size of a school that is present doesn’t seem to affect the size of neighboring schools terribly much.

Mathematically I believe that would translate to simply moving the ϕ and ending up with this:
N_i ~ ZANB(μ_i,π_i,θ)
log(μ_i )=α_c+β_c X_i
logit(π_i )=α_z+β_z X_i+ϕ

Which I assume in the stan code would translate to a change to this:

vector[N] mu = Intercept + Xc * b + offset;
// initialize linear predictor term
vector[N] hu = Intercept_hu + Xc_hu * b_hu;

for (n in 1:N) {
    // add more terms to the linear predictor
    hu[n] += rcar[Jloc[n]];
  }

target += hurdle_neg_binomial_log_logit_lpmf(Y[n] | mu[n], shape, hu[n]);

My understanding is that the zero hurdle negative binomial model is a mixture of the binomial model of the presence/absence of a school of fish and when a school is present the negative binomial model of the counts. I can make each of those model separately but I’d rather produce a single model since the number of fish present in a location is inherently dependent on whether or not there are any fish in that location and so should be modeled as one process.

I am not 100% sure, but I believe distrubutional models might be the way to go - it should allow you to have a separate formula for the hurdle and for the mean. Does it look like the solution to your problem?

Hope that helps!

Thanks! Unfortunately, I don’t think the distributional models method will help with the spatial component because the way to code the BYM2 autocorrelation into the model with brms is separate from the model formula. This is the brms code I’m using to produce the model at the moment:

brm(bf(formula(count ~ XC),
       formula(hu ~ XZ)), 
    
    autocor = cor_car(adjacency_matrix, 
                      formula = ~1 | grid_cell, 
                      type = 'bym2'),
    data = dat,
    family = 'hurdle_negbinomial')

So I can separate the formulas for mean/hurdle in terms of covariates but I’m unsure if there is a way to do that with the autocorrelation structure.

According to https://rdrr.io/cran/brms/man/car.html you should be able to have hu ~ XZ + car(adjacency_matrix, gr = grid_cell, type = 'bym2') and the syntax you use was deprecated in the current brms version.

2 Likes

hu ~ XZ + car(adjacency_matrix, gr = grid_cell, type = 'bym2') looks correct to me too.

1 Like

Awesome thank you both!! I hadn’t seen that part of how to code it!

Thank you!