Hello,
I’ve been struggling to get stan to sample the joint-posterior of a multi-species occupancy model where the state process follows a inverse cloglog link function and the detection process follows a logit link. The tl;dr question is: are there good rules of thumb for defining (log-scale) intercept/coefficient priors when the response of interest (and the domain knowledge) is on the probability scale?
Full backdrop is that this is intended [foolishly hoped?] to be one component of some multi-species integrated distribution model that has also exhibited severe issues; after breaking it apart piecemeal, the occupancy component seemed to be the source.
The issue manifests in standard–but very loud–ways: divergent transitions nearly every model iteration, and poor mixing by all diagnostic (n.eff’s/r-hats) or visual assessments. I’m semi-confident that the underlying cause is the link being used. After testing different combinations of centered/non-centered, adapt_delta, tweaks to hyper-priors, and link-functions, the big difference maker is that any model where variation in occupancy follows a logit-link samples just fine, and any model where it doesn’t complains loudly.
I’d like to use an inv cloglog approach because it eases integration with different germane data types, presence-background data primarily. I guess I’m suspecting or hoping that part of the issue relates to setting reasonable priors, or priors that correspond to domain knowledge on the probability scale–we’re really more interested in the probability of occupancy than E(N). The full code is below–note, I’d been using a normal prior on the mean log-intercept–N(-1.75, .75), partially because there’s an area offset in the model of log(4) and this combination roughly captures a range of plausible values. Have tried fiddling with the priors, but no luck so far. Is there is some other (left-skewed?) distribution that is more sensible, or any general rules of thumb for defining priors for these models?
Appreciate any thoughts!
John
data {
int<lower=1> nSpec;
int<lower=1> nLocs;
int<lower=1> nCellsOcc;
int<lower=1> nCovs;
int<lower=0, upper=1> ObsOcc[nCellsOcc, nSpec];
int<lower=0> yBBBPA[nLocs, nSpec];
matrix[nCellsOcc, nCovs] X3;
int<lower=1> From[nCellsOcc];
int<lower=1> To[nCellsOcc];
int<lower=1> K[nLocs];
real area;
}
parameters {
matrix[nCovs, nSpec] alpha; //point process intensity;
vector[nSpec] beta; //p for Occ/Det
//hyperparameters
vector[nCovs] mu_alpha;
vector<lower=0>[nCovs] sig_alpha;
real mu_beta;
real<lower=0> sig_beta;
}
transformed parameters {
matrix<lower=0>[nCellsOcc, nSpec] mu;
matrix<lower=0, upper=1>[nCellsOcc, nSpec] psi;
vector<lower=0, upper=1>[nSpec] ppa; //p
for (s in 1:nSpec){
mu[ , s] = exp(X3*alpha[, s]+area);
}
ppa = inv_logit(beta);
psi=1-exp(-1*mu);
}
model {
mu_beta ~ logistic(0, 1);
sig_beta ~ normal(0, 1);
mu_alpha[1]~normal(-1.75, .75);
to_vector(mu_alpha[2:nCovs])~normal(0, 1);
sig_alpha~normal(0, 1);
beta~normal(mu_beta, sig_beta);
for (s in 1:nSpec){
alpha[,s]~normal(mu_alpha, sig_alpha);
}
//likelihood
for (s in 1:nSpec){
for (i in 1:nCellsOcc) {
if (ObsOcc[i,s] == 1) {
target+=bernoulli_lpmf(1|psi[i,s])+binomial_lpmf(y[From[i]:To[i], s]|K[From[i]:To[i]],
ppa[s]);
}else{
target+=log_sum_exp(bernoulli_lpmf(1|psi[i,s])+binomial_lpmf(y[From[i]:To[i], s]|K[From[i]:To[i]], ppa[s]),
bernoulli_lpmf(0|psi[i,s]));
}
}//i
}//s
}
generated quantities {
}