Hello all,
I am trying to model a continuous proportion (fire coverage) with a decent number of 0 observations (i.e., y ∈ [0,1) ), and (in addition to covariate effects on the values of non-zero observations) I am interested in the influence of covariates on the probability of obtaining a 0 value, hence I am using zero-inflated beta regression. My data has both a temporal and spatial structure: I have a landscape which I have divided into grids, and then extracted data on the proportional area burnt for each grid in each year over a ~30 year time series.
I don’t feel I need to account for temporal autocorrelation, as many of my predictor variables should, in theory, be the factors driving any apparent temporal autocorrelation. I am also including Year as a random effect in the mu, phi and zi models. The problem arises with spatial autocorrelation. I had been including an ESICAR term in the mu model (initially using brms):
forest_fire_perc_formula <-
bf(Forest_Percent_Burnt ~ (1|Year) + Predictors... + car(adj_mat, gr = Landscape_ID, type = 'esicar'),
phi ~ (1|Year),
zi ~ (1|Year) + Predictors...,
decomp = 'QR')
The predictors are the same in the mu and zi components if it is of any relevance. adj_mat is a N x N binary matrix indicating adjacencies (1s), where N represents the number of independent sites, regardless of year (i.e., the number of grid cells). I am only including spatial autocorrelation in the main model, following this paper Zero-Inflated Beta Distribution Regression Modeling | SpringerLink. That paper uses a gaussian process however, is it reasonable to use a CAR for this purpose?
The problem is, this format results in an identical value of the spatial autocorrelation random effect for each landscape across years. For instance, for the first grid cell (Landscape_ID = 1) the 30 values of rcar (one per year) are identical. This does not align with what I intended. As fires are highly seasonal in my landscape, I’m assuming there is only really spatial autocorrelation within years, and I would thus expect the values of rcar to vary both spatially and temporally.
Thus, I had a go at amending the STAN code to index the ESICAR term by year, placing a hyperprior on the SD parameter:
functions {
real zero_inflated_beta_lpdf(real y, real mu, real phi, real zi) {
row_vector[2] shape = [mu * phi, (1 - mu) * phi];
if (y == 0) {
return bernoulli_lpmf(1 | zi);
} else {
return bernoulli_lpmf(0 | zi) +
beta_lpdf(y | shape[1], shape[2]);
}
}
real zero_inflated_beta_logit_lpdf(real y, real mu, real phi, real zi) {
row_vector[2] shape = [mu * phi, (1 - mu) * phi];
if (y == 0) {
return bernoulli_logit_lpmf(1 | zi);
} else {
return bernoulli_logit_lpmf(0 | zi) +
beta_lpdf(y | shape[1], shape[2]);
}
}
real zero_inflated_beta_lccdf(real y, real mu, real phi, real zi) {
row_vector[2] shape = [mu * phi, (1 - mu) * phi];
return bernoulli_lpmf(0 | zi) + beta_lccdf(y | shape[1], shape[2]);
}
real zero_inflated_beta_lcdf(real y, real mu, real phi, real zi) {
return log1m_exp(zero_inflated_beta_lccdf(y | mu, phi, zi));
}
real sparse_icar_lpdf(vector phi, real sdcar, int Nloc,
int Nedges, data vector Nneigh, data vector eigenW,
array[] int edges1, array[] int edges2) {
real tau; // precision parameter
row_vector[Nloc] phit_D; // phi' * D
row_vector[Nloc] phit_W; // phi' * W
tau = inv_square(sdcar);
phit_D = (phi .* Nneigh)';
phit_W = rep_row_vector(0, Nloc);
for (i in 1:Nedges) {
phit_W[edges1[i]] = phit_W[edges1[i]] + phi[edges2[i]];
phit_W[edges2[i]] = phit_W[edges2[i]] + phi[edges1[i]];
}
return 0.5 * ((Nloc - 1) * log(tau) -
tau * (phit_D * phi - (phit_W * phi)));
}
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
int<lower=1> Nyrs; // total number of years
array[N] int<lower=1> obsYr; // year index of each observation
int<lower=1> K; // number of population-level covariates for main model
matrix[N, K] X; // population-level design matrix for main model
int<lower=1> K_zi; // number of population-level covariates for zero-inflation component
matrix[N, K_zi] X_zi; // population-level design matrix for zero-inflation component
// data for the CAR structure
int<lower=1> Nloc; // number of sampled locations (time invariant)
array[N] int<lower=1> Jloc; // location index of each observation
int<lower=0> Nedges; // total number of edges among all observations (time invariant)
array[Nedges] int<lower=1> edges1; // index of first cell in pairs that share edge (time invariant)
array[Nedges] int<lower=1> edges2; // index of second cell in pairs that share edge (time invariant)
vector[Nloc] Nneigh; // number of neighbours for each site (time invariant)
vector[Nloc] eigenMcar; // eigen values for each site (time invariant)
int prior_only; // should the likelihood be ignored? used for prior predictive checks
}
transformed data {
int Kc = K - 1; // number of linear terms minus intercept
matrix[N, Kc] Xc; // centered version of X without an intercept
vector[Kc] means_X; // column means of X before centering
// matrices for QR decomposition
matrix[N, Kc] XQ;
matrix[Kc, Kc] XR;
matrix[Kc, Kc] XR_inv;
int Kc_zi = K_zi - 1; // number of linear terms minus intercept
matrix[N, Kc_zi] Xc_zi; // centered version of X_zi without an intercept
vector[Kc_zi] means_X_zi; // column means of X_zi before centering
// scale and centre predictor variables
for (i in 2:K) { // standard component
means_X[i - 1] = mean(X[, i]);
Xc[, i - 1] = X[, i] - means_X[i - 1];
}
for (i in 2:K_zi) { // zero-inflated component
means_X_zi[i - 1] = mean(X_zi[, i]);
Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
}
// compute and scale QR decomposition
XQ = qr_thin_Q(Xc) * sqrt(N - 1);
XR = qr_thin_R(Xc) / sqrt(N - 1);
XR_inv = inverse(XR);
}
parameters {
vector[Kc] bQ; // regression coefficients at QR scale
real Intercept; // temporary intercept for centered predictors
real sdcar_mean; // mean of hyperparameter for SD of year-level CAR structures
real<lower=0> sdcar_sd; // sd of hyperparameter for SD of year-level CAR structures
array[Nyrs] real<lower=0> sdcar; // SD of the year level CAR structures
array[Nyrs] vector[Nloc - 1] zcar;
real Intercept_phi; // temporary intercept for centered predictors
vector[Kc_zi] b_zi; // population-level effects
real Intercept_zi; // temporary intercept for centered predictors
real<lower=0> sd_1; // group-level standard deviation for main model
vector[Nyrs] z_1; // standardized group-level effects for main model (year random effect)
real<lower=0> sd_2; // group-level standard deviation for phi model
vector[Nyrs] z_2; // standardized group-level effects for phi model (year random effect)
real<lower=0> sd_3; // group-level standard deviation for zi model
vector[Nyrs] z_3; // standardized group-level effects for zi model (year random effect)
}
transformed parameters {
array[Nyrs] vector[Nloc] rcar; // CAR random effect values for each observation
vector[Nyrs] r_1_1; // actual group-level effects
vector[Nyrs] r_2_phi_1; // actual group-level effects
vector[Nyrs] r_3_zi_1; // actual group-level effects
// sum-to-zero constraint
for (y in 1:Nyrs) {
rcar[y][1:(Nloc - 1)] = zcar[y];
rcar[y][Nloc] = - sum(zcar[y]);
}
// unstandardize group-level effects
r_1_1 = (sd_1 * (z_1));
r_2_phi_1 = (sd_2 * (z_2));
r_3_zi_1 = (sd_3 * (z_3));
}
model {
real lprior = 0; // prior contributions to the log posterior
// likelihood not including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] mu = rep_vector(0.0, N);
vector[N] phi = rep_vector(0.0, N);
vector[N] zi = rep_vector(0.0, N);
mu += Intercept + XQ * bQ;
phi += Intercept_phi;
zi += Intercept_zi + Xc_zi * b_zi;
for (n in 1:N) {
// add more terms to the linear predictor
mu[n] += rcar[obsYr[n]][Jloc[n]] + r_1_1[obsYr[n]];
phi[n] += r_2_phi_1[obsYr[n]];
zi[n] += r_3_zi_1[obsYr[n]];
}
mu = inv_logit(mu);
phi = exp(phi);
for (n in 1:N) {
target += zero_inflated_beta_logit_lpdf(Y[n] | mu[n], phi[n], zi[n]);
}
}
// priors not including constants
lprior += normal_lupdf(bQ | 0,1); // normal(0,1) prior on all sloped
lprior += logistic_lupdf(Intercept | -2,0.5); // moderately informative prior on standard intercept; fire cover low in almost all instances
lprior += normal_lupdf(sdcar_mean | 0,1); // hyperparamater components for temporally variable CAR structure
lprior += cauchy_lupdf(sdcar_sd | 0,2);
for(y in 1:Nyrs) {
lprior += cauchy_lupdf(sdcar[y] | sdcar_mean,sdcar_sd);
}
lprior += normal_lupdf(Intercept_phi | 0,1);
lprior += normal_lupdf(b_zi | 0,1);
lprior += logistic_lupdf(Intercept_zi | 0, 1);
lprior += cauchy_lupdf(sd_1 | 0,2);
lprior += cauchy_lupdf(sd_2 | 0,2);
lprior += cauchy_lupdf(sd_3 | 0,2);
target += lprior;
for(y in 1:Nyrs) {
target += sparse_icar_lpdf(
rcar[y][1:Nloc] | sdcar[y], Nloc, Nedges, Nneigh, eigenMcar, edges1, edges2
);
}
target += std_normal_lupdf(z_1);
target += std_normal_lupdf(z_2);
target += std_normal_lupdf(z_3);
}
generated quantities {
// obtain the actual coefficients
vector[Kc] b = XR_inv * bQ;
// actual population-level intercept
real b_Intercept = Intercept - dot_product(means_X, b);
// actual population-level intercept
real b_phi_Intercept = Intercept_phi;
// actual population-level intercept
real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
}
This took a very long time to run (1500 warm up, 1500 sampling) and did not converge at all, making me think perhaps I am misunderstanding the premise of CAR models, and they should not vary according to another factor.
I have also tried:
Providing adj_mat as a K x K matrix, where K is the total number of samples (i.e., n_grid_cells * n_years) and samples can only ever be adjacent to samples from the same year, but this did not converge either.
Just using a standard prior on the year-level sdcars (i.e., lprior += cauchy_lupdf(sdcar[y] | 0,2); ). But this also did not converge
Ultimately my questions are:
Is a ESICAR term appropriate in this context?
Should/can the ESICAR vary with year?
If the answer to these is yes, what is wrong in my code that is preventing convergence?
Thanks very much for any help,
Ciar