Sampling and identifiability of a spatial model with second-order IGMRF prior

I’m having difficulty getting Stan to sample from a model with a second-order intrinsic Gauss markov random field (IMGRF) prior on a spatial effect (described below) - the data are a spatial point process. It may be because of the inherent difficulties with identifiability of this model but I wanted to see if anyone had had any luck with this sort of approach.

The data are from a simulated point process model using a log Gaussian cox process. The counts of points are aggregated onto a regular lattice over the area of interest, where the counts of cases at grid cell (s_{i}:i=1,...,N)

N(s_{i}) \sim Poisson(\lambda(s_{i}))

where

\lambda(s_{i}) = e(s_{i}) exp(\alpha_{i})

where e is an offset and \alpha_{i} is a spatial effect with a second-order IMGRF prior, so its joint distribution has improper density:

f(\alpha) \propto k^{(n-2)/2}exp(-\frac{k}{2}\alpha^T W \alpha)

where k is a precision parameter and W is a structure matrix with entries w_{ij}=0 if cells i and j are conditionally independent and equal to a particular weight related to the graph if not (from the Rue & Held (2005) book). The posteriors are proper if the priors are proper. The log density is therefore

log(f(\alpha)) = \frac{(n-2)}{2} log(k) -\frac{k}{2}\alpha^T W \alpha + constant

so to code this model in Stan I’ve done (W is passed as a sparse matrix as it is mostly zeroes):

data {
  int N;
  int Nw;
  int Nv;
  int Nu;
  int y[N]; //counts for each cell
  vector[Nw] w; //structure matrix
  int v[Nv]; //structure matrix 
  int u[Nu]; //structure matrix
  vector[N] e; //offset
}
parameters {
  real<lower=0> sigma_x;
  vector[N] x;
}
transformed parameters{
  real<lower=0> prec;
  prec = 1/(sigma_x^2);
}
model {
  vector[N] wx;
  real xWx;
  sigma_x ~ normal(0,0.2);
  
  wx = csr_matrix_times_vector(N,N,w,v,u,x); //Wx
  xWx = x' * wx; //then x' * Wx
  target += ((N-2)/2)*log(prec)-0.5*prec*xWx;
  sum(x) ~ normal(0,0.001*N); //soft sum to zero constraint
  
  y ~ poisson_log(e + x);
  
}

I’ve also added a soft sum-to-zero constraint to help with identifiability, however, the chains just sit on one value and do not converge and there are many divergent transitions. Is this sort of model possible in Stan or is it just going to struggle with identifiability? Or have I made a big coding error?! I’d be very grateful for any insights! I’m using cmdstanr 2.24.

1 Like

Update! I may have solved my own problem so adding an update here in case anyone else is interested. The above model has conditional expectations on the \alpha_{i} parameters given by (where W_0 is the structure matrix from above with a zero diagonal)

E(\alpha_i|\alpha_{-i}) = \frac{1}{\sum_{j=1}^N W_0[i,j]}W_0[i,]\alpha_i

Var(\alpha_i|\alpha_{-i}) = \frac{\sigma^2}{\sum_{j=1}^N W_0[i,j]}

Re-specifying the model in terms of the conditionals above (code below), does sample well at first glance with good mixing of the chains and no divergent transitions. I’m sure there’s a good explanation but my understanding of the geometry of these problems is not good enough.

data {
  int N;
  int Nw;
  int Nv;
  int Nu;
  int y[N];
  vector[Nw] w; //structure matrix
  vector[N] rw; //rowsums
  int v[Nv];
  int u[Nu];
  vector[N] e; //offset
}
parameters {
  real<lower=0> sigma_x;
  vector[N] x;
}
model {
  vector[N] wx;
  sigma_x ~ normal(0,1);
  wx = csr_matrix_times_vector(N,N,w,v,u,x);
  x ~ normal(rw .* wx, rw*sigma_x);
  sum(x) ~ normal(0,0.001*N);
  
  y ~ poisson_log(e + x);
  
}
3 Likes