Hi all, I’m trying to validate hierarchical Gaussian process regressions using the SBC package. My real-world dataset is vegetation data, which was collected at many locations within many plots, and I’m interested in estimating plot-level means and local spatial structure. The basic format of the model is written below:
\begin{aligned} y_{x,p} &\sim \text{NegBinomial}(\mu_{x,p}, \theta) \\ \text{log}(\mu_{x,p}) &= \alpha_0 + \alpha_p + GP_{x,p} \\ \alpha_p &\sim \text{Normal}(0, \sigma)\\ GP_{x,p} &\sim \text{MultiNormal}(0, K_{\eta, \rho}(x)) \\ k(x_i, x_j|\eta, \rho) &= \eta^2 \text{exp}(-\frac{(x_i - x_j)^2}{2\rho^2}) \end{aligned}
where y_xp are the counts of data at location x in plot p.
To validate this model, I generated fake data in R below:
NB_apGP <- function(P, coords){
# kernel function
fkernel <- function(d, eta, rho) eta^2 * exp(-(d^2) / (2 * rho^2))
# data
L <- nrow(coords)
plotID <- rep(1:P, each = L)
locID <- rep(1:L, times = P)
N <- length(plotID)
# priors
a0 <- rnorm(1, 0, 1)
sigma_ap <- abs(rnorm(1, 0, 1))
theta <- rlnorm(1, log(.5), .7)
eta <- rgamma(1, 5, 20)
rho <- pscl::rigamma(1, 2.4, 7.3)
# deal with the GP
dmat <- fields::rdist(coords)
K <- fkernel(dmat, eta, rho)
# cholesky decomposition
zGP <- matrix(rnorm(L * P), nrow = L, ncol = P)
diag(K) <- diag(K) + 1e-9
GP <- t(chol(K)) %*% zGP
# mean model
ap <- rnorm(P, 0, sigma_ap)
mu <- rep(NA, N)
for(i in 1:N){
mu[i] <- exp(a0 + ap[plotID[i]] + GP[locID[i], plotID[i]])
}
y <- rethinking::rgampois2(N, mu, theta)
# outputs
list(
variables = list(
a0 = a0,
sigma_ap = sigma_ap,
theta = theta,
eta = eta,
rho = rho
),
generated = list(
N = N,
P = P,
L = L,
plotID = plotID,
locID = locID,
y = y,
coords = coords
)
)
}
with the accompanied Stan model and SBC script attached.
NB_apGP.stan (1.1 KB)
SBC_NegBinom_forDiscourse.R (1.7 KB)
I’ve only run 10 simulations, but it’s clear that the model is having a hard time estimating eta and rho (*kappa is supposed to be theta in the figure). The Rhats are often >1.01 for rho too.
Is it just a matter of altering my priors, is there something fundamentally wrong with my model structure, or is there something else that I can’t think of? I’ve tried a poisson likelihood too, thinking that the model would do better if it had one less dispersion parameter, but it also has a hard time estimating rho.
Input very appreciated!