Hello Stan world,
I am working on tree diseases. My current research consists of extending an existing hierarchical model to predict disease propagation within tree plantations. The previous model was implemented in INLA and included, among other things, a Conditionally AutoRegressive (CAR) component. To get my hands on Stan with a simple problem first, I decided to simulate a simple CAR process and model it in Stan, to see whether I could recover the parameter values I had set myself, and how sensitive the model was to different priors. However, things did not work as expected, and I am having trouble with the reparameterization of the model.
The model is as follows:
y \sim N(X \beta + \phi, \sigma^2_\epsilon)
\beta \sim N(0, 4)
X in this case is just a vector of size n (the number of observations) of random draws from N(2, 2).
\phi is the CAR component, which depends on two main parameters: \alpha \in [0 \;;1], the spatial dependency, and \tau the precision (\tau = 1 / \sigma^2). Cf. Max Joseph’s tutorial.
\phi \sim N(0, [\tau ( D-\alpha W ) ]^{-1})
\alpha \sim N(0.5, 0.5)
\tau \sim \Gamma(2, 2)
And the data was simulated using the following code:
## Load MASS for mvrnorm()
library(MASS)
## Set up a square lattice region
# lattice side length
m <- 10
# number of locations
n <- m^2
x.easting <- 1:m
x.northing <- 1:m
Grid <- expand.grid(x.easting, x.northing)
## Set up distance and neighbourhood (W, based on sharing a common border) matrices
distance <- as.matrix(dist(Grid))
W <- array(0, c(n, n))
W[distance == 1] <- 1
## Calculate D matrix from W
D <- diag(rowSums(W), n, n)
## Simulate phi
# Define CAR parameters
alpha <- 0.5
tau <- 2
# Define precision and covariance matrices
SIG_inv <- tau * (D - alpha * W) # Precision matrix
SIG <- solve(SIG_inv) # Covariance matrix
phi <- mvrnorm(n = 1, mu = rep(0, n), Sigma = SIG) # Spatial effect using CAR model
beta <- 2.5 # Coefficient
x <- rnorm(n, 2, 2) # Predictor
se2 <- 2 # residual var
e <- rnorm(n, 0, sqrt(se2)) # residuals
ynorm <- beta*x + phi + e
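As a sanity check on the construction above, here is a minimal numpy translation (my own sketch, with a smaller lattice just to keep it fast): for \alpha \in (0, 1) the precision matrix \tau(D - \alpha W) should be symmetric positive definite, so the CAR prior is proper.

```python
import numpy as np

# Small lattice (m = 4 here for speed); same rook neighbourhood as in the
# R code: W[i, j] = 1 when two grid locations are at distance 1.
m = 4
n = m * m
coords = np.array([(i, j) for i in range(m) for j in range(m)])
dist = np.abs(coords[:, None, :] - coords[None, :, :]).sum(-1)
W = (dist == 1).astype(float)
D = np.diag(W.sum(axis=1))

alpha, tau = 0.5, 2.0
Q = tau * (D - alpha * W)  # CAR precision matrix

# For alpha in (0, 1), Q is symmetric and strictly diagonally dominant
# (diag(D) dominates alpha times the row sums of W), hence positive definite.
print(np.linalg.eigvalsh(Q).min() > 0)  # True
```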
And here is the Stan code:
data {
  int<lower = 1> n;                      // Nb of locations
  vector[n] x;                           // predictor
  vector[n] y;                           // response
  matrix<lower = 0, upper = 1>[n, n] W;  // adjacency matrix
}
transformed data {
  vector[n] zeros;
  matrix<lower = 0>[n, n] D;
  {
    vector[n] W_rowsums;
    for (i in 1:n) {
      W_rowsums[i] = sum(W[i, ]);
    }
    D = diag_matrix(W_rowsums);
  }
  zeros = rep_vector(0, n);
}
parameters {
  real beta;
  real<lower = 0> sigmaepsilon;
  vector[n] phi;
  real<lower = 0> tau;
  real<lower = 0, upper = 1> alpha;
}
model {
  beta ~ normal(0, 4);
  phi ~ multi_normal_prec(zeros, tau * (D - alpha * W));
  alpha ~ normal(0.5, 0.5);
  tau ~ gamma(2, 1);
  y ~ normal(x * beta + phi, sigmaepsilon);
}
## Fitting the Stan model (D is built in transformed data, so it is not passed)
dat <- list(n = n, x = x, y = ynorm, W = W)
niter <- 2E3
nchains <- 4
system.time(fit <- stan('simu_CAR_norm.stan', data = dat,
iter = niter, chains = nchains, verbose = FALSE))
I did not expect any problem, as I was first restricting myself to an ideal, simple case, but I kept running into a low BFMI warning. This appears to be mainly due to \tau, as it displays a strong negative correlation with energy__ (cf. the Brief Guide to Stan’s Warnings). \tau is also poorly estimated: both the posterior mean and median are around 1, whereas the true value of \tau is 2.
Since increasing niter to 10,000 improved nothing, I am now trying to reparameterize my model. I found some inspiration in the INLA documentation: the latent model "generic1" is somewhat similar to a CAR model and uses a log transform on \tau \; ; \; \theta_1 = \log(\tau), and a logit transform on \alpha \; ; \; \theta_2 = \mathrm{logit}(\alpha).
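In plain numpy terms (a trivial check with the simulated values, just to be explicit about the transforms I mean):

```python
import numpy as np

tau, alpha = 2.0, 0.5                 # values used in the simulation
theta1 = np.log(tau)                  # theta_1 = log(tau)
theta2 = np.log(alpha / (1 - alpha))  # theta_2 = logit(alpha)

# The inverse transforms (exp and inverse-logit) recover the originals,
# mapping the unconstrained thetas back to tau > 0 and alpha in (0, 1).
print(np.isclose(np.exp(theta1), tau))               # True
print(np.isclose(1 / (1 + np.exp(-theta2)), alpha))  # True
```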
And that is where I am confused. I first tried to simply log-transform \tau, following the Stan manual, version 2.17.0, p. 291:
transformed parameters {
  real<lower = 0> log_tau;
  log_tau = log(tau);
}
model {
  beta ~ normal(0, 4);
  phi ~ multi_normal_prec(zeros, tau * (D - alpha * W));
  alpha ~ normal(0.5, 0.5);
  log_tau ~ gamma(2, 1);
  target += -log_tau;
  y ~ normal(x * beta + phi, sigmaepsilon);
}
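My reasoning for the target += -log_tau; line was the usual change of variables: with \tau the declared parameter and the prior placed on \theta = \log(\tau),

p_\tau(\tau) = p_\theta(\log \tau) \left| \frac{d\theta}{d\tau} \right| = p_\theta(\log \tau) \cdot \frac{1}{\tau}

\log p_\tau(\tau) = \log p_\theta(\log \tau) - \log \tau

hence the sampling statement on log_tau plus the -\log(\tau) increment.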
I am getting a warning from the parser:

Warning (non-fatal):
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable. If it does, you need to include a target += statement...

which is precisely what I expected to avoid. Worse than that, I am still facing the low BFMI problem, and in addition I wind up with a number of divergent transitions (311 for niter = 2000). NB: this number does not decrease drastically if I increase adapt_delta to 0.90.
I then tried to replace these two lines

log_tau ~ gamma(2, 1);
target += -log_tau;

with

target += gamma_lpdf(log_tau | 2, 1);

which does not improve anything (I am still encountering both the divergences and the energy problem).
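To double-check, outside Stan, which density I should actually be targeting, here is a small scipy sketch (purely illustrative, using the Gamma(2, 2) from the simulation in its rate parameterization): if \tau \sim \Gamma(2, 2) and \theta = \log(\tau), the density of \theta is the gamma density evaluated at e^\theta times the Jacobian factor e^\theta.

```python
import numpy as np
from scipy import stats

a, b = 2.0, 2.0  # shape and rate of the gamma prior on tau
theta = 0.3      # an arbitrary point on the log scale
tau = np.exp(theta)

# Density of theta = log(tau) by change of variables:
# p(theta) = Gamma.pdf(exp(theta); a, rate=b) * exp(theta)
p_theta = stats.gamma.pdf(tau, a, scale=1 / b) * tau

# Check against a numerical derivative of the CDF of theta
eps = 1e-6
p_num = (stats.gamma.cdf(np.exp(theta + eps), a, scale=1 / b)
         - stats.gamma.cdf(np.exp(theta - eps), a, scale=1 / b)) / (2 * eps)
print(np.isclose(p_theta, p_num))  # True
```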
Bottom line: how should I properly log-transform \tau to solve my problem? Or is it even a sound approach to do that?
Thanks in advance