Comparison of two models: BYM and BYM2

Hi everyone, hope you are doing well. I have implemented two models, BYM and BYM2, in OpenBUGS and RStudio, respectively.
What solution do you suggest for comparing the two models, considering that each of them was implemented in separate software? Would the weighted average of the Monte Carlo error be enough?
I want to use the comparison in an article I am writing.
Kind regards, sh

I believe the BYM2 model is just a reparameterization of the BYM model, so any differences in performance that you find will be due either to computational problems in one or the other, or to differences in the priors. That doesn’t mean it’s a bad idea to compare them, but it is crucial context for interpreting the comparison.

“Monte Carlo error” usually refers to the uncertainty in the MCMC estimates of posterior quantities, and provided the sampler is working well it is driven down by running longer chains. That makes it a poor metric for comparing models, but I wonder if you meant something else?

There are a lot of tools out there for model comparison, and which one to choose depends importantly on what you hope to learn from the comparison. Often we are interested in out-of-sample predictive error. Does this sound like it captures what you are interested in? If so, spatial cross-validation might be appropriate.
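
If you can get pointwise log-likelihood draws from both fits, the comparison itself can go through the loo package. Here is a rough sketch, not something from your setup: log_lik_bym and log_lik_bym2 are hypothetical draws-by-observations matrices, and chain_ids records which chain each draw came from.

library(loo)

# hypothetical S x N matrices of pointwise log-likelihood draws, one per model;
# chain_ids is an integer vector of length S giving the chain of each draw
r_eff_bym  <- relative_eff(exp(log_lik_bym),  chain_id = chain_ids)
r_eff_bym2 <- relative_eff(exp(log_lik_bym2), chain_id = chain_ids)

loo_bym  <- loo(log_lik_bym,  r_eff = r_eff_bym)
loo_bym2 <- loo(log_lik_bym2, r_eff = r_eff_bym2)

# the first row of the output is the model with the highest estimated ELPD
loo_compare(loo_bym, loo_bym2)

Keep in mind the caveats about spatial structure when interpreting such a comparison.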

Dear @jsocolar, thanks for your reply.
I understand what you mean; you are right.
I need to do validation on my model.
I used leave-one-out (LOO) cross-validation (CV) for a hierarchical model,
but I encountered this error. Can you guide me?

> Error in check_pars(allpars, pars) : no parameter log_lik

First I defined log_lik in my model like this, and I loaded the loo package via library(loo).

data {
  int<lower=0> N;
  int<lower=0> N_edges;
  int<lower=1, upper=N> node1[N_edges];  // node1[i] adjacent to node2[i]
  int<lower=1, upper=N> node2[N_edges];  // and node1[i] < node2[i]

  int<lower=0> y[N];             // count outcomes
  vector<lower=0>[N] E;          // exposure
  int<lower=1> K;                // num covariates
  matrix[N, K] x;                // design matrix

  real<lower=0> scaling_factor;  // scales the variance of the spatial effects
}
transformed data {
  vector[N] log_E = log(E);
}
parameters {
  real beta0;                    // intercept
  vector[K] betas;               // covariates

  real<lower=0> sigma;           // overall standard deviation
  real<lower=0, upper=1> rho;    // proportion unstructured vs. spatially structured variance

  vector[N] theta;               // heterogeneous effects
  vector[N] phi;                 // spatial effects
}
transformed parameters {
  vector[N] convolved_re;
  // variance of each component should be approximately equal to 1
  convolved_re = sqrt(1 - rho) * theta + sqrt(rho / scaling_factor) * phi;
}
model {
  y ~ poisson_log(log_E + beta0 + x * betas + convolved_re * sigma);  // co-variates

  // This is the prior for phi! (up to proportionality)
  target += -0.5 * dot_self(phi[node1] - phi[node2]);

  beta0 ~ normal(0.0, 1.0);
  betas ~ normal(0.0, 1.0);
  theta ~ normal(0.0, 1.0);
  sigma ~ normal(0, 1.0);
  rho ~ beta(0.5, 0.5);
  // soft sum-to-zero constraint on phi
  sum(phi) ~ normal(0, 0.001 * N);  // equivalent to mean(phi) ~ normal(0, 0.001)
}
generated quantities {
  vector[N] log_lik;
  for (n in 1:N) {
    log_lik[n] = poisson_log_lpmf(y[n] | log_E + beta0 + x[n] * betas + convolved_re * sigma);
  }
}

options(mc.cores = parallel::detectCores())
stan_fitw= stan(BYM2, data=dl, warmup=5000, iter=60000);
stan_fitw
print(stan_fitw, pars=c("beta0", "beta1", "rho", "sigma", "log_precision", "logit_rho", "mu[5]", "phi[5]", "theta[5]"), probs=c(0.025, 0.5, 0.975));
print(stan_fitw, pars=c("beta0", "beta1", "mu[5]", "phi[5]"), probs=c(0.025, 0.5, 0.975));
library("loo")

extract_log_lik(stan_fitw, parameter_name = "log_lik", merge_chains = TRUE)


@avehtari is it possible to do LOO with CAR-type models?


Without a reproducible example, we can only speculate about exactly what is going wrong, but the first thing that I notice is that in

vector[N] log_lik;
  for (n in 1:N) {
    log_lik[n] = poisson_log_lpmf(y[n] | log_E + beta0 + x[n] * betas + convolved_re * sigma);
  }

you seem to be missing an index [n] on convolved_re. Perhaps this propagates forward to the error that you see from the loo package.
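
As a separate sanity check on the error message itself (only a suggestion, reusing names from your own code): the fitted object has to come from a model whose generated quantities block actually defines log_lik, so the model must be recompiled and refit after that block is added. You can confirm the quantity is present before extracting it:

library(loo)

# does the compiled model behind this fit define log_lik at all?
"log_lik" %in% stan_fitw@model_pars

# if it does, this should return the draws without error
# (with merge_chains = FALSE the result is an iterations x chains x N array)
ll <- extract_log_lik(stan_fitw, parameter_name = "log_lik", merge_chains = FALSE)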

However, @maxbiostat is rightly raising a different point, which is that you probably shouldn’t expect PSIS-LOO to work for your model. For clarity, we can distinguish three cases:

  • PSIS-LOO where the held-out set is a single row and there are many rows per spatial region in your model. This is likely to work, but the code you’ve provided indicates that you have just one observation per region, so this is not your use case.
  • PSIS-LOO where the held-out set is an entire region, either because there is one data row per region or because out-of-sample prediction to new regions is of interest. PSIS-LOO is unlikely to perform well when holding out removes all of the data informing a random-effect parameter, and this is the situation that @maxbiostat is rightly thinking about.
  • Actual brute-force cross-validation, which will in general be valid even when the PSIS-LOO approximation fails. My guess is that this is what you will ultimately want to use here; a rough sketch follows after this list.
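
For concreteness, a brute-force leave-one-region-out loop could look roughly like this. Everything specific in it is an assumption rather than something from your post: it presumes a hypothetical modified model file bym2_holdout.stan that takes an integer holdout in its data block, drops y[holdout] from the likelihood while keeping that region’s random effects in the model (so the CAR prior is unchanged), and returns the held-out log density as a generated quantity log_lik_holdout.

library(rstan)

# hypothetical leave-one-region-out loop; dl is the data list from the post above
lpd <- numeric(dl$N)
for (i in seq_len(dl$N)) {
  fit_i <- stan("bym2_holdout.stan",              # assumed modified model file
                data = c(dl, list(holdout = i)),  # add the held-out region index
                warmup = 1000, iter = 2000)
  draws <- extract(fit_i, pars = "log_lik_holdout")$log_lik_holdout
  lpd[i] <- log(mean(exp(draws)))                 # log predictive density for region i
}
elpd_brute_force <- sum(lpd)

This is expensive (one full fit per region), which is the price of not relying on the PSIS approximation.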

There’s a nice discussion of LOO in the context of spatial models here: Spatial model selection with loo

Dear @jsocolar, I did what you said, but I did not get the result. I also checked the linked discussion and did not find my answer there. After a little searching and thinking, I have come to the conclusion that I should consider a specific parameter for the implementation, for example beta1 and sigma, but I don’t know how to write the corresponding code. I hope I was able to get my point across. Maybe I am wrong, I do not know.

@cmcd, please take a look at this problem.

Yes, it is possible to use LOO with CAR models.

But simple PSIS-LOO is likely not to perform well. It is possible to improve the performance by using quadrature to integrate out those random effects, as demonstrated in the Roaches cross-validation demo (see the section “Poisson model with ‘random effects’ and integrated LOO”).
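
To make that idea a bit more concrete, here is a rough sketch (not taken from the linked demo) of the per-draw integrated likelihood for one held-out count, assuming the combined random effect for that observation is conditionally normal given the other regions. The names eta_fixed, re_mean, re_sd, S, N, and chain_ids are all hypothetical quantities that you would have to compute from the posterior draws and the CAR structure.

library(loo)

# per-draw likelihood for one held-out count, with its random effect integrated
# out by 1-D quadrature over a conditionally normal effect (hypothetical inputs)
integrated_lik <- function(y_i, eta_fixed, re_mean, re_sd) {
  integrand <- function(u) dpois(y_i, lambda = exp(eta_fixed + u)) *
                           dnorm(u, mean = re_mean, sd = re_sd)
  integrate(integrand, lower = re_mean - 10 * re_sd,
            upper = re_mean + 10 * re_sd)$value
}

# fill an S x N matrix of log integrated likelihoods and pass it to PSIS-LOO
log_lik_int <- matrix(NA_real_, nrow = S, ncol = N)
for (s in 1:S) {
  for (i in 1:N) {
    log_lik_int[s, i] <- log(integrated_lik(y[i], eta_fixed[s, i],
                                            re_mean[s, i], re_sd[s, i]))
  }
}
loo(log_lik_int, r_eff = relative_eff(exp(log_lik_int), chain_id = chain_ids))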

In the case of CAR, this has to be implemented by keeping the random effect for the left-out observation in the model, as just removing it would change the CAR prior.

EDIT: fixed the sentence


In this case, the observations are Poisson distributed and so cannot be made parameters, but I think it should be possible to include the underlying random effects as parameters without changing the CAR prior and then just omit the data likelihood.


Yes! Fixed my post, too (I guess I’m too tired to focus)


I reached my goal. Thank you both, @jsocolar and @avehtari.
