I’m defining a really simple Bayesian model, with ikelihood function a Beta distribution parametrized in terms of mean \mu = a /(a+b) and size of sample s = a+ b, if a and b are a standard Beta distribution parameters.
I provide the data in R
p1 = c(23.5, 23.0, 23.1, 23.1, 23.5, 23.0, 21.8, 22.5, 23.7, 24.0)/100
p2 = c(NA, NA, NA, NA, NA, 16.7, 16.9, 18.2, 19.0, 18.9)/100
p3 = c(17.94, 18.51, 18.74, 18.33, 19.57, 17.31, 17.81, 19.08, 19.66,19.73)/100
p4 = c(18.0, 17.6, 21.0, 20.8, 21.2, 19.5, 20.8, 20.8, 24.9, NA)/100
TT = 10
N = 4
p = rbind(p1, p2, p3, p4)
I have N rows and T observations for each row. There are cells which have NA however I treat them in the rstan code hopefully correctly. So, my rstan code is the following
stan_model_code = "
data {
int<lower=1> TT;
int<lower=1> N;
matrix<lower=0,upper=1>[N, TT] p;
}
parameters {
matrix<lower=0,upper=1>[N, TT] mu_t;
matrix<lower=0>[N, TT] s;
}
transformed parameters {
matrix<lower=0>[N,TT] a_t;
matrix<lower=0>[N,TT] b_t;
for(n in 1:N){
for(t in 1:TT){
if(p[n,t]!=0){
a_t[n,t] = mu_t[n,t] * s[n,t];
b_t[n,t] = (1-mu_t[n,t]) * s[n,t];
}else{
a_t[n,t] = 0;
b_t[n,t] = 0;
}
}
}
}
model {
for(n in 1:N){
for(t in 1:TT){
if(p[n,t]!=0){
target += beta_lpdf(mu_t[n,t] | 1, 1);
target += beta_lpdf(p[n,t] | a_t[n,t],b_t[n,t]);
target += normal_lpdf(s[n,t] |0,1);
}
}
}
}
"
library(rstan)
p[is.na(p)] = 0
s[is.na(s)] = 0
stan_model <- stan_model(model_code = stan_model_code)
fit <- sampling(stan_model, data = list(TT = TT, N = N, p = p), iter = 5000,
chains = 2)
The model looks simple, however, when I run it, the results go crazy :D
Warning messages:
1: There were 5000 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
2: Examine the pairs() plot to diagnose sampling problems
3: The largest R-hat is NA, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess
The Rhat is NA, it is because of the IF statement that I use in the rstan code to treat the values p[n,t]==0
, i.e. to exclude cases which correspond to NA. Is it clear to anyone what this piece of code has such a weird behaviour?