Hello Stan users and experts,
I have received a lot of helpful advice from the Stan community before, and I now have a rough version of my model. But when I fit the model using rstan, the following warning appears:
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: validate transformed params: proc[i_0__][i_1__][2] is 12.2138, but must be less than or equal to 6.90776 (in ‘model30047c5d3740_1021’ at line 31)
My model is as follows. The lines related to this warning are marked with ** in the comments, and my questions are below the model.
data{
int I;
int T;
int J;
int y[T,J,I];
real log_max;
}
parameters{
real mu_U;
real <lower=0> tau_U;
vector[I-1] eta_U[J];
real mu_intra;
real <lower=0> tau_intra;
vector[I-1] eta_intra[J];
real mu_inter;
real <lower=0> tau_inter;
vector[I-1] eta_inter[J];
vector<upper=log_max>[I-1]log_A[T,J];
vector<lower = 0>[I - 1] sigma[J];
}
transformed parameters{
//set up parameters
vector[I - 1] U[J];
vector[I - 1] inter[J];
vector[I - 1] intra[J];
vector<upper=log_max>[I]log_A_3[T,J];
vector<lower=0,upper=1>[I]theta[T,J];
vector<upper=log_max>[I-1]proc[T-1,J]; //**mean of the process model.**
matrix[2, 2] B[J];
real <upper=log_max> summ[T,J];
//for dynamics parameters
for (j in 1:J){
U[j]=mu_U+tau_U*eta_U[j];
inter[j]=mu_inter+tau_inter*eta_inter[j];
intra[j]=mu_intra+tau_intra*eta_intra[j];
B[j, 1, 1] = intra[j, 1];
B[j, 2, 2] = intra[j, 2];
B[j, 1, 2] = inter[j, 1];
B[j, 2, 1] = inter[j, 2];
}
//process term
for (t in 2:T){
for (j in 1:J){
proc[t-1,j]=U[j]+B[j]*log_A[t-1,j];//**equation of the mean**
}
}
//for coverage
for (t in 1:T){
for (j in 1:J){
summ[t,j]=log_sum_exp(log_A[t,j]);
log_A_3[t,j,1]=log_A[t,j,1];
log_A_3[t,j,2]=log_A[t,j,2];
log_A_3[t,j,3]=log_diff_exp(log_max,summ[t,j]);
theta[t,j]=softmax(log_A_3[t,j]);
}
}
}
model{
//prior
mu_U~normal(0,5);
mu_inter~normal(0,5);
mu_intra~normal(0,5);
tau_U~normal(0,5);
tau_intra~normal(0,5);
tau_inter~normal(0,5);
for (j in 1:J){
eta_U[j]~normal(0,5);
eta_intra[j]~normal(0,5);
eta_inter[j]~normal(0,5);
sigma[j,1] ~ cauchy(0, 5);
sigma[j,2] ~ cauchy(0, 5);
}
//process model
//**process model regarding proc**
for (t in 2:T){
for (j in 1:J){
target += multi_normal_lpdf(log_A[t, j] |proc[t-1,j], diag_matrix(sigma[j]));
}
}
// For t = 1
for (j in 1:J) {
target += uniform_lpdf(log_A[1, j] | 0, log_max);
}
//observational model
for (t in 1:T) {
for (j in 1:J) {
y[t, j,] ~ multinomial(theta[t, j,]);
}
}
}
My first question:
What causes this warning? Stan still runs warmup and sampling after it appears.
I wrote my model by imitating the eight-schools model, especially this part:
U[j]=mu_U+tau_U*eta_U[j];
inter[j]=mu_inter+tau_inter*eta_inter[j];
intra[j]=mu_intra+tau_intra*eta_intra[j];
I think I have given a reasonable bound for each parameter (especially for log_A and the transformed parameters related to it), but I do not know whether these bounds are the cause of the warning. I am also not confident that I have chosen the right priors.
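One way to check whether the bounds are involved is to mimic Stan's default initialization outside of Stan. The sketch below is only an illustration, not part of the model. It assumes the default random inits (uniform on (-2, 2) on the unconstrained scale), takes log_max = 6.90776 from the error message, and looks at a single (t, j) cell with I = 3. The function names simulate_init and rejection_rate are made up for this example. It builds U, B and proc as in the transformed parameters block and counts how often some element of proc exceeds log_max, which is what the <upper=log_max> constraint on proc checks.

```python
import math
import random

# Upper bound reported in the error message (assumed to be the log_max passed as data).
LOG_MAX = 6.90776


def draw_unconstrained(rng):
    """Stan's default random init: uniform(-2, 2) on the unconstrained scale."""
    return rng.uniform(-2.0, 2.0)


def simulate_init(rng):
    """Draw one set of initial values for a single (t, j) cell and return proc."""
    # Unbounded parameters are initialized directly on (-2, 2).
    mu_U = draw_unconstrained(rng)
    mu_intra = draw_unconstrained(rng)
    mu_inter = draw_unconstrained(rng)
    eta_U = [draw_unconstrained(rng) for _ in range(2)]
    eta_intra = [draw_unconstrained(rng) for _ in range(2)]
    eta_inter = [draw_unconstrained(rng) for _ in range(2)]
    # <lower=0> parameters are mapped back with exp(u).
    tau_U = math.exp(draw_unconstrained(rng))
    tau_intra = math.exp(draw_unconstrained(rng))
    tau_inter = math.exp(draw_unconstrained(rng))
    # <upper=log_max> parameters are mapped back with log_max - exp(u).
    log_A = [LOG_MAX - math.exp(draw_unconstrained(rng)) for _ in range(2)]

    # Same construction as in the transformed parameters block.
    U = [mu_U + tau_U * e for e in eta_U]
    intra = [mu_intra + tau_intra * e for e in eta_intra]
    inter = [mu_inter + tau_inter * e for e in eta_inter]
    B = [[intra[0], inter[0]],
         [inter[1], intra[1]]]
    # proc = U + B * log_A
    return [U[k] + B[k][0] * log_A[0] + B[k][1] * log_A[1] for k in range(2)]


def rejection_rate(n_draws=10000, seed=1):
    """Fraction of simulated inits where some element of proc exceeds LOG_MAX."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_draws):
        if any(p > LOG_MAX for p in simulate_init(rng)):
            rejected += 1
    return rejected / n_draws


if __name__ == "__main__":
    print(f"fraction of inits violating the bound on proc: {rejection_rate():.2f}")
```

If a noticeable fraction of the simulated draws violates the bound, the rejection would seem to come from the declared constraint on proc rather than from the priors. As far as I understand, Stan retries with fresh initial values after such a rejection, which might be why warmup and sampling still start.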
My second question:
Before this model, I wrote another model that used a non-centered parameterization. Although that model could be fitted, there were 151 divergent transitions after warmup (iter=1000). Does this proportion of divergent transitions mean that the results are completely unreliable, or are they still roughly usable?
Because of the divergent transitions in that earlier model, I recently wrote the model above, but the results seem even worse.
I hope someone can help me. Thank you in advance.
YAO