Hello Stan users and experts,
I have a vector that follows a multinomial distribution in the observation model, and I need part of this vector to enter the process model. Therefore, I want to create two vectors. One is the vector used in the process model (log_A in the Stan code below), where each element is less than a constant (log_max in the Stan code) and the elements, once exponentiated, sum to less than exp(log_max). The other is the vector used in the observation model (log_A_3 in the Stan code), where the exponentiated elements sum to exactly exp(log_max) (I know this can be done with a simplex).
My script is as follows:
data {
  int I; // 3
  int T; // 18
  int J; // 43
  int<lower=0> y[T, J, I];
  real log_max; // log(1000)
  //etc.....
}
parameters {
  vector<upper = log_max>[I - 1] log_A[T, J];
  //etc.....
}
transformed parameters {
  vector<upper = log_max>[I] log_A_3[T, J];
  simplex[I] theta[T, J];
  for (t in 1:T) {
    for (j in 1:J) {
      log_A_3[t, j, 1] = log_A[t, j, 1];
      log_A_3[t, j, 2] = log_A[t, j, 2];
      // remainder on the natural scale; the log() argument is negative when
      // exp(log_A[t, j, 1]) + exp(log_A[t, j, 2]) > exp(log_max)
      log_A_3[t, j, 3] = log(exp(log_max) - exp(log_A_3[t, j, 1]) - exp(log_A_3[t, j, 2]));
    }
  }
  for (t in 1:T) {
    for (j in 1:J) {
      theta[t, j] = softmax(log_A_3[t, j]); // expression for theta
    }
  }
  //etc.....
}
model {
  for (t in 1:T) {
    for (j in 1:J) {
      y[t, j, ] ~ multinomial(theta[t, j, ]);
    }
  }
  for (t in 2:T) { // For t > 1
    for (j in 1:J) {
      target += multi_normal_lpdf(log_A[t, j] | U[j] + B[j] * log_A[t - 1, j], diag_matrix(sigma));
    }
  }
  for (j in 1:J) { // For t = 1
    target += uniform_lpdf(log_A[1, j] | -1E6, log(exp(log_max) / 2));
    //etc.....
  }
}
My problem is that the fit produces many divergent transitions after warmup. I think the cause is that, for some sampled values of log_A, the argument of log() in log_A_3[, 3] becomes negative, so log_A_3[, 3] is undefined. Hence, I want to find a way to guarantee exp(log_A[,1]) + exp(log_A[,2]) < exp(log_max), in addition to log_A[,1] < log_max and log_A[,2] < log_max.
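A possible way to enforce this by construction is sketched below. It is a minimal sketch under stated assumptions and has not been tested on this model. The idea is to declare the simplex theta as the parameter and derive both log-scale vectors from it, so that log() never receives a negative argument. It assumes the intended constraint is on the natural scale (the exponentiated elements of log_A_3 sum to exp(log_max)), keeps the array syntax of the script above (recent Stan releases write array[T, J] simplex[I] theta; instead), and leaves out the process-model terms.

```stan
data {
  int<lower=2> I;
  int<lower=1> T;
  int<lower=1> J;
  int<lower=0> y[T, J, I];
  real log_max;
}
parameters {
  // proportions; each theta[t, j] is positive and sums to 1 by construction
  simplex[I] theta[T, J];
}
transformed parameters {
  // log-scale vector for the observation model:
  // sum(exp(log_A_3[t, j])) equals exp(log_max), every element is below log_max
  vector[I] log_A_3[T, J];
  // first I - 1 elements, intended for the process model
  vector[I - 1] log_A[T, J];
  for (t in 1:T) {
    for (j in 1:J) {
      log_A_3[t, j] = log_max + log(theta[t, j]);
      log_A[t, j] = head(log_A_3[t, j], I - 1);
    }
  }
}
model {
  for (t in 1:T) {
    for (j in 1:J) {
      y[t, j] ~ multinomial(theta[t, j]);
    }
  }
  // process-model terms on log_A would go here (omitted in this sketch)
}
```

Because the process model would then place a density on log_A, which is a nonlinear transform of theta, a Jacobian adjustment is presumably needed, for example target += -sum(log(head(theta[t, j], I - 1))); for each t and j where log_A[t, j] receives a density. This has not been verified in the full model.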
Thank you in advance