How to write a right Multinomial distribution correctly?

Hello all, this is the first time I using Stan, I want to ask some naïve questions about Multinomial distribution.

My data like this form

Ob1 Ob2 Ob3
16 17 7
10 20 10

which there are T times data set with three objects that each time’s total is 40.
Therefore I want to use multinomial distribution to estimate parameters.
My process model is
Y_t = U + B * Y_t-1 + W
where Y is time t’s data, which should be a 31 matrix mathematically.
And U is a 3
1 parameter vector, B is a 33 parameter matrix.
W is the 3
1 error term.

My stan code as follows

functions{

  matrix B_r(matrix B){
  matrix[3,3] B_r;  
  B_r[1,1]=B[1,1]; 
  B_r[1,2]=B[1,2]; 
  B_r[2,1]=B[2,1]; 
  B_r[2,2]=B[2,2];
  B_r[1,3]=0;
  B_r[2,3]=0;
  B_r[3,3]=0;
  B_r[3,2]=0;
  B_r[3,2]=0;
  return B_r;
  }
  
  vector U_r(matrix U){
    vector[3] U_r;
    U_r[1]=U[1,1];
    U_r[2]=U[2,1];
    U_r[3]=0;
    return U_r;
  }
   
}

data{
  int<lower=0> T;
  int<lower=0> Sp;
  int<lower=0> y[T,Sp];
  real min_log_y; 
  real max_log_y;
}

parameters{
  
  vector<lower = min_log_y, upper = max_log_y>[T] log_y[3];
  matrix<lower=0>[1,3] U;
  matrix[3,3] B;
  matrix[3,1] sigma;
}

transformed parameters{

  vector[3] P;
  P[1]=y[T,1];
  P[2]=y[T,2];
  P[3]=y[T,3];
  
}


model{
    // Data Model ===================================
    for (t in 1:T){
              y[t] ~ multinomial(P/40);
    }
    // End of Data Model ===============================
    
    // Process model============================
    
    for (t in 2:T){
      log_y[t,]~ multi_normal(U_r(U) + B_r(B)*log_y[t-1], sigma);
    }
    
    sigma[,1] ~ uniform(0,1000);
    
    U[,1] ~ normal(0,100);
    
    
    //End of process model
}


generated quantities{

  
  vector [2] U_op;
  matrix [2,2] B_op;
  matrix [2,1] sigma_op;
  
  U_op[1] = U_r(U)[1];
  U_op[2] = U_r(U)[2];
  B_op[1,1] = (1-B_r(B)[1,1])/U_op[1];
  B_op[1,2] = -B_r(B)[1,2]/U_op[1];
  B_op[2,2] = (1-B_r(B)[2,2])/U_op[2];
  B_op[2,1] = -B_r(B)[2,1]/U_op[2];
  
  sigma_op = sigma;
}

Then I use rstan to fit this model, the code i used is

fit <- stan(file = “0208.stan”, data = MB2_3, pars = c(“U”,“B”,“sigma”), warmup = warmup, iter = iter, chains = 3, thin = thin)

But there is a error occurred as follows

here are whatever error messages were returned [[1]] Stan model ‘0208’ does not contain samples.

I am sorry for my ignorance. If possible please give me some advice, thank you very much!

Hi! Welcome to the Stan Forums! Sorry for the long wait to get this reply.

Can you please run with chains = 1 and paste the error message that you get here?

Uniform priors on standard deviation parameters is not recommended in Stan. You could start with exponential or half-Normal, half-Student-t (by declaring <lower=0> in parameters block and giving normal or t priors). Also, sigma should be a 3 \times 3 matrix when entering multi_normal(..., sigma). If you are not modelling correlations, i.e. when sigma is a diagonal matrix, you are better off modelling these as independent normals (in that case I guess you need to pre-compute B_r(B)*log_y[t-1]).

The prior on U is also very wide, especially since you are working on the log scale of the outcome data! Also you’d want to specify priors for B as well.

You should probably define log_y in the transformed data block, right?

I hope these thoughts were helpful. Please let me know how its going! :)

Cheers,
Max

Thanks for your help!
I have found that there are many mistakes in this script and I am recoding a new one. This is a uncomplete version.

data{
    int<lower=0> y[18, 3];
  //above for data model
    matrix [18,2] log_y;
    row_vector[2] mu_u;
    vector<lower=0>[2] sigma_u;
  
  //above for process model
  
}

parameters{
  simplex[3] theta[18];
  //above for data model
  vector<lower = 0>[2] sigma;
  matrix<lower = 0>[1,2] U;
  matrix[2,2] B;
  
  //above for process model
  
}

transformed parameters{
  matrix [18,2]log_y;
  for (i in 1:18){
    for (j in 1:2){
      log_y[i,j]=log(theta[i,j]*40);
    }
  }
}


model{
   //data model
  for (i in 1:18) {
        y[i,] ~ multinomial(theta[i]);
    } 
  //end of data model

  //process model
  
  for (t in 2:18){
    log_y[t,]~ multi_normal( U[1,] + log_y[t-1,]*B ,diag_matrix(sigma));
    
  }
  
  U[1,]~multi_normal(mu_u,diag_matrix(sigma_u));
  B[1,]~multi_normal(mu_u,diag_matrix(sigma_u));
  B[2,]~multi_normal(mu_u,diag_matrix(sigma_u));

 
}

And I have some new questions.

Firstly, the matrix B follow a multi_normal distribution, but in stan I don’t find which kind of normal distribution a matrix could follow. Therefore, I make each row vector of matrix B follow multi_normal, will it work?

Moreover, I want to ask that if I use the log scale outcome of a parameter in first model as data in second model, I got the follow message:

  • DIAGNOSTIC(S) FROM PARSER: Info: Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable. If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform. Left-hand-side of sampling statement: log_y[t, :] ~ multi_normal(…)

what should I do? Does it mean that I need to write a new target+ under the sample statement, or rewrite a multi_normal_lpdf only. I tried to write both two way, but I got them illegal.

Thanks again for your help!

Hey!

There is no matrix normal distribution in Stan, although you could build it yourself. However, if you want to give the entries in B independent normal distributions, you could just write to_vector(B) ~ normal(...,...) as your prior. You gain nothing using something like B[1,] ~ multi_normal(mu_u,diag_matrix(sigma_u)), because the diagonal matrix implies they are the entries in B are not correlated. Using univariate normals thus would speed up and simplify your model.

You declared log_y as transformed parameter, but you really should declare it as transformed data. If you put a prior on a transformed parameter Stan thinks you might have a change of variable and need to apply a Jacobian correction. But in your case you just transform the data, so the warning should go away when log_y is declared in the transformed data block.

Cheers!
Max

Thank you Max!
Your advices are really helpful. And I have another question about valid simplex.

My data set y follows the multinomial distribution, but I just require two of three rows in it.
My final goal is that separate observation error in y, and estimates the parameters U ,B and process error sigma. So I want to have a feedback in my model.

My new code as follow,

data{
int<lower=0> y[3,18];
real log_max;
}

parameters{
vector<lower = 0>[2] sigma;
matrix<upper=log_max>[2,18]log_y;
vector<lower = 0>[2] U;
matrix[2,2] B;
}

transformed parameters{

simplex[3] theta[18];

for (i in 1:18){
theta[i,1]=exp(log_y[1,i])/exp(log_max);
theta[i,2]=exp(log_y[2,i])/exp(log_max);
theta[i,3]=(exp(log_max)-(exp(log_y[1,i])+exp(log_y[2,i])))/exp(log_max);
}

}

model{
for (i in 1:18) {
y[,i] ~ multinomial(theta[i]);
}

for(t in 1:18){
target+= multi_normal_lpdf(log_y[,t] |U+B*log_y[,t-1],diag_matrix(sigma));
}
//parameters
B[1,1]~normal(0,100);
B[2,2]~normal(0,100);
B[1,2]~normal(0,100);
B[2,1]~normal(0,100);

for (i in 1:2){
U[i]~normal(0,100);
}

for(i in 1:2){
sigma[i]~normal(0,10);
}
}

Mathematically, I think it will work. However, when I running it, this error occurred.

  • List item

Chain 1: Rejecting initial value: Chain 1: Error evaluating the log probability at the initial value. Chain 1: Exception: validate transformed params: theta[i_0__] is not a valid simplex. sum(theta[i_0__]) = 0.4409792075, but should be 1 (in ‘model5fc4cb65d0d_0219’ at line 15)

I think it was caused by sampling deviation, the sum of random sample could not be 1. I am not sure.
If possible, please tell me what should I do.

sincerely

Hey! Did you already check the users manual? I’d suggest you start simple before adding the auto-regressive term to get a model with only intercepts working.