How to write a right Multinomial distribution correctly?

Y.YAO · February 9, 2020, 6:54am

Hello all, this is the first time I using Stan, I want to ask some naïve questions about Multinomial distribution.

My data like this form

Ob1 Ob2 Ob3
16 17 7
10 20 10
…

which there are T times data set with three objects that each time’s total is 40.
Therefore I want to use multinomial distribution to estimate parameters.
My process model is
Y_t = U + B * Y_t-1 + W
where Y is time t’s data, which should be a 31 matrix mathematically.
And U is a 31 parameter vector, B is a 33 parameter matrix.
W is the 31 error term.

My stan code as follows

functions{

  matrix B_r(matrix B){
  matrix[3,3] B_r;  
  B_r[1,1]=B[1,1]; 
  B_r[1,2]=B[1,2]; 
  B_r[2,1]=B[2,1]; 
  B_r[2,2]=B[2,2];
  B_r[1,3]=0;
  B_r[2,3]=0;
  B_r[3,3]=0;
  B_r[3,2]=0;
  B_r[3,2]=0;
  return B_r;
  }
  
  vector U_r(matrix U){
    vector[3] U_r;
    U_r[1]=U[1,1];
    U_r[2]=U[2,1];
    U_r[3]=0;
    return U_r;
  }
   
}

data{
  int<lower=0> T;
  int<lower=0> Sp;
  int<lower=0> y[T,Sp];
  real min_log_y; 
  real max_log_y;
}

parameters{
  
  vector<lower = min_log_y, upper = max_log_y>[T] log_y[3];
  matrix<lower=0>[1,3] U;
  matrix[3,3] B;
  matrix[3,1] sigma;
}

transformed parameters{

  vector[3] P;
  P[1]=y[T,1];
  P[2]=y[T,2];
  P[3]=y[T,3];
  
}


model{
    // Data Model ===================================
    for (t in 1:T){
              y[t] ~ multinomial(P/40);
    }
    // End of Data Model ===============================
    
    // Process model============================
    
    for (t in 2:T){
      log_y[t,]~ multi_normal(U_r(U) + B_r(B)*log_y[t-1], sigma);
    }
    
    sigma[,1] ~ uniform(0,1000);
    
    U[,1] ~ normal(0,100);
    
    
    //End of process model
}


generated quantities{

  
  vector [2] U_op;
  matrix [2,2] B_op;
  matrix [2,1] sigma_op;
  
  U_op[1] = U_r(U)[1];
  U_op[2] = U_r(U)[2];
  B_op[1,1] = (1-B_r(B)[1,1])/U_op[1];
  B_op[1,2] = -B_r(B)[1,2]/U_op[1];
  B_op[2,2] = (1-B_r(B)[2,2])/U_op[2];
  B_op[2,1] = -B_r(B)[2,1]/U_op[2];
  
  sigma_op = sigma;
}

Then I use rstan to fit this model, the code i used is

fit <- stan(file = “0208.stan”, data = MB2_3, pars = c(“U”,“B”,“sigma”), warmup = warmup, iter = iter, chains = 3, thin = thin)

But there is a error occurred as follows

here are whatever error messages were returned [[1]] Stan model ‘0208’ does not contain samples.

I am sorry for my ignorance. If possible please give me some advice, thank you very much!

Max_Mantei · February 15, 2020, 9:28pm

Hi! Welcome to the Stan Forums! Sorry for the long wait to get this reply.

Can you please run with chains = 1 and paste the error message that you get here?

Uniform priors on standard deviation parameters is not recommended in Stan. You could start with exponential or half-Normal, half-Student-t (by declaring <lower=0> in parameters block and giving normal or t priors). Also, sigma should be a 3 \times 3 matrix when entering multi_normal(..., sigma). If you are not modelling correlations, i.e. when sigma is a diagonal matrix, you are better off modelling these as independent normals (in that case I guess you need to pre-compute B_r(B)*log_y[t-1]).

The prior on U is also very wide, especially since you are working on the log scale of the outcome data! Also you’d want to specify priors for B as well.

You should probably define log_y in the transformed data block, right?

I hope these thoughts were helpful. Please let me know how its going! :)

Cheers,
Max

Y.YAO · February 16, 2020, 3:37am

Thanks for your help!
I have found that there are many mistakes in this script and I am recoding a new one. This is a uncomplete version.

data{
    int<lower=0> y[18, 3];
  //above for data model
    matrix [18,2] log_y;
    row_vector[2] mu_u;
    vector<lower=0>[2] sigma_u;
  
  //above for process model
  
}

parameters{
  simplex[3] theta[18];
  //above for data model
  vector<lower = 0>[2] sigma;
  matrix<lower = 0>[1,2] U;
  matrix[2,2] B;
  
  //above for process model
  
}

transformed parameters{
  matrix [18,2]log_y;
  for (i in 1:18){
    for (j in 1:2){
      log_y[i,j]=log(theta[i,j]*40);
    }
  }
}


model{
   //data model
  for (i in 1:18) {
        y[i,] ~ multinomial(theta[i]);
    } 
  //end of data model

  //process model
  
  for (t in 2:18){
    log_y[t,]~ multi_normal( U[1,] + log_y[t-1,]*B ,diag_matrix(sigma));
    
  }
  
  U[1,]~multi_normal(mu_u,diag_matrix(sigma_u));
  B[1,]~multi_normal(mu_u,diag_matrix(sigma_u));
  B[2,]~multi_normal(mu_u,diag_matrix(sigma_u));

 
}

And I have some new questions.

Firstly, the matrix B follow a multi_normal distribution, but in stan I don’t find which kind of normal distribution a matrix could follow. Therefore, I make each row vector of matrix B follow multi_normal, will it work?

Moreover, I want to ask that if I use the log scale outcome of a parameter in first model as data in second model, I got the follow message:

DIAGNOSTIC(S) FROM PARSER: Info: Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable. If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform. Left-hand-side of sampling statement: log_y[t, :] ~ multi_normal(…)

what should I do? Does it mean that I need to write a new target+ under the sample statement, or rewrite a multi_normal_lpdf only. I tried to write both two way, but I got them illegal.

Thanks again for your help!

Max_Mantei · February 19, 2020, 10:54am

Hey!

There is no matrix normal distribution in Stan, although you could build it yourself. However, if you want to give the entries in B independent normal distributions, you could just write to_vector(B) ~ normal(...,...) as your prior. You gain nothing using something like B[1,] ~ multi_normal(mu_u,diag_matrix(sigma_u)), because the diagonal matrix implies they are the entries in B are not correlated. Using univariate normals thus would speed up and simplify your model.

You declared log_y as transformed parameter, but you really should declare it as transformed data. If you put a prior on a transformed parameter Stan thinks you might have a change of variable and need to apply a Jacobian correction. But in your case you just transform the data, so the warning should go away when log_y is declared in the transformed data block.

Cheers!
Max

Y.YAO · February 19, 2020, 12:38pm

Thank you Max!
Your advices are really helpful. And I have another question about valid simplex.

My data set y follows the multinomial distribution, but I just require two of three rows in it.
My final goal is that separate observation error in y, and estimates the parameters U ,B and process error sigma. So I want to have a feedback in my model.

My new code as follow,

data{
int<lower=0> y[3,18];
real log_max;
}

parameters{
vector<lower = 0>[2] sigma;
matrix<upper=log_max>[2,18]log_y;
vector<lower = 0>[2] U;
matrix[2,2] B;
}

transformed parameters{

simplex[3] theta[18];

for (i in 1:18){
theta[i,1]=exp(log_y[1,i])/exp(log_max);
theta[i,2]=exp(log_y[2,i])/exp(log_max);
theta[i,3]=(exp(log_max)-(exp(log_y[1,i])+exp(log_y[2,i])))/exp(log_max);
}

}

model{
for (i in 1:18) {
y[,i] ~ multinomial(theta[i]);
}

for(t in 1:18){
target+= multi_normal_lpdf(log_y[,t] |U+B*log_y[,t-1],diag_matrix(sigma));
}
//parameters
B[1,1]~normal(0,100);
B[2,2]~normal(0,100);
B[1,2]~normal(0,100);
B[2,1]~normal(0,100);

for (i in 1:2){
U[i]~normal(0,100);
}

for(i in 1:2){
sigma[i]~normal(0,10);
}
}

Mathematically, I think it will work. However, when I running it, this error occurred.

List item

Chain 1: Rejecting initial value: Chain 1: Error evaluating the log probability at the initial value. Chain 1: Exception: validate transformed params: theta[i_0__] is not a valid simplex. sum(theta[i_0__]) = 0.4409792075, but should be 1 (in ‘model5fc4cb65d0d_0219’ at line 15)

I think it was caused by sampling deviation, the sum of random sample could not be 1. I am not sure.
If possible, please tell me what should I do.

sincerely

Max_Mantei · February 19, 2020, 3:52pm

Hey! Did you already check the users manual? I’d suggest you start simple before adding the auto-regressive term to get a model with only intercepts working.

Topic		Replies	Views
Reparameterization of a multinomial distribution Modeling	2	1069	April 9, 2019
Multinomial distribution Modeling	3	829	August 12, 2020
Multinomial distribution for modelling Causal Models with all categorical variables Modeling techniques , fitting-issues	2	743	February 24, 2020
Multinomial parameter in stan Modeling specification	3	133	July 19, 2024
Specifying a Multinomial Logistic Regression with Unpooled Estimates Modeling pystan , specification , multinomial-response	3	1174	November 29, 2021

How to write a right Multinomial distribution correctly?

Related topics