# Rstan - generate multinomial distributions (ragged vectors)

I’m a complete novice in RStan and am trying to generate multiple vectors from multinomial distributions. Since these vectors have different lengths (they are ragged), which RStan does not support to the best of my knowledge, I bind all of them into one long vector.

Here is the R code I currently use to run the Stan file:

``````
library(rstan)

n <- 50
numInst_train <- sample(10:20, size = n, replace = TRUE)  # instances per bag

rstanfit <- stan(
  file = "rcode/MultiBayes.v1.stan",
  data = list(
    n = n,
    numInst = numInst_train,
    m = sum(numInst_train)
  ),
  chains = 1,
  iter = 10,
  init = list(chain1 = list(
    hp_pi = rep(1 / numInst_train, numInst_train),
    temp = rep(0, sum(numInst_train))
  ))
)
``````

and this is my Stan file:

``````
data {
  // information of input
  int n; // the number of samples
  int numInst[n]; // the number of instances in a bag
  int m; // Total number of instances
}
parameters{
  vector[m] temp;
  vector[m] hp_pi;
}
model{
  int delta[m]; // indicator of primary instances
  int pos = 1;
  for(jj in 1:n){
    segment(delta, pos, numInst[jj]) ~ multinomial(segment(hp_pi, pos, numInst[jj]));
    pos += numInst[jj];
  }
  for(jj in 1:m){
    temp[jj] ~ normal(delta[jj], 1);
  }
}
``````

Running this gives the following error message:

``````
Error evaluating the log probability at the initial value.
Exception: multinomial_lpmf: Number of trials variable is -2147483648, but must be >= 0!  (in 'model630cdd604c_MultiBayes' at line 15)
``````

I’m using R 3.5.0 on Windows with Stan 2.17.0.

I would appreciate any input.

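You are using `delta` in the model block before its elements have been filled in. Stan leaves an undefined local integer at the smallest representable value, which is the -2147483648 reported in the error.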

Hi Ben Goodrich, I appreciate your reply! I tried declaring `delta` in the `parameters` block, but that gives an error because integer variables cannot be declared there. If I instead declare `delta` there as a vector, the multinomial statement becomes invalid, since only `integer ~ multinomial()` is correct grammar. How do you think I can handle this?

Your thought process is not consistent with Stan’s language or algorithms. The Stan language does not allow you to declare an integer unknown in the `parameters` block because the NUTS algorithm requires the posterior kernel to be differentiable with respect to all the unknowns, and you cannot differentiate with respect to an integer. So, trying to evade that error message by defining it as a `vector` (of real numbers), or by defining it as an integer in the `model` block, does not overcome the fact that it is impossible for NUTS to draw from the posterior distribution you have in mind.

The actual solution is to marginalize out the discrete unknowns so that the posterior distribution NUTS is drawing from actually is differentiable with respect to the remaining parameters. Then, if you want, you can draw from the full conditional distribution of the discrete unknowns in the `generated quantities` block. There is a whole chapter on this in the manual.
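For anyone finding this later, here is a minimal sketch of what that could look like for this model. It rests on assumptions that are not in the original post: each bag has exactly one primary instance (a single multinomial trial), the values called `temp` in the question are observed and passed in as data under the hypothetical name `y`, and the within-bag probabilities come from a softmax over a hypothetical unconstrained vector `eta`. Summing over the possible positions of the primary instance with `log_sum_exp` removes `delta` from the model, and the `generated quantities` block draws the position (hypothetical name `primary`) from its full conditional.

```stan
data {
  int<lower=1> n;            // number of bags
  int<lower=1> numInst[n];   // number of instances in each bag
  int<lower=1> m;            // total number of instances, sum(numInst)
  vector[m] y;               // ASSUMPTION: observed values ('temp' in the question)
}
parameters {
  vector[m] eta;             // ASSUMPTION: unconstrained scores, softmaxed within each bag
}
model {
  int pos = 1;
  eta ~ normal(0, 1);        // weakly identifies the softmax scores
  for (i in 1:n) {
    int K = numInst[i];
    vector[K] y_i = segment(y, pos, K);
    vector[K] lp = log_softmax(segment(eta, pos, K));
    for (k in 1:K) {
      // log weight of "instance k is primary" relative to "no instance is primary"
      lp[k] = lp[k] + normal_lpdf(y_i[k] | 1, 1) - normal_lpdf(y_i[k] | 0, 1);
    }
    // marginal likelihood of the bag: sum over the K possible primary positions
    target += normal_lpdf(y_i | 0, 1) + log_sum_exp(lp);
    pos += K;
  }
}
generated quantities {
  int primary[n];            // sampled position of the primary instance within each bag
  {
    int pos = 1;
    for (i in 1:n) {
      int K = numInst[i];
      vector[K] lp = log_softmax(segment(eta, pos, K));
      for (k in 1:K) {
        lp[k] = lp[k] + normal_lpdf(y[pos + k - 1] | 1, 1)
                - normal_lpdf(y[pos + k - 1] | 0, 1);
      }
      primary[i] = categorical_rng(softmax(lp));
      pos += K;
    }
  }
}
```

The array declarations use the same old-style syntax as the question (Stan 2.17); recent Stan releases expect `array[n] int numInst` instead. If more than one instance per bag can be primary, the sum has to run over all admissible configurations, so this sketch would not apply as written.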

That was really insightful advice. Thanks a lot!