Hello,
I’m new to Stan and am trying to fit a simple multinomial logistic model using rstan. I’d like to know whether I have coded the model efficiently and whether the performance I’m seeing is normal. I’m on a Windows machine and the Stan version is 2.21.0.
I’ve looked around the forums for a bit and came across this post. My problem is similar in that the performance I’m seeing is not great, and I’m not sure whether something is wrong with the way I coded the model. I’ve also enabled parallel cores using:
options(mc.cores = parallel::detectCores())
and configured the C++ toolchain as suggested on the RStan Getting Started page.
The data set I’m trying to fit is fairly small: conjoint survey data from about 340 units, each completing 16 tasks with 5 choice alternatives per task. The alternatives include an outside option whose variables are all zero, and there are 10 variables in total.
The rows of the data correspond to units x tasks x choice alternatives (n * t * p). I’ve collapsed units x tasks into a single index (n * t = N), so the dimension of x is (N * p) x na, where na is the number of variables.
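To make the layout concrete, here is a minimal R sketch with made-up numbers. The names n_units and n_tasks, the random x, and the placement of the outside option as the last alternative of each task are illustrative assumptions, not the actual survey data. The point is only how the rows are stacked and how a vector of length N * p maps back to an N x p matrix row by row, which is what to_matrix(x_beta, N, p, 0) does in the model.

```r
# Illustrative dimensions only (hypothetical, much smaller than the real survey)
n_units <- 3   # units (respondents)
n_tasks <- 2   # tasks per unit
p       <- 5   # choice alternatives per task, including the outside option
na      <- 10  # alternative-specific variables

N <- n_units * n_tasks  # units x tasks collapsed into one index

set.seed(1)
# Rows are stacked task by task: rows 1..p belong to task 1,
# rows (p + 1)..(2 * p) to task 2, and so on
x <- matrix(rnorm(N * p * na), nrow = N * p, ncol = na)
# Assumption: the outside option is the last alternative of each task (all zeros)
x[seq(p, N * p, by = p), ] <- 0

beta   <- rnorm(na)
x_beta <- as.vector(x %*% beta)  # length N * p

# Row-major fill, the R analogue of Stan's to_matrix(x_beta, N, p, 0):
# row n holds the p linear predictors of task n
x_beta2 <- matrix(x_beta, nrow = N, ncol = p, byrow = TRUE)

dim(x)        # N * p rows, na columns
dim(x_beta2)  # N rows, p columns
```

Under these assumptions the column of x_beta2 belonging to the outside option is identically zero, so it acts as the reference alternative.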
The Stan model code is given below:
model_code = "
data {
  int p;                    // number of choice alternatives
  int na;                   // number of alternative-specific vars
  int N;                    // number of choice tasks (units x tasks)
  int y[N];                 // N x 1 multinomial outcomes
  matrix[N * p, na] x;      // dataset, one row per task and alternative
}
parameters {
  vector[na] beta;
}
model {
  vector[N * p] x_beta = x * beta;
  matrix[N, p] x_beta2;
  x_beta2 = to_matrix(x_beta, N, p, 0);  // convert to N x p matrix, filled row by row
  beta ~ normal(0, 100);                 // specify prior
  for (n in 1:N)
    y[n] ~ categorical_logit(x_beta2[n]');
}
"
From what I’ve seen in the documentation, categorical_logit has not been vectorized, so I have to resort to a loop. I ran the model a couple of times with a single chain, and on average it takes about 30 minutes to complete 20,000 draws of the beta parameters. I’ve also tried a more reasonable prior for logistic regression, \mathcal{N}(0, 1), but I’m seeing similar performance (only very slightly better).
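For reference, the likelihood that the loop over categorical_logit encodes is the usual softmax form of the multinomial logit. The symbol x_{nj}, denoting the row of x for alternative j in choice task n, is notation introduced only for this formula:

```latex
\Pr(y_n = j \mid \beta)
  = \frac{\exp\left(x_{nj}^{\top} \beta\right)}
         {\sum_{k=1}^{p} \exp\left(x_{nk}^{\top} \beta\right)},
\qquad n = 1, \dots, N, \quad j = 1, \dots, p.
```

Since the outside option has all-zero variables, its linear predictor is 0 and its term in the denominator is exp(0) = 1.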
Thanks,
Niko