HB Logit is easy in stan. However, if we allow the **number of observations/choices** to vary between subjects, it is a little harder. If we also allow the **number of possible outcomes/choice alternatives** to vary between subjects and tasks, it gets even harder.

Data needs to be organized in a **long format**, and one needs to be good at indexing.

**Suspected issues**:

So I did that, and the model seems to work fine. However, it isn’t particularly fast.

I suspect that there are some **copies** made in memory instead of pointers, because I am using both `segment`

and `to_matrix`

. If I were to create Eigen / Arma code myself, I’d probably use some advanced constructors to still use a pointer.

I also think the recursive nature of the indexing with `pos`

and `rs`

would make it hard to **parallelize**. If I were to code from scratch, I’d probably generate a lookup matrix / lookup vectors first, and then I would be able to access the right portions of the long data in parallel (e.g., using openMP #pragma for).

So here is the model. I’d appreciate any ideas that help make it more efficient.

**Performance**: Using a synthetic example (nresp=100, nsc[]=30), 2000 draws take me 580 seconds. Using non-ragged data structure, it is down to 220 seconds. **2.6 times slower** to allow for ragged data. Ouch.

```
data {
//dimensions
int<lower=1> nresp; // # of respondents
int<lower=1> nvar; // # of covariates of alternatives
int<lower=0> nz; // # of respondent covariates, i.e. upper level
int<lower=1> ntot; // total # of obs
int<lower=1> ntotsc; // total # of scenarios
int nsc[nresp]; // # of scenarios for each resp
int<lower=1> nalts[ntotsc]; // number of alternatives in each scenario
//data
matrix[nz, nresp] Z; //'upper level' covariates
int<lower=1> Y[ntotsc]; // observed choices (long format, 1-d array)
vector[ntot] X; // choice alternative covariates (long format, vector)
}
parameters {
matrix[nvar, nresp] Beta;
matrix[nvar, nz] Theta;
corr_matrix[nvar] Omega;
vector<lower=0>[nvar] tau;
}
transformed parameters {
cov_matrix[nvar] Sigma = quad_form_diag(Omega, tau);
}
model {
int rs; //unique scenario counter, position in y array
int pos; //position in x vector
rs =1;
pos = 1;
//priors
to_vector(Theta) ~ normal(0, 15);
tau ~ cauchy(0, 2.5);
Omega ~ lkj_corr(2);
//likelihood
for (r in 1:nresp) {
Beta[,r] ~ multi_normal(Theta*Z[,r], Sigma);
//go through all scenarios of the r-th resp
for (s in 1:nsc[r]){
//actual ll
Y[rs] ~ categorical_logit(to_matrix(segment(X, pos, nvar*nalts[rs]),nvar,nalts[rs])'*Beta[,r]);
//update counter
pos=pos+nvar*nalts[rs];
rs=rs+1;
}
}
}
```