# Arbitrary Number of Levels in Stan?

I’m working with hierarchical models, but I’m having trouble using a different number of levels. I can’t use a tool like rstanarm, otherwise that would make things a lot easier (FWIW I’m using cmdstanpy)

Suppose I’m fitting the following:

``````outcomes ~ (1 | group1) + (1 | group2)
``````

From what I understand this is roughly written in stan as:

``````// model_level_2.stan

data {
// ---- dimensions of things
int<lower=0> N;
int<lower=1> L_group1;
int<lower=1> L_group2;

// ---- data
int<lower=0> trials[N];        // initial trials
int<lower=0> outcomes[N];        // initial successes
int<lower=1, upper=L_group1> group1[N];
int<lower=1, upper=L_group2> group2[N];
}
parameters {
vector[L_group1] beta_group1;
vector[L_group2] beta_group2;
}
model {
// set your priors on beta_ ...

for (n in 1:N)
outcomes[i] ~ binomial_logit( trials[n],  beta_group1[ group1[n] ] + beta_group2[ group2[n] ])
}
``````

My question is: What if I wanted to run this on an arbitrary number of groups? For example, if I wanted to fit the following:

``````outcomes ~ (1 | group1) + (1 | group2) + (1 | group3)
``````

Would I have to write another stan file called `model_level_3.stan` where I’ve added terms for `group3` and so on? That seems like a bad way to do it. Does stan support something like this, or do people have an approach for it?

Apologies if my models are wrong / bad. I’m open to feedback on that, but I feel like my question still stands 😇

Here’s the same model but with any number of groups. The only tricky part is that if you can’t have an array of `beta_group` vectors (because the Stan arrays must have uniform size). So `beta_group` is concatenation of all the groups and `begin` is the index where that group starts.

``````data {
// ---- dimensions of things
int<lower=0> N;
int<lower=1> G; // number of groups
int<lower=1> L_group[G];

// ---- data
int<lower=0> trials[N];        // initial trials
int<lower=0> outcomes[N];        // initial successes
int<lower=1,upper=rep_array(L_group,N)> group[N,G];
}
transformed data {
int begin[G];
int B = 0;
for (g in 1:G) {
begin[g] = B;
B = B + L_group[g];
}
}
parameters {
vector[B] beta_group;
}
model {
// set your priors on beta_ ...

for (n in 1:N) {
vector[G] betas;
for (g in 1:G)
betas[g] = beta_group[begin[g] + group[n,g]];
outcomes[n] ~ binomial_logit( trials[n],  sum(betas) ) ;
}
}
``````
6 Likes

Thanks! This did the trick, and I figured out how to set priors. This seems similar to handling ragged data.

This is great and I think this should be in the manual as it’s a topic that comes up often enough.

3 Likes

I give that a 👍. This solution works great and makes sense, but it’s not at all obvious. I was surprised nothing was in the manual. I’m also happy to help/learn more!

I also thought “maybe it is in the manual, but I just don’t know the word for it.” I had a similar experience with ragged arrays. Something something discoverability 🤔 … thankfully the forums are great!

1 Like