When modeling random intercepts in a regression, for example, I follow the standard approach of declaring a parameter vector whose length is the number of groups in the grouping variable:
data {
  int<lower=0> N;
  int<lower=0> N_grps;
  int<lower=1, upper=N_grps> grp[N];
}
parameters {
  vector[N_grps] mu_grp;
}
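For context, the corresponding model block might look something like the sketch below, where a Gaussian outcome y and scale sigma are my illustrative assumptions rather than part of the original question. The point is that the prior applies to every element of mu_grp, but the likelihood only touches the elements indexed by grp:

model {
  mu_grp ~ normal(0, 1);           // prior: applies to all N_grps elements
  y ~ normal(mu_grp[grp], sigma);  // likelihood: only touches groups present in grp
}

Any group that never appears in grp contributes nothing to the likelihood, so its element of mu_grp is informed only by the prior.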
Sometimes I test my code on subsets of the data that lack rows from some groups. To keep things simple I avoid re-indexing the grouping variable, so instead of declaring a smaller N_grps, I keep it the same and end up with elements of mu_grp that are never referenced in the likelihood and are therefore (I think) never actually "fit" beyond their prior.
This doesn't seem to make much of a computational difference for modest N and N_grps, but I'm wondering whether I should expect worse performance for bigger datasets or, more broadly, whether I should avoid including unused parameters for other reasons.