# Error in compiling partially-hierarchical multinomial logistic regression

Hi all, I’m having a persistent error when trying to specify this model, and the line-numbers aren’t very helpful. Would anyone be willing to point me towards a solution?

Here’s my code:

``````data {
int<lower=1> N;  // number of tableaux -- that is to say, number of datapoints
int<lower=0> max_num_candidates;
array[N] int<lower=2, upper=max_num_candidates> J; // vector of num_candidate in each tableau
int<lower=1> K; //  number of constraints; this is constant across tableaux
array[K] int<lower=0, upper=1> is_constraint_hierarhical;
int<lower=1> L; // number of levels of the random effect
array[N] int<lower=1, upper=L> ll; // an array telling you which level of the random effect a given UR belongs to.
array[N] int<lower=1,upper=max_num_candidates> Y; // index of winning candidate in each tableau
array[N] matrix[max_num_candidates, K] violations; // violations
} parameters {
array[K] real mu; // the vector of constraint weights, coefficients - group level
array[K] real<lower=0> sigma; // the variance on these guys
array[L] vector[K] beta; // a vector of random-effect-level coefficients that are drawn from real_mu, one for each leve lof the random effect. these are the random-effect-level offsets // means drawn from the group-level real mu
//matrix[L, K] beta;
}

model {
for (constraint_index in 1:K) { // for each constraint -
mu[constraint_index] ~ normal(0, 100); // draw its mean from a hyperprior distribution
if (is_constraint_hierarhical[constraint_index] == 1) { // then do the thing where you draw a group-level random offset and put it in the relevant slot in the item-by-constraint coeff matrix
for (l in 1:L) { // then for each level of the random effect
beta[l, constraint_index] ~ normal(mu[constraint_index], sigma[constraint_index]); // draw the value of the level-specific offset for that constraint from a distribution centered on for each offset from the group-level for that constraint
}
}
else { // otherwise just skip one level of hierarchy and put the mu - the group-level - in.
for (l in 1:L) { // then for each level of the random effect
beta[l, constraint_index] =  mu[constraint_index]; // just slot in the group-level mu.
}
}

}
//print(size(beta));
for (i in 1:N) { // then for each datapoint
//relevant_intercept_index = ll[n];
Y[i] ~ categorical_logit(block(violations[i], 1, 1, J[i], K)* beta[ll[i]]); // take the datapoint n, get the random effect group it belongs to - ll - then index into the matrix of betas to find that row, and multiply it by the constraint violation matrix - this is what needs to be blocked on my end - for that one.
}
}

generated quantities {
array[N] int sim_winners;
for (i in 1:N){
sim_winners[i] = categorical_logit_rng(block(violations[i], 1, 1, J[i], K)*beta[ll[i]]);
}
}

``````

And I’m getting the error:

``````Semantic error in 'string', line 27, column 8 to column 58:

Cannot assign to global variable 'beta' declared in previous blocks.
``````

The model compiles and fits fine when K is non-hierarchically-structured:

``````data {
int<lower=1> N;  // number of tableaux -- that is to say, number of datapoints
int<lower=0> max_num_candidates;
array[N] int<lower=2, upper=max_num_candidates> J; // vector of num_candidate in each tableau
int<lower=1> K; //  number of constraints; this is constant across tableaux
array[N] int<lower=1,upper=max_num_candidates> Y; // index of winning candidate in each tableau
array[N] matrix[max_num_candidates, K] violations; // violations
}

parameters {
vector<lower=0>[K] beta;  // attribute effects  // constraint weights; should be just one
}

model {
beta ~ normal(0,1);
for (i in 1:N) // look up "block" - just looks at a specific part of a matrix
//local_violations = ; // this is supposed to get the i'th tableau's candidates and violations from the stack violations (which is num_tableaux * max_num_candidates (padded out) * num_constraints in size), and pull out only the first J[i] rows, which are just those that are nonzero and not padded out so relevant here, and then all columns (the K constraints, which are constant for all tableaux)
Y[i] ~ categorical_logit(block(violations[i], 1, 1, J[i], K)*beta); // so this gets just one instance, one token, at a time. could we ramp up to do multinomial and not categorical?
}

generated quantities {
array[N] int sim_winners;
for (i in 1:N)
sim_winners[i] = categorical_logit_rng(block(violations[i], 1, 1, J[i], K)*beta);
}

``````

So I assume something odd is going on in the way I tried to enrich it with hierarchical structure.

Thanks very much, I appreciate any help I can get in advance!

EDIT: I realize the error I originally had was to do with an extra parenthesis (deep sigh…), but after that I solved a few more random issues, and am still stuck on the one above; I think it has to do with mixing `~` and `=` when populating the `beta` matrix, but I don’t really know another way around it. any suggestions would be very welcome!

EDIT: I found I could get rid of the error by replacing the `beta[l, constraint_index] = mu[constraint_index]` with something hacky like `beta[l, constraint_index] ~ normal(mu[constraint_index], 0.00000001)` but that doesn’t really feel like a good solution; so I’d still really appreciate any help with this folks might have!

The key thing here is that you shouldn’t be trying to assign values to a parameter (i.e., `beta[l, constraint_index] = mu[constraint_index]`), instead you need to sample only the values of `beta` that need to be parameters, and then construct the `beta` matrix/array in the `transformed parameters` or `model` blocks from the `beta` parameters and `mu` parameters.

Thanks so much for your response; I’m not exactly sure how to do this, would you mind giving an example? One thought I have might be to do something like this:

``````model {
for (constraint_index in 1:K) {
if (is_constraint_hierarhical[constraint_index] == 1) {
mu[constraint_index] ~ normal(0, 100);

for (l in 1:L) {

beta[l, constraint_index] ~ normal(mu[constraint_index], sigma[constraint_index]);
}
}
else {
for (l in 1:L) {
beta[l, constraint_index] ~ normal(0, 100);
}
}

}
for (i in 1:N) { // then for each datapoint
Y[i] ~ categorical_logit(block(violations[i], 1, 1, J[i], K)* beta[ll[i]]);
}
}

``````

But wouldn’t this miss out on the intended structure that for a given constraint k, you want to either sample the mu with noise or without, depending on the item, rather than having the mu without noise be unrelated to the one with noise?

Thanks again for your help, I really appreciate it.

The end goal is that you only want a subset of `beta` to be parameters that are distributed `normal(mu[constraint_index], sigma[constraint_index])`, with the rest of the values being `mu[constraint_index]`, if I’m reading your model right.

So you only declare the subset of values of `beta` that need to be parameters as `parameters`, then you combine those parameters with the appropriate `mu` values to construct your full `beta`.

To use a simpler example, the below model has a single vector `beta` of length `K`:

``````data {
int K;
array[K] int hier_constraint;
}

transformed data {
int num_hier = sum(hier_constraint);
}

parameters {
vector[num_hier] beta_params; // These are given the appropriate normal(mu[k], sigma[k]) priors
vector[K] mu;
vector<lower=0>[K] sigma;
}

transformed parameters {
vector[K] beta;
{
// local variable to index which parameter value to use
int param_ind = 1;
for (k in 1:K) {
if (hier_constraint[k] == 1) {
beta[k] = beta_params[param_ind];
param_ind += 1;
} else {
beta[k] = mu[k];
}
}
}
}
``````
1 Like

Hi @andrjohns, thanks a lot for this; I really appreciate your help, and think I understand most of what you suggested. I implemented it, trying to interleave it with the multiple levels of hierarchy that I want, and it went fine, but I don’t think it implements anywhere your comment that the beta_params have have a mean drawn from normal(mu[k], sigma[k]). I tried to do this in the third line of the model block,
`beta_params ~ normal(mu,sigma); `, but this gives the error `Semantic error in 'string', line 42, column 0 to column 31: Ill-typed arguments to '~' statement. No distribution 'normal' was found with the correct signature.` which I don’t really understand; isn’t this where I specify the priors on the beta_params? My model is syntactically correct if I comment out the line third line of the model block (starting `beta_params`… etc.) but I don’t think this model does what I intend.

The model that throws the error above is:

``````data {
int<lower=1> N;  // number of tableaux -- that is to say, number of datapoints
int<lower=0> max_num_candidates;
array[N] int<lower=2, upper=max_num_candidates> J; // vector of num_candidate in each tableau
int<lower=1> K; //  number of constraints; this is constant across tableaux
array[K] int<lower=0, upper=1> is_constraint_hierarhical_by_items;
int<lower=1> L; // number of levels of the random effect
array[N] int<lower=1, upper=L> ll; // an array telling you which level of the random effect a given UR belongs to.
array[N] int<lower=1,upper=max_num_candidates> Y; // index of winning candidate in each tableau
array[N] matrix[max_num_candidates, K] violations; // violations
}

transformed data {
int num_constraints_hierarchical_on_items = sum(is_constraint_hierarhical_by_items);
}

parameters {
array[K] real mu; // the vector of constraint weights, coefficients - group level
array[K] real<lower=0> sigma; // the variance on these guys
array[L] vector[num_constraints_hierarchical_on_items] beta_params; // a vector of random-effect-level coefficients that are drawn from real_mu, one for each leve lof the random effect, for only those constraints that are hierarchical. these are the random-effect-level offsets // means drawn from the group-level real mu
//matrix[L, K] beta;
}

transformed parameters {
array[L] vector[K] beta;
{
// local variable to index which parameter value to use
int param_ind = 1;
for (constraint_index in 1:K) {
if (is_constraint_hierarhical_by_items[constraint_index] == 1) {
for (level_of_item_hierarchical_structure in 1:L) {

beta[level_of_item_hierarchical_structure,constraint_index] = beta_params[level_of_item_hierarchical_structure,param_ind];
param_ind += 1;
}} else {
for (level_of_item_hierarchical_structure in 1:L) {

beta[level_of_item_hierarchical_structure,constraint_index] = mu[constraint_index];
}
}
}
}
}

model {
mu ~ normal(0,10); // prior on mean of each mu
sigma ~ normal(0,1); // prior on sigma for that beta_param
beta_params ~ normal(mu,1); // this is the variance on how much individual random offsets can be from the mu
//print(size(beta));
for (i in 1:N) { // then for each datapoint
Y[i] ~ categorical_logit(block(violations[i], 1, 1, J[i], K)* beta[ll[i]]); // take the datapoint n, get the random effect group it belongs to - ll - then index into the matrix of betas to find that row, and multiply it by the constraint violation matrix - this is what needs to be blocked on my end - for that one.
}
}

generated quantities {
array[N] int sim_winners;
for (i in 1:N){
sim_winners[i] = categorical_logit_rng(block(violations[i], 1, 1, J[i], K)*beta[ll[i]]);
}
}

``````

would you mind pointing out where I’ve gone wrong here?

Looking at the parameters/statements of interest here, you’ve got:

``````parameters {
array[K] real mu;
array[L] vector[num_constraints_hierarchical_on_items] beta_params;
}

model {
mu ~ normal(0,10);
beta_params ~ normal(mu,1);
}
``````

Your parser error is happening because you’re using an `array real` (mu) as the location parameter for an `array vector` (beta_params) which isn’t valid. But that’s not really the main issue here.

The key component that you’re missing is something that tells Stan which element of `mu` should be used as the location parameter for which element of a given `beta_params[l]` - as not all elements of `mu` are used for the prior (only the ones corresponding to the constraint).

This means that you need to update the sampling statement for `beta_params` to only be using the values of `mu` which correspond to a hierarchical constraint.

One approach for this would be:

``````model {
mu ~ normal(0,10); // prior on mean of each mu
sigma ~ normal(0,1); // prior on sigma for that beta_param

for (l in 1:L) {
// local variable to index which parameter value to use
int param_ind = 1;
for (k in 1:K) {
// Extract the appropriate value of mu for specifying the
// prior for beta_params elements
if (is_constraint_hierarhical_by_items[k] == 1) {
beta_params[l, param_ind] ~ normal(mu[k], 1);
param_ind += 1;
}
}
}
}
``````

EDIT: Updated the placement of `param_ind`

1 Like

Hi @andrjohns , I realized what went wrong, and fixed it. Thank you so much for your help with this!