Base type mismatch, what is wrong?

I am trying to run this Stan file (without any input data yet):

```stan
data {
  int<lower=0> N;                  // number of data points
  int<lower=0> J;                  // number of random effects
  int<lower=1,upper=J> groups[N];  // group assignment
  matrix[N, J] X;                  // design matrix
  real y[N];
}
parameters {
  vector[J] a;                     // intercepts
  real b;                          // random effects
  real mu_a;                       // mean of random effects
  real<lower=0,upper=100> sigma_a; // random effects sd
  real<lower=0,upper=100> sigma_y; // population sd
}
transformed parameters {
  vector[N] y_hat;
  for (n in 1:N)
    y_hat[n] = a[groups[n]] + X[n] * b;
}
model {
  sigma_a ~ uniform(0, 100);
  a ~ normal(mu_a, sigma_a);
  b ~ normal(0, 1);
  sigma_y ~ uniform(0, 100);
  y ~ normal(y_hat, sigma_y);
}
```
I received the message:

SYNTAX ERROR, MESSAGE(S) FROM PARSER: base type mismatch in assignment; variable name = y_hat, type = real; right-hand side type=row vector error in 'model54c74905f38d_random_intercepts' at line 18, column 15

I tried declaring the y_hat variable as row_vector[N] y_hat, but the same error came up. Why does it think y_hat is real when it is specified as vector[N]?

In this expression:

y_hat[n] = a[groups[n]] + X[n] * b;

y_hat is a vector, so y_hat[n] is a real.

The issue is that X[n] is a row_vector, b is a scalar, and a[groups[n]] is a scalar, so a[groups[n]] + X[n] * b comes out as a row_vector as a whole, and you can’t assign that to a real (y_hat[n]).

I think you want to have b be a vector of length J (so J coefficients in b):

vector[J] b;
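
In case it helps, here is roughly how the fixed pieces would look together (just a sketch, assuming the rest of your model stays as it is):

```stan
parameters {
  vector[J] a;  // intercepts
  vector[J] b;  // one slope per column of X (was: real b)
  // ... mu_a, sigma_a, sigma_y unchanged
}
transformed parameters {
  vector[N] y_hat;
  // X[n] is a row_vector[J]; row_vector * vector gives a real,
  // so each element of y_hat now gets a real on the right-hand side
  for (n in 1:N)
    y_hat[n] = a[groups[n]] + X[n] * b;
}
```

If you are on a reasonably recent Stan, the loop can also be collapsed into the single statement y_hat = a[groups] + X * b;, which does the same thing with one matrix-vector product.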

Thank you!!

I ran the model and got the following error message:

Initialization between (-2, 2) failed after 100 attempts. Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

How can I approach solving this issue? (Where can I specify initial values?)
As you can see, I am trying to fit a varying intercept model.
The input design matrix has rows representing a document and columns are word dummy variables.

I have tried several different models with different parameterizations, but all ended up in the same error as above. I’m running out of ideas for how to “reparameterize” the model. But maybe there’s something about my data that I’m not addressing.

Any suggestions are appreciated!

Can you post the new model and maybe a copy of the data that you’re passing in?

This sort of error happens when constraints are violated, when there are uninitialized variables in the code, or when there are NaNs in the data (so the likelihood cannot be evaluated).
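
If it’s not obvious which of those you’re hitting, one trick (just a sketch bolted onto the earlier model, not something you need to keep) is to print the quantity the likelihood depends on; print() runs on every log-density evaluation, including the failed initialization attempts, so a NaN will show up right away:

```stan
transformed parameters {
  vector[N] y_hat;
  for (n in 1:N) {
    y_hat[n] = a[groups[n]] + X[n] * b;
    // is_nan() and print() are built-in Stan; remove once the culprit is found
    if (is_nan(y_hat[n]))
      print("NaN in y_hat at n = ", n);
  }
}
```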

Really appreciate the help, Ben!
I made the change from above and fixed the outcome variable, so the previous model works.
Now I am trying to get this model working:

```stan
data {
  int<lower=1> N;                  // number of data points, 692
  real y[N];                       // y, 692
  int<lower=1> J;                  // number of words, 665
  matrix[N, J] X;                  // 692 x 665
  int<lower=1, upper=J> groups[N]; // word id
}
parameters {
  real beta1;             // fixed intercept
  vector[J] beta2;        // fixed slopes
  vector[J] u;            // random intercepts
  real<lower=0> sigma_y;  // population sd
  real<lower=0> sigma_a;  // random effects sd
}
model {
  real mu;
  // priors
  u ~ normal(0, sigma_a); // word random effects
  // likelihood
  for (n in 1:N) {
    mu = beta1 + u[groups[n]] + X[n] * beta2;
    y[n] ~ lognormal(mu, sigma_y);
  }
}
```

Message:
Initialization between (-2, 2) failed after 100 attempts. Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

Any idea what is wrong?

Data

I have checked and there are no missing values in the input matrix or in y.

Example:
Raw:

```
word    y
 A     0.1
 A     0.2
 B     0.3
 C     0.5
```

Transform word into dummy variables

Input matrix (692 x 655):

```
        word1  word2  ...  word655
row1      1      0            0
row2      1      0            0
...
row692    0      1            0
```

N = length of y = 692
J = number of distinct words (groups) = 655 (see input matrix)

Additional question:
In both models, how can I get errors for each word/random effect?
Right now I get x iteration values of sigma_a as well as sigma_y.
Is it possible to get 655 sigma_a for each word?

Hmm, you have the right constraints on sigma_a and sigma_y.

I’d guess, if anything, you should check and make sure all the y elements are greater than zero. That’s one of the requirements for a lognormal.
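
One cheap way to have Stan check that for you (just a sketch of your data block with the constraint added, nothing else changed) is to declare y with a lower bound, so bad values get rejected when the data are read in instead of surfacing as an initialization failure:

```stan
data {
  int<lower=1> N;
  real<lower=0> y[N];  // lognormal outcomes must be positive; lower=0 rejects
                       // negatives at data-load time (exact zeros would still
                       // break the lognormal, so check for those separately)
  int<lower=1> J;
  matrix[N, J] X;
  int<lower=1, upper=J> groups[N];
}
```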

In both models, how can I get errors for each word/random effect? Right now I get x iteration values of sigma_a as well as sigma_y. Is it possible to get 655 sigma_a for each word?

Ooof, honestly, with 692 data points and 655 coefficients, you’re going to have a rough time with just about anything. That’s not very much data.

What kind of model is this?

You’re right- I had some negative values in the y’s. Now that’s fixed.

I forgot to mention that 692 data points is a subset of my data. I tried a fixed effects model on the data using a few thousand data points, and it took about an hour. I wanted to shorten the run time so I’ve been using a smaller sample size.

My question was that in the model above, the output for sigma_a is one number for each iteration. How can I make it a vector of length J (number of words)?

one number for each iteration

You mean like J sigma_as instead of just one?

```stan
parameters {
  ...
  vector<lower=0.0>[J] sigma_a;
}
```

Should do the trick as far as getting it programmed, but it doesn’t make sense to me.
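
To be concrete, here is roughly where that vector would enter your model; sigma_a only shows up in the prior on u, so per-word scales just means a vectorized version of that line (a sketch, not a recommendation):

```stan
parameters {
  vector<lower=0>[J] sigma_a;  // one prior scale per word
}
model {
  // elementwise, i.e. u[j] ~ normal(0, sigma_a[j]) for each word j
  u ~ normal(0, sigma_a);
}
```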

So the mean of y comes from:

  1. beta1, a base level
  2. u, which is a different offset per group
  3. X * beta2, the effect of the other covariates

Your prior on u, normal(0, sigma_a) presumably means that you expect the u parameters to be at least in the same range as each other. Seems like a reasonable sort of regularizing assumption (you’d need to put a prior on sigma_a or fix it to actually make this happen though).

Given this, I’m not sure what giving each u its own standard deviation would get you.

Probably the bigger issue with this model is the lack of priors on sigma_a, sigma_y, beta1, and beta2. Are your estimates working out without them?
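
For what it’s worth, a minimal sketch of what weakly informative priors on those four could look like; the scales here (5 and 1) are placeholder assumptions on my part and should be matched to the scale of log(y):

```stan
model {
  // weakly informative priors; the scales are illustrative guesses
  beta1 ~ normal(0, 5);
  beta2 ~ normal(0, 1);
  sigma_a ~ normal(0, 1);  // half-normal thanks to the <lower=0> declaration
  sigma_y ~ normal(0, 1);  // half-normal thanks to the <lower=0> declaration

  u ~ normal(0, sigma_a);  // word random effects, as before

  // likelihood, unchanged
  for (n in 1:N)
    y[n] ~ lognormal(beta1 + u[groups[n]] + X[n] * beta2, sigma_y);
}
```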