Bernoulli_logit with different number of predictors

Dear Stan community,

I have missing data for one of the regressors, so I split the data into two parts: one with a design matrix of K predictors and one with K+1 predictors. I am having trouble combining the two linear predictors and passing the result to bernoulli_logit; see the code below:

data {
  int<lower=0> N;   // number of data items
  int<lower=0> K;   // number of predictors
  matrix[N, K] X;   // predictor matrix
  int<lower=0> M;   // number of data items with the extra predictor
  int<lower=0> L;   // number of predictors in Z (K + 1)
  matrix[M, L] Z;   // predictor matrix with the extra predictor
  int<lower=0> S;   // total number of data items (N + M)
  vector[S] y;      // outcome vector
  vector[L] mu;     // prior mean
  vector<lower=0>[L] sig; // prior sd
}
parameters {
  vector[L] beta;       // coefficients for predictors (length L so that Z * beta conforms)
  real<lower=0> sigma;  // error scale
}
model {
  vector[S] ydata;
  ydata = append_row(X * beta[1:K], Z * beta);
  
  beta ~ normal(mu,sig);     // prior distribution of beta
  y ~ bernoulli_logit(ydata);  // likelihood
}

This yields the error message:

Error in stanc(file = file, model_code = model_code, model_name = model_name, :
0

Semantic error in ‘string’, line 22, column 2 to column 29:

Ill-typed arguments to ‘~’ statement. No distribution ‘bernoulli_logit’ was found with the correct signature.

Is there a way to plug this row bind into the likelihood?

The error is caused by y here. The bernoulli_logit distribution is only for discrete outcomes, so you need to declare y as an integer array, rather than a vector:

int y[S];      // outcome vector

Note that in the newest versions of Stan this declaration throws a deprecation warning, so you should switch to the new array syntax:

array[S] int y;

Ah great point, thanks for catching!!
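
For reference, here is a minimal sketch of how the model might look with both suggestions applied. It rests on a few assumptions that are not confirmed in the thread: the first K columns of Z are the same predictors as the columns of X, beta has one coefficient per column of Z (length L), the outcomes are ordered with the X rows first, and the length S is replaced by N + M. The unused sigma parameter is left out, since a Bernoulli likelihood has no error scale. The name eta is just an illustrative label for the linear predictor.

```stan
data {
  int<lower=0> N;                         // rows where the extra regressor is missing
  int<lower=0> K;                         // number of predictors in X
  matrix[N, K] X;                         // design matrix without the extra regressor
  int<lower=0> M;                         // rows where the extra regressor is observed
  int<lower=K> L;                         // number of predictors in Z (K + 1 in the question)
  matrix[M, L] Z;                         // design matrix with the extra regressor
  array[N + M] int<lower=0, upper=1> y;   // binary outcomes, X rows first (assumed ordering)
  vector[L] mu;                           // prior means
  vector<lower=0>[L] sig;                 // prior sds
}
parameters {
  vector[L] beta;                         // one coefficient per column of Z
}
model {
  // Linear predictor on the logit scale: X rows first, then Z rows.
  vector[N + M] eta = append_row(X * beta[1:K], Z * beta);

  beta ~ normal(mu, sig);                 // prior on the coefficients
  y ~ bernoulli_logit(eta);               // likelihood; y must be an integer array
}
```

The row bind itself was never the problem: append_row on two vectors returns a vector, which bernoulli_logit accepts as the logit-scale argument. Only the type of y had to change.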