Accessing element out of range warning when using subset of the whole data

Hi, I have a small question about how to specify parameters.
The model is

data {
  int<lower=1> N; // Number of subjects
  int<lower=1> obs; // Number of longitudinal measurements
  int subject[obs]; // subject ID
  vector[obs] Y;
  vector<lower=0>[obs] time;
  matrix[obs,3] NCS; // predictor matrix
}
transformed data {
  matrix[obs,3] Q_ast;
  matrix[3,3] R_ast;
  matrix[3,3] R_ast_inverse;
  // thin and scale the QR decomposition
  Q_ast = qr_thin_Q(NCS) * sqrt(obs - 1);
  R_ast = qr_thin_R(NCS) / sqrt(obs - 1);
  R_ast_inverse = inverse(R_ast);
}
parameters {
  real<lower=0> sigma_e;
  vector<lower=0>[2] sigma;
  vector[N] u_raw[2];
  vector[3] beta;
  real<lower=0> alpha;
}
transformed parameters {
  vector[N] u[2]; 
  vector[obs] theta;
  for (j in 1:2) {
    u[j] = sigma[j] * u_raw[j];
  }
  theta = alpha + Q_ast * beta + u[1][subject] + u[2][subject] .* time;
}
model {
  Y ~ normal(theta,sigma_e);
  for (j in 1:2) u_raw[j] ~ std_normal();   
  sigma_e ~ normal(0,10);
  sigma ~ normal(0,10);
  beta ~ normal(0,10);
  alpha ~ normal(0,10);
}

the warning is

Exception: vector[multi] indexing: accessing element out of range.  ... at line 32

I think the problem is the random effect u[subject].
The data is like

standata1 <- list(N=length(unique(dat$id)),obs=nrow(dat),subject=dat$id,Y=dat[,4],time=dat$month,NCS=NCS)

Because this is only a subset of the whole dataset. (Only the risk set)
For example, the N could be 15, which means there are 15 unique subjects in this dataset.
But the serial number of subjects could be 30 or 45, since there are 50 subjects in total in the whole dataset. (subject represents the serial number of subject of each observation)
So the term u[subject] is reporting errors: accessing element out of range. (My guess)
How can I specify u, making it corresponding with unique(dat$id)?

Hi,

before sending data to Stan, could you transform your subset of data to correspond to correct values?

For example your full data has subject_id --> 1, 2, ..., 40, 41 and then you take a subset 1, 23, 44.
You could order subset and then “normalize them” --> 1 --> 1, 23 --> 2, 44 --> 3.

I think this is doable with R.

Also going back to original index you just do the reverse 1 --> 1, 2 --> 23 etc.

1 Like

Hi, Ari. You’re right, it is easy to implement in R. Thanks for the hint.