I don’t think we support `y[subset_index==m,]` yet, so that seems like it could help here, but it sounds like there’s still an issue even if we had that feature. Just to make sure I understand, I’m going to try to flesh out Andrew’s example a little more. So in that example we’d also have

```
parameters {
  real a[J];
}
```

Did you also mean to type `int<lower=1, upper=J> patient_index[N];` instead of `int<lower=1, upper=J> patient_index[J];`?

And J here is the number of patients, N is the number of data points, so we’re grouping data by patient.

So the issue comes when you want to operate over a subset of the patients. Right now you have to do the subsetting in R, so that the data you pass in has a new N and a new J. That works with your existing model, but the data mangling happens in R. Is that right? What would it look like in Stan? To me it sounds like you’d just add a transformed data block that does the subsetting and creates new N1 and J1 variables for the parameters and the rest of the model. Something like the following, if you pretend Stan has the vectorized equality mentioned previously:

```
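data {
  int M;                                  // number of subsets
  int<lower=1, upper=M> subset_index[J];  // subset label for each patient
  int m;                                  // which subset to fit
}
transformed data {
  // Pseudocode: relies on the vectorized equality that Stan doesn't have yet
  int J1 = length(subset_index==m);       // intended: number of patients in subset m
  int N1 = length(y[subset_index==m]);    // intended: number of data points in subset m
}
parameters {
  real a[J1];
  real b;
}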
```

And now if you pass in `M=1; subset_index = rep(1, J)` (along with `m = 1`) you get the full model; otherwise you construct the subset_index as you indicated.
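
For what it’s worth, here is a rough sketch of how the two counts could be computed without vectorized equality, using an explicit loop inside a helper function. `count_equal` is a name I made up, not a Stan built-in, and I’m assuming `J`, `N`, and `patient_index` are declared as data the way I read Andrew’s example. I’m also using the older array declaration syntax to match the snippets above, so treat this as an illustration rather than a tested program.

```
functions {
  // Hypothetical helper (not a Stan built-in):
  // returns the number of entries of x that are equal to m.
  int count_equal(int[] x, int m) {
    int n = 0;
    for (i in 1:size(x)) {
      if (x[i] == m) {
        n += 1;
      }
    }
    return n;
  }
}
data {
  int<lower=1> J;                          // number of patients (assumed from Andrew's example)
  int<lower=1> N;                          // number of data points (assumed from Andrew's example)
  int<lower=1, upper=J> patient_index[N];  // patient for each data point
  int<lower=1> M;                          // number of subsets
  int<lower=1, upper=M> subset_index[J];   // subset label for each patient
  int<lower=1, upper=M> m;                 // which subset to fit
}
transformed data {
  // Patients whose subset label is m.
  int J1 = count_equal(subset_index, m);
  // Data points whose patient is in subset m: subset_index[patient_index]
  // looks up the subset label of each data point's patient.
  int N1 = count_equal(subset_index[patient_index], m);
}
parameters {
  real a[J1];
  real b;
}
```

This only covers the counting. A likelihood would still need a map from the original patient ids to `1:J1`, which I’ve left out here.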

I think that’s the situation now, modulo the `subset_index==m` vectorized equality functionality that we’re missing. @andrewgelman, what would you like on top of that in Stan syntax?