Hello,
I was unable to compile the example code in section 8.2 “Ragged Data Structures” of the Stan User’s Guide (version 2.28). Link here: https://mc-stan.org/docs/2_28/stan-users-guide/ragged-data-structs.html#ragged-data-structs.section
The guide suggests the following (I’ve added a missing curly brace at the end of the model section):
Original Documentation (plus final curly brace)
data {
int<lower=0> N; // # observations
int<lower=0> K; // # of groups
vector[N] y; // observations
array[K] int s; // group sizes
// ...
}
model {
int pos;
pos = 1;
for (k in 1:K) {
segment(y, pos, s[k]) ~ normal(mu[k], sigma);
pos = pos + s[k];
}
}
However, this does not compile in my version of rstan (2.21.2). I am running R version 3.6.3 (2020-02-29) on GNU/Linux. The parser flags the array[K] int s; declaration:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in 'model125e6c2f14fb_ragged_group_sizes' at line 5, column 2
-------------------------------------------------
3: int<lower=0> K; // # of groups
4: vector[N] y; // observations
5: array[K] int s; // group sizes
^
6: // ...
-------------------------------------------------
PARSER EXPECTED: <one of the following:
a variable declaration, beginning with type,
(int, real, vector, row_vector, matrix, unit_vector,
simplex, ordered, positive_ordered,
corr_matrix, cov_matrix,
cholesky_corr, cholesky_cov
or '}' to close variable declarations>
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model 'ragged_group_sizes' due to the above error.
Looking through the forums, I didn’t see any posts directly addressing this issue. However, I saw that declaring s as int s[K] rather than with the array keyword worked for others, as follows:
Revised to Remove Array Declaration
data {
int<lower=0> N; // # observations
int<lower=0> K; // # of groups
vector[N] y; // observations
int s[K]; // group sizes
// ...
}
model {
int pos;
pos = 1;
for (k in 1:K) {
segment(y, pos, s[k]) ~ normal(mu[k], sigma);
pos = pos + s[k];
}
}
However, this model still doesn’t compile. Experienced Stan coders will probably recognize that mu and sigma need to be declared, either as variables in the model block or as parameters. For Stan neophytes like myself, though, including explicit declarations is helpful, even if the code is no longer a minimal example:
Revised to Include Parameters and Priors
data {
int<lower=0> N; // # observations
int<lower=0> K; // # of groups
vector[N] y; // observations
//array[K] int s; // group sizes
int s[K];
// ...
}
parameters {
vector[K] mu;
vector<lower=0>[K] sigma;
}
model {
int pos;
pos = 1;
for (k in 1:K) {
segment(y, pos, s[k]) ~ normal(mu[k], sigma[k]);
pos = pos + s[k];
}
sigma ~ exponential(1);
mu ~ normal(0, 10);
}
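For other newcomers, here is a rough sketch of a guard that could sit between the data and parameters blocks of the model above. It is my own addition, not part of the User’s Guide example: the transformed data block and the wording of the error message are assumptions. It uses the older int s[K] syntax so that it should parse under rstan 2.21.2. The idea is that segment(y, pos, s[k]) only covers all of y when the group sizes add up to N. If they sum to less than N, the trailing observations are silently ignored; if they sum to more, segment runs past the end of y.

```stan
data {
  int<lower=0> N;        // # observations
  int<lower=0> K;        // # of groups
  vector[N] y;           // observations
  int<lower=0> s[K];     // group sizes (older array syntax)
}
transformed data {
  // Stop early if the group sizes do not account for every observation.
  if (sum(s) != N)
    reject("group sizes must sum to N; found sum(s) = ", sum(s),
           " but N = ", N);
}
```

With this in place, passing for example N = 10 and s = {3, 3, 3} should stop with the message above when the data are read, rather than fitting a model that ignores the last observation.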
Are my versions of rstan and Stan just too old? Is the array[K] int s declaration going to work in future versions of Stan? Is there something else I am missing?