I am having trouble coding an autoregressive panel (or times-series, cross-sectional) model where the main variable, theta, forms a ragged array, i.e, has different length time-series for each unit.
This is part of a larger model where theta is estimated via an IRT type measurement model. I have omitted those features from the below since that part of the model works fine. But the key point is that theta is estimated, not observed.
If I treat theta as a rectangular matrix, with the same length times-series for each unit, the model compiles and produces reasonable estimates:
data{
int<lower=1> J; // n countries
int<lower=1> T; // n years
...
}
parameters{
real<lower=0> sigma_theta;
matrix[T,J] theta_raw;
row_vector[J] theta_init;
...
}
transformed parameters{
matrix[T,J] theta;
theta[1] = theta_init;
for (t in 2:T)
theta[t] = theta[t-1] + sigma_theta * theta_raw[t-1];
...
}
model{
sigma_theta ~ normal(0, 2);
theta_init ~ normal(0, 1);
for(t in 1:T)
theta_raw[t] ~ normal(0, 1);
...
}
I have coded the ragged array version of the model below, where theta is defined as a vector rather than a T by J matrix, and there are two vectors, pos1 and pos2, which indicate the start and end points of each unit’s time-series.
data{
int<lower=1> J; // n countries
int<lower=1> R; // n estimates
int pos1[J];
int pos2[J];
...
}
parameters{
real<lower=0> sigma_theta;
vector[R] theta_raw;
vector[J] theta_init;
...
}
transformed parameters{
vector[R] theta;
for (j in 1:J) {
theta[ pos1[j] ] = theta_init[j];
theta[ (pos1[j] + 1) : pos2[j] ] = theta[ pos1[j] : (pos2[j] - 1) ] +
sigma_theta * theta_raw[ pos1[j] : (pos2[j] - 1) ];
}
...
}
model{
sigma_theta ~ normal(0, 2);
theta_init ~ normal(0, 1);
...
}
However, I get errors: theta appears to includes NAN’s. I believe my vectorised approach is correct (I have tested in using fake data in R) but perhaps my code is inappropriate for the transformed parameters block?