I have various time-series of different lengths (usually called panel data). I want to run AR(K=3) regression on each of them but some of them are too short for a meaningful regression so I decided to use a Bayesian Hierarchical model in which the 3 multipliers for each series are sampled from a common multivariate normal distribution, so far so good, the model fits just fine and there are no divergent transitions and all Rhat values are extremely close to 1.

My question is: my Stan code effectively doesnâ€™t model data (and parameters) for any time-series that has less than 4 observations, is that the right thing to do? I think I have no choice but to skip the series with just 1 observation but what about series with 2 or 3 observations?

Below is my Stan code:

```
data {
int<lower=1> N; // total number of observations
int<lower=1> K; // number of lagged factor
int<lower=1> numG; // number of groups
vector[N] y; // data for Hierarchical AR(K)
int sizes[numG]; // sizes of observations across groups
}
parameters {
real<lower = 0> local_sigma;
vector<lower = 0>[K] global_sigma;
vector[K] z[numG];
vector<lower = -2, upper = 2>[K] global_beta; // global AR(K) multipliers
}
transformed parameters {
vector[K] group_betas[numG]; // group level AR(K) multipliers
for(i in 1:numG) {
group_betas[i] = global_beta + global_sigma .* z[i]; // NCP formulation, no correlations
}
}
model {
int pos;
pos = 1;
for(i in 1:numG) { // loop over groups
int local_N;
vector[sizes[i]] local_y;
local_N = sizes[i];
local_y = segment(y, pos, local_N);
for (n in (K + 1):local_N) { // loop over observations
real mu;
mu = 0;
for (k in 1:K) { // loop over lags
mu = mu + group_betas[i][k] * local_y[n - k];
}
y[n] ~ normal(mu, local_sigma);
}
z[i] ~ normal(0, 1);
pos = pos + local_N;
}
// hyperpriors
local_sigma ~ cauchy(0, 2);
global_sigma ~ cauchy(0, 2);
global_beta ~ cauchy(0, 2);
}
```