Group Lag

Hi,
I am trying to fit a multilevel time series with lags. This is a toy dataset I created in order to ensure I am writing my model right.

bb <- data.frame(X = rnorm(10),s = rep(1:5,2), t = rep(1:2, each = 5))
dt <- list(N_Y = 1, N_T = 2, N_I = 5, Y1 = bb$X, Subj = bb$s)

My R model using dplyr is

library(dplyr)
bb %>% 
  group_by(s) %>% 
  mutate(b = lag(X))

# A tibble: 10 x 4
# Groups:   s [5]
        X     s     t      b
    <dbl> <int> <int>  <dbl>
 1  1.12      1     1 NA    
 2  0.121     2     1 NA    
 3  0.989     3     1 NA    
 4 -1.70      4     1 NA    
 5  1.71      5     1 NA    
 6 -1.86      1     2  1.12 
 7  0.179     2     2  0.121
 8 -1.19      3     2  0.989
 9 -0.211     4     2 -1.70 
10 -2.04      5     2  1.71

In Stan, my code is

data {
  //int N_Y;
  int N_I;
  int N_T;
  real Y1[N_I*N_T];
  int Subj[N_I*N_T];
}
transformed data{
  int N;
  N = N_I*N_T;
}

generated quantities {
  vector[N] YY;
  
  for(i in 2:N){
    YY[i] = Y1[Subj[i-1]];
  }
}

with the following results

        mean se_mean sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
YY[1]    NaN      NA NA    NA    NA    NA    NA    NA   NaN  NaN
YY[2]   1.12       0  0  1.12  1.12  1.12  1.12  1.12     1    1
YY[3]   0.12       0  0  0.12  0.12  0.12  0.12  0.12     1    1
YY[4]   0.99       0  0  0.99  0.99  0.99  0.99  0.99     1    1
YY[5]  -1.70       0  0 -1.70 -1.70 -1.70 -1.70 -1.70     1    1
YY[6]   1.71       0  0  1.71  1.71  1.71  1.71  1.71     1    1
YY[7]   1.12       0  0  1.12  1.12  1.12  1.12  1.12     1    1
YY[8]   0.12       0  0  0.12  0.12  0.12  0.12  0.12     1    1
YY[9]   0.99       0  0  0.99  0.99  0.99  0.99  0.99     1    1
YY[10] -1.70       0  0 -1.70 -1.70 -1.70 -1.70 -1.70     1    1
lp__    0.00     NaN  0  0.00  0.00  0.00  0.00  0.00   NaN  NaN

Thanks for your help

Hi Reg,

It’s not clear what your issue is. Would you be able to clarify what’s going wrong here?

Hi andrjohns,

I am trying to model Y_{i,t} = \alpha_{i}* Y_{i,t-1}, with the lagged time points (t) nested within the individual i.

But what help are you trying to get? Is there part of the output that’s not what you’re expecting or are you unsure of how to write the model?

I’m unsure of how to write the Stan code. My Stan code doesn’t match the R code. The R code is the objective.

Righto, so the issue there is your indexing:

  for(i in 2:N)
    YY[i] = Y1[Subj[i-1]];

Because Subj is just the values 1 to 5 repeated twice, Subj[i-1] only ever takes the values 1 to 5, so you’re selecting the first 5 values of Y1 twice, which is why your output isn’t lining up.
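
One way to stay in the long (stacked) format is to index by row offset instead of by Subj. The sketch below assumes the rows are ordered as in the toy data frame: all N_I subjects at t = 1, then all N_I subjects at t = 2, and so on, so the previous observation for row i sits N_I rows earlier. Declaring Y1 as a vector and filling the first time point with not_a_number() (to mirror the NA values from dplyr) are illustrative choices, not part of the original code.

```stan
data {
  int<lower=1> N_I;            // number of subjects
  int<lower=1> N_T;            // number of time points
  vector[N_I * N_T] Y1;        // stacked outcomes, ordered by time, then subject
}
transformed data {
  int N = N_I * N_T;           // total number of rows
}
generated quantities {
  vector[N] YY;
  for (i in 1:N) {
    if (i <= N_I) {
      YY[i] = not_a_number();  // first time point has no lagged value
    } else {
      YY[i] = Y1[i - N_I];     // same subject, previous time point
    }
  }
}
```

If the rows were sorted differently (for example by subject, then time), the offset would change, so this only holds under the ordering assumed above. Since the program has no parameters block, it would probably need to be run with the fixed-parameter sampler.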

Sorry to be a bother, but I am unsure of how to progress from here.
Even if I create a time variable t for the time points

for (i in 1:N) {
  if (time[i] > 1) {
    YY[i] = Y1[Subj[i-1]];
  }
}

I’m not sure of how to achieve an output similar to the R output.

The easiest way would be to pass Y1 as a matrix, rather than a vector, like:

  matrix[N_I, N_T] Y1;

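Then, with YY also declared as matrix[N_I, N_T], your loop would be:

for (i in 1:N_I) {
  for (t in 2:N_T) {
    // lag within subject i: take the value at the previous time point
    YY[i, t] = Y1[i, t - 1];
  }
}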
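
Putting the matrix declaration and a lagged loop together, a complete program might look like the following sketch. It assumes Y1 is passed with one row per subject and one column per time point, takes the lag as Y1[i, t - 1] with the loop over t starting at 2, and fills the first column with not_a_number() to play the role of NA in the dplyr output. The bounds on N_I and N_T are added for illustration.

```stan
data {
  int<lower=1> N_I;          // number of subjects
  int<lower=2> N_T;          // number of time points (at least 2 for a lag)
  matrix[N_I, N_T] Y1;       // row = subject, column = time point
}
generated quantities {
  matrix[N_I, N_T] YY;       // lagged copy of Y1
  for (i in 1:N_I) {
    YY[i, 1] = not_a_number();     // no previous value at the first time point
    for (t in 2:N_T) {
      YY[i, t] = Y1[i, t - 1];     // previous time point, same subject
    }
  }
}
```

On the R side, with the toy data ordered as in the original post, matrix(bb$X, nrow = 5, ncol = 2) should give this layout, because R fills matrices column by column.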

Alright. I’ll do that. I really appreciate the help.
Thank you very much, andrjohns.