Difficulties in coding autoregressive/panel model for ragged array

I am having trouble coding an autoregressive panel (or times-series, cross-sectional) model where the main variable, theta, forms a ragged array, i.e, has different length time-series for each unit.

This is part of a larger model where theta is estimated via an IRT type measurement model. I have omitted those features from the below since that part of the model works fine. But the key point is that theta is estimated, not observed.

If I treat theta as a rectangular matrix, with the same length times-series for each unit, the model compiles and produces reasonable estimates:

data{
  int<lower=1> J;     // n countries
  int<lower=1> T;     // n years
  ...
}

parameters{
  real<lower=0> sigma_theta;	    
  matrix[T,J] theta_raw; 	        
  row_vector[J] theta_init;		
  ...
}

transformed parameters{
  matrix[T,J] theta; 	            
  theta[1] = theta_init; 
  for (t in 2:T) 	               
    theta[t] = theta[t-1] + sigma_theta * theta_raw[t-1];
  ...
}

model{				
  sigma_theta ~ normal(0, 2); 
  theta_init ~ normal(0, 1);
  for(t in 1:T) 
	theta_raw[t] ~ normal(0, 1);
  ...
}

I have coded the ragged array version of the model below, where theta is defined as a vector rather than a T by J matrix, and there are two vectors, pos1 and pos2, which indicate the start and end points of each unit’s time-series.

data{
  int<lower=1> J;     // n countries
  int<lower=1> R;     // n estimates
  int pos1[J];	
  int pos2[J];	
...
}

parameters{
  real<lower=0> sigma_theta;	    
  vector[R] theta_raw;	        
  vector[J] theta_init;		
  ...
}
  
transformed parameters{
  vector[R] theta;
  for (j in 1:J) {
	theta[ pos1[j] ] = theta_init[j];  
	theta[ (pos1[j] + 1) : pos2[j] ] = theta[ pos1[j] : (pos2[j] - 1) ] +
	sigma_theta * theta_raw[ pos1[j] : (pos2[j] - 1) ];
  }
  ...
}
  
model{
  sigma_theta ~ normal(0, 2); 
  theta_init ~ normal(0, 1);
  ...
 }

However, I get errors: theta appears to includes NAN’s. I believe my vectorised approach is correct (I have tested in using fake data in R) but perhaps my code is inappropriate for the transformed parameters block?

2 Likes

I think you need to build theta using an explicit loop. When runing

theta[ (pos1[j] + 1) : pos2[j] ] = theta[ pos1[j] : (pos2[j] - 1) ] +
	sigma_theta * theta_raw[ pos1[j] : (pos2[j] - 1) ];

Stan will first compute the whole right-hand side before assigning, so the theta elements beyond pos1[j] are still NaN at this point.

I think all should work if you do something like (code not tested) instead:

transformed parameters{
  vector[R] theta;
  for (j in 1:J) {
	theta[ pos1[j] ] = theta_init[j];  
    for(t in (pos1[j] + 1) : pos2[j] ) {
        theta[t] = theta[t-1] + sigma_theta * theta_raw[t-1];
    }
  }
  ...
}

Best of luck with your model!

1 Like

Thank you very much for your reply @martinmodrak. Yes, I stumbled upon your solution a few days ago, i.e., using an explicit loop so that each element of the entire theta vector is assigned in turn. It indeed does the job!

1 Like