Difficulties in coding autoregressive/panel model for ragged array

nomad · April 25, 2021, 1:10pm

I am having trouble coding an autoregressive panel (or times-series, cross-sectional) model where the main variable, theta, forms a ragged array, i.e, has different length time-series for each unit.

This is part of a larger model where theta is estimated via an IRT type measurement model. I have omitted those features from the below since that part of the model works fine. But the key point is that theta is estimated, not observed.

If I treat theta as a rectangular matrix, with the same length times-series for each unit, the model compiles and produces reasonable estimates:

data{
  int<lower=1> J;     // n countries
  int<lower=1> T;     // n years
  ...
}

parameters{
  real<lower=0> sigma_theta;	    
  matrix[T,J] theta_raw; 	        
  row_vector[J] theta_init;		
  ...
}

transformed parameters{
  matrix[T,J] theta; 	            
  theta[1] = theta_init; 
  for (t in 2:T) 	               
    theta[t] = theta[t-1] + sigma_theta * theta_raw[t-1];
  ...
}

model{				
  sigma_theta ~ normal(0, 2); 
  theta_init ~ normal(0, 1);
  for(t in 1:T) 
	theta_raw[t] ~ normal(0, 1);
  ...
}

I have coded the ragged array version of the model below, where theta is defined as a vector rather than a T by J matrix, and there are two vectors, pos1 and pos2, which indicate the start and end points of each unit’s time-series.

data{
  int<lower=1> J;     // n countries
  int<lower=1> R;     // n estimates
  int pos1[J];	
  int pos2[J];	
...
}

parameters{
  real<lower=0> sigma_theta;	    
  vector[R] theta_raw;	        
  vector[J] theta_init;		
  ...
}
  
transformed parameters{
  vector[R] theta;
  for (j in 1:J) {
	theta[ pos1[j] ] = theta_init[j];  
	theta[ (pos1[j] + 1) : pos2[j] ] = theta[ pos1[j] : (pos2[j] - 1) ] +
	sigma_theta * theta_raw[ pos1[j] : (pos2[j] - 1) ];
  }
  ...
}
  
model{
  sigma_theta ~ normal(0, 2); 
  theta_init ~ normal(0, 1);
  ...
 }

However, I get errors: theta appears to includes NAN’s. I believe my vectorised approach is correct (I have tested in using fake data in R) but perhaps my code is inappropriate for the transformed parameters block?

martinmodrak · April 30, 2021, 9:49am

I think you need to build theta using an explicit loop. When runing

theta[ (pos1[j] + 1) : pos2[j] ] = theta[ pos1[j] : (pos2[j] - 1) ] +
	sigma_theta * theta_raw[ pos1[j] : (pos2[j] - 1) ];

Stan will first compute the whole right-hand side before assigning, so the theta elements beyond pos1[j] are still NaN at this point.

I think all should work if you do something like (code not tested) instead:

transformed parameters{
  vector[R] theta;
  for (j in 1:J) {
	theta[ pos1[j] ] = theta_init[j];  
    for(t in (pos1[j] + 1) : pos2[j] ) {
        theta[t] = theta[t-1] + sigma_theta * theta_raw[t-1];
    }
  }
  ...
}

Best of luck with your model!

nomad · April 30, 2021, 2:35pm

Thank you very much for your reply @martinmodrak. Yes, I stumbled upon your solution a few days ago, i.e., using an explicit loop so that each element of the entire theta vector is assigned in turn. It indeed does the job!

Topic		Replies	Views
Ragged arrays and time series operations Modeling	15	1847	February 23, 2018
Problem when converting to a ragged array model (unsure why - syntax? typo?) Modeling specification	6	977	June 14, 2017
Creating a linear predictor Modeling	5	467	April 23, 2019
Fixed effects spatial panel data models with time-varying spatial autocorrelation parameters Modeling rstan , techniques , fitting-issues	0	416	May 18, 2023
Irregularly spaced intervals in dynamic panel models Modeling techniques , specification , performance	3	685	April 10, 2019

Difficulties in coding autoregressive/panel model for ragged array

Related topics