Irregularly spaced intervals in dynamic panel models

Hia,

I’ve been playing with @rtrangucci’s model from: Dynamic panel data models with Stan? but have not been able to apply it to a dataset with irregular intervals.

My initial attempt at specifying latent parameters between observations fails spectacularly as the number of missing observations increases. Does anyone have a better suggestion?

Much appreciated,
Andrew

Simulated dataset:

library(tidyverse)
library(rstan)

N_fields <- 3
sigma_p <- 3
p <- rnorm(N_fields, 0, sigma_p)

N_plots <- 235
pop <- sample(N_fields, N_plots, replace = T)
sigma_u = 0.1
u <- rnorm(N_plots, p[pop], sigma_u)

N_obs = 35
N = N_plots * N_obs
N_miss = 20               # change as needed
miss = sample(1:N_obs, N_miss)

N_trt = 7
trt = sample(N_trt, N_plots, replace = T)
beta = rnorm(N_trt, 0, 1)
beta[1] = 0
beta[3] = -4

delta = 0.9
sigma_e = .5

y = matrix(ncol = N_plots, nrow = N_obs)
y[1, ] = beta[trt] + u + rnorm(N_plots) * sigma_e / sqrt(1 - delta^2)

for(i in 2:N_obs) {
  mu = beta[trt] + u
  y[i, ] = rnorm(N_plots, delta * y[i-1, ] + mu, sigma_e)
}

x <- as.data.frame(y) %>%
  rowid_to_column("t") %>%
  gather(p, y, -t) %>%
  mutate(p = as.numeric(as.factor(p)),
         trt = rep(trt, each = N_obs),
         pop = rep(pop, each = N_obs),
         mis = if_else(t %in% miss, 1, 0),
         y = if_else(mis == 1, 0, y)) %>%
  arrange(t, p)

ggplot(x, aes(x = t, y = y, group = p)) +
  geom_line() + 
  facet_grid(pop ~ trt)

data_list <- list(
  N = N,
  N_mis = sum(x$mis),
  N_trt = N_trt,
  N_grp = 1,
  N_plt = N_plots,
  N_pop = N_fields,
  y_obs = x$y,
  missing = x$mis,
  trt = x$trt,
  plt = x$p,
  pop = x$pop
)

str(data_list)

Stan model:

data {
  int<lower=1> N;       // total
  int<lower=0> N_mis;   // missing
  int<lower=1> N_grp;   // types
  int<lower=1> N_trt;   // contrasts
  int<lower=1> N_plt;   // experiments
  int<lower=1> N_pop;   // populations
  
  vector[N] y_obs;      // abundance
  vector[N] missing;    // indicator

  int<lower=1, upper=N_trt> trt[N];
  int<lower=1, upper=N_plt> plt[N];
  int<lower=1, upper=N_pop> pop[N]; 
}
transformed data{
  // lagged response + indices
  int N_ini = N_grp * N_plt;
  int N_ip1 = N_ini + 1;
  int Nm_ini = N - N_ini;
  vector[Nm_ini] y_lag = y_obs[1:Nm_ini];
  }
parameters{
  // missing data
  vector[N_mis] y_mis;
  
  // predictors
  vector[N_trt - 1] beta;
  real mu_b;
  real<lower=0> sigma_b;
  
  // correlation
  real<lower=0, upper=1> delta_raw;
  
  // random effects
  vector[N_pop] p;
  vector[N_plt] u_raw;
  real<lower=0> sigma_u;
  
  // obs variance
  real<lower=0> sigma_e;
}
transformed parameters{
  real delta;
  vector[N_plt] u;
  vector[N] mu;
  
  // non-centered parameters
  delta = delta_raw * 2 - 1;
  u = u_raw * sigma_u;
  
  {
    vector[N_trt] beta_ref = append_row(0.0, beta);
    mu = p[pop] + beta_ref[trt] + u[plt];
  }
}
model{
  real om_dsq = 1 - delta^2;
  vector[N] y;
  
  //add missing data
  int m = 1;
  for(i in 1:N) {
    if(missing[i] == 1){
      y[i] = y_mis[m];
      m = m + 1;
    } else {
      y[i] = y_obs[i];
    }
  }

  // initial observations
  y[1:N_ini] ~ normal(mu[1:N_ini], sigma_e / sqrt(om_dsq));
  
  // remainder include lag term
  y[N_ip1:N] ~ normal(delta * y_lag + mu[N_ip1:N], sigma_e);
  
  // priors
  beta ~ normal(mu_b, sigma_b);
  mu_b ~ normal(0, 1);
  sigma_b ~ normal(0, 1);
  
  delta_raw ~ beta(4, 4);
  
  p ~ normal(0, 5);
  u_raw ~ normal(0, 1);
  sigma_u ~ normal(0, 1);
  sigma_e ~ normal(0, 1);
}

I wrote the R package ctsem for this kind of thing. It sets up this kind of model using Stan and has a variety of fitting approaches. For the super fast intro see the examples section of the ctStanFit function, or here’s some related works:

Hierarchical Bayesian continuous time system formulation
https://www.researchgate.net/publication/324093594_Hierarchical_Bayesian_Continuous_Time_Dynamic_Modeling

Modelling of exogenous inputs
https://www.researchgate.net/publication/323457904_Understanding_the_time_course_of_interventions_with_continuous_time_dynamic_models

New software paper – discussing Bayesian approach only but can be adjusted for frequentist / non-linear, though I still have to write about that :)
https://www.researchgate.net/publication/310747987_Introduction_to_Hierarchical_Continuous_Time_Dynamic_Modelling_With_ctsem

Original software paper – SEM oriented, frequentist mixed effects

1 Like

Thanks Charles, it’s a very impressive package! It ran a little slow on my simple model but I’ve found another solution from the econometrics literature.

(Millmet & McDonough, 2013; https://doi.org/10.1002/jae.2548) describe a method of quasi-differencing which weighs the contribution of each parameter by the number of time-steps between observations.

Because my model is primarily random-effects I chucked out their imputation of covariates and backwards-mean instrument (without fully understanding what it was) and was surprised to see it converged nicely and recovered the parameters well, except for some bias in two of the standard deviations.

Simplified version of eqn. 18 below. One down-side is that I also discarded the first observation. Appreciate any suggestions on how I might get around this.

# Data simulation from above
# ...

# Calculate gaps between observations
z <- filter(x, mis == 0) %>%
  group_by(p) %>%
  mutate(m = dense_rank(t),
         g_m = t - lag(t, default = 1),
         y_lag = lag(y, default = 0)) %>%
  filter(m >= 2)

# Run model with irregular observations
data_list <- list(
  N = nrow(z),
  N_trt = max(z$trt),
  N_grp = 1,
  N_plt = max(z$p),
  N_pop = max(z$pop),
  y_obs = z$y,
  y_lag = z$y_lag,
  g_m = z$g_m,
  trt = z$trt,
  plt = z$p,
  pop = z$pop
)

str(data_list)
data {
  int<lower=1> N;       // total
  int<lower=1> N_grp;   // types
  int<lower=1> N_trt;   // contrasts
  int<lower=1> N_plt;   // experiments
  int<lower=1> N_pop;   // populations
  
  vector[N] y_obs;      // abundance
  vector[N] y_lag;      // lagged obs.
  vector[N] g_m;        // time between obs.

  int<lower=1, upper=N_trt> trt[N];
  int<lower=1, upper=N_plt> plt[N];
  int<lower=1, upper=N_pop> pop[N]; 
}
parameters{
  // predictors
  vector[N_trt - 1] beta;
  real mu_b;
  real<lower=0> sigma_b;
  
  // correlation
  real<lower=0, upper=1> delta_raw;
  
  // random effects
  vector[N_pop] p;
  vector[N_plt] u_raw;
  real<lower=0> sigma_u;
  
  // obs. error
  real<lower=0> sigma_e;
}
transformed parameters{
  real delta;
  vector[N_plt] u;
  vector[N] mu;
  
  // non-centered parameters
  delta = delta_raw * 2 - 1;
  u = u_raw * sigma_u;
  
  {
    vector[N_trt] beta_ref = append_row(0.0, beta);
    mu = p[pop] + beta_ref[trt] + u[plt];
  }
}
model{
  {  
    vector[N] delta_gm = exp(log(delta) * g_m);
    vector[N] gm_adj = (1 - delta_gm) ./ (1 - delta);
    vector[N] mu_adj = gm_adj .* mu;
    vector[N] sigma_adj = gm_adj * sigma_e;
    
    y_obs ~ normal(delta_gm .* y_lag + mu_adj, sigma_adj);
  }
    
  // priors
  beta ~ normal(mu_b, sigma_b);
  mu_b ~ normal(0, 1);
  sigma_b ~ normal(0, 1);
  
  delta_raw ~ beta(4, 4);
  
  p ~ normal(0, 5);
  u_raw ~ normal(0, 1);
  sigma_u ~ normal(0, 1);
  sigma_e ~ normal(0, 1);
}

Just had a quick look but yeah, once the data are discretized appropriately standard missing data approaches also work well. Sometimes this discretization leads to huge amounts of missingness, and sometimes it is preferable to estimate the continuous time generative parameters directly (rather than the effects at some discrete time lag). Re speed of ctsem – yeah I should probably change the defaults to the more typical mixed effects rather than fully random effects, not sure if this was an issue depends how deep you looked into it… good that you’re on track anyway :)