Model Help: Rolling Regression and Varying Slopes

Hi all,

I tend to use brms to do modelling with Stan and am a little out of my depth when it comes to implementing some types of model. I was wondering if anyone could help.

Imagine that I want to model how two variables, x_1 and x_2 affect some binary variable y. But assume also that I have some prior knowledge that their effects will change over time. For the avoidance of doubt, I have data on multiple respondents at each time point.

To model this, I use a “rolling regression” as follows:

y_{i} \sim \mathrm{Bernoulli}(\pi_{I}) \\ logit(\pi_{i}) = \alpha_{t} + \beta_{1t} x_{1ti} + \beta_{2t} x_{2ti}\\ \alpha_{1} \sim \mathrm{Normal}(0, 1.5) \\ \alpha_{t} \sim \mathrm{Normal}(\alpha_{t-1}, \sigma_{\alpha}) \text{ for } t \in [2, \dots, T] \\ \beta_{1,1} \sim \mathrm{Normal}(0, 0.5) \\ \beta_{1,t} \sim \mathrm{Normal}(\beta_{1, t-1}, \sigma_{\beta_{1}}) \text{ for } t \in [2, \dots, T] \\ \beta_{2,1} \sim \mathrm{Normal}(0, 0.5) \\ \beta_{2,t} \sim \mathrm{Normal}(\beta_{2, t-1}, \sigma_{\beta_{2}}) \text{ for } t \in [2, \dots, T] \\

What do I need to do to specify a covariance structure to allow for pooling across \alpha, \beta_{1}, and \beta_{2}? (Any other tips welcome too!)

At the moment, my Stan code is as follows:

data {
  int<lower = 1> N; // Number of cases in the data
  int<lower = 1> T; // Number of days in the data
  int Y[N]; 
  vector[N] x1; 
  vector[N] x2; 
  int<lower = 1> time[N]; // Time tracker
  vector[N] w8; // Weights
}
parameters {
  vector[T] alpha; // Intercept
  vector[T] beta1; 
  vector[T] beta2; 
  real<lower = 0> sigma_alpha; // Intercept variability
  real<lower = 0> sigma_beta1; 
  real<lower = 0> sigma_beta2; 
}
model{
  
  // Initialise linear predictor
  vector[N] mu;
  
  // Specify model
  for(n in 1:N){
    mu[n] = alpha[time[n]] + beta1[time[n]] * x1[n] + beta2[time[n]] * x1[n];
  }
  
  // Adaptive autoregressive prior on alpha
  target += normal_lpdf(alpha[1] | 0, 1.5);
  for(t in 2:T){
    target += normal_lpdf(alpha[t] | alpha[t - 1], sigma_alpha);
  }
  target += exponential_lpdf(sigma_alpha | 2);
  
  // Adaptive autoregressive prior on beta1
  target += normal_lpdf(beta1[1] | 0, 0.5);
  for(t in 2:T){
    target += normal_lpdf(beta1[t] | beta1[t - 1], sigma_alpha);
  }
  target += exponential_lpdf(sigma_beta1 | 2);
  
  // Adaptive autoregressive prior on beta2
  target += normal_lpdf(beta2[1] | 0, 0.5);
  for(t in 2:T){
    target += normal_lpdf(beta2[t] | beta2[t - 1], sigma_alpha);
  }
  target += exponential_lpdf(sigma_beta2 | 2);
  
  // Likelihood functions
  for(n in 1:N){
    target += w8[n] * (bernoulli_logit_lpmf(Y[n] | mu[n]));
  }
}
1 Like

Sorry for not getting to you earlier, your question is relevant and well written. Did you manage to resolve this in the meantime?

I am not sure I completely follow - it seems to me you are actually already pooling (by assuming successive \beta are similar. You may obviously have all \beta_{1,t} drawn from a T dimensional multivariate normal with more correlation between closer time points, which would essentially give you a Gaussian process prior on \beta_{1}.

Alternatively, you could constrain the time evolution of the process even more and use splines for your time-varying parameters…

Note that brms would support both splines and GPs for your predictors.

At first glance however, your model code looks good and if you don’t have any warnings/failed diagnostics, I wouldn’t be too afraid to use it.

Does that make sense?

Hi, it is an interesting discussion you have started, and relevant for others, too. It would be very nice and fruitful to see how the discussion develops if it could continue. Rolling models have been discussed less often I found by using ‘rolling’ as a keyword when searching the topics.
Regards, tsuli