Gaussian distribution for logit-transformed binomial response to add an auto-regressive structure

Max_Mantei · June 13, 2019, 10:11pm

This is my approach after thinking about it and remembering that @jonah had something really clever in his pest control example in the tutorial in StanCon Helsinki (2018).

One problem might be that the time intervals are not evenly spaced. I’m not sure, but I think a continuous AR(1) process is actually a Ornstein-Uhlenbeck process. Also, you might want to check out the R ctsem by @Charles_Driver. I’ve never used it before, but I guess it’s exactly for stuff like this.

Anyways, this code runs the model for one of the ID (in this example ID == 3). You would have to extend it to more groups. Also, this code runs the AR(1) prior on the specific time points – since these are unevenly spaced, I also included two lines, which let’s you run the AR(1) process to model the correlation between days (it also models missing days automatically)…

library(tidyverse)
library(readr)
library(rstan)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

data <- read_csv("data1.csv", 
                 col_types = cols(X1 = col_skip())) %>% 
  arrange(utc) %>% 
  filter(ID == 3) %>% 
  mutate(#day = lubridate::floor_date(utc, unit = "day"), # un-comment if AR(1) by days
         #time = as.numeric(difftime(day, min(day), units = "days")) + 1) # un-comment if AR(1) by days
         time = 1:n()) # comment out if AR(1) by days

stan_data <- list()
stan_data <- within(stan_data,{
  N <- nrow(data)
  X <- model.matrix(~ 0 + var1 + var2 + var3 + var4 + var5, data = data)
  K <- ncol(X)
  t <- max(data$time)
  y <- data$y
  time_idx <- data$time
})

posterior <- stan(file = "stan_model.stan", data = stan_data)
print(posterior)

Where stan_model.stan is:

data{
  int<lower=0> N;
  int<lower=0> K;
  int y[N];
  matrix[N,K] X;
  int t;
  int<lower=1,upper=t> time_idx[N];
}
transformed data{
  matrix[N,K] Q = qr_thin_Q(X)*sqrt(N);
  matrix[K,K] R_inv = inverse(qr_thin_R(X)/sqrt(N));
}
parameters{
  real alpha;
  vector[K] beta_tilde;
  real<lower=0> sigma_time;
  real<lower=0,upper=1> rho_raw;
  vector[t] time_raw;
}
transformed parameters{
  vector[N] mu = alpha + Q*beta_tilde;

  // AR(1) process priors
  // (inspired by Jonah Gabry's Helsinki Stan tutorial)
  real rho = 2.0 * rho_raw - 1.0;
  vector[t] time = sigma_time * time_raw;
  time[1] /= sqrt(1 - rho^2);
  for (i in 2:t) {
    time[i] += rho * time[i-1];
  }
  
  mu += time[time_idx];
}
model{
    
  y ~ bernoulli_logit(mu);

  alpha ~ normal(-5, 5);
  beta_tilde ~ normal(0, 2.5);
  rho_raw ~ beta(10, 5);
  time_raw ~ std_normal();
  sigma_time ~ std_normal();

}
generated quantities{
  vector[K] beta = R_inv*beta_tilde;
}

Model for time points:
First is just the time variable from the model code, second is inv_logit(mu).

Model for days:
First is just the time variable from the model code, second is inv_logit(mu).

If all of this makes sense, I don’t know. But maybe this’ll give you a place to start.

Topic		Replies	Views
Correlation effects not produced with CAR structure specified brms	3	1439	June 21, 2021
Brms: fitting negative binomial model with an AR-correlation structure Modeling fitting-issues , brms	1	1338	June 7, 2021
Beta-Binomial: Why not a default family in the package? brms	2	1772	August 28, 2019
Spatial simultaneous autoregressive (SAR) structures for binomial regression using BRMS brms fitting-issues , specification	1	471	September 24, 2021
R package "brms" model specification Modeling rstan , specification , brms	2	240	May 1, 2024

Gaussian distribution for logit-transformed binomial response to add an auto-regressive structure

Related topics