Time of evening predictor in a categorical model

torricallan · May 12, 2022, 6:01am

I’m working on a categorical model (y = (1, 2, 3, 4)) where each category represents a distinct stage of sleep for a 5 minute period. A key predictor of sleep stage is the time of evening, where deep sleep might be more prevalent before midnight, and REM sleep more prevalent in the early hours of the morning.

Time of evening (x) is modelled using two trigonometric functions
\mu_1 \propto \beta_1 \sin(2 \pi x / 24) + \beta_2 \cos(2 \pi x / 24),
for each state (1, 2, 3). The 4th state is derived from the remaining 3 using the softmax function. Trigonometric functions allow hours in the evening to be represented as close to each other, without a discontinuity at midnight.

As per the standard advice, I simulated some data to understand how to best fit and recover trigonometric functions that predict a categorical outcome.

library(tidyverse)
library(cmdstanr)
library(posterior)
library(lubridate)
library(tidybayes)
library(scales)
library(bayesplot)

options(mc.cores = parallel::detectCores())

time_frame <- data.frame(timestamp_of_evening = seq(as_datetime("2022-04-01 20:00:00"), as_datetime("2022-04-02 06:00:00"), by = "min")) %>%
  mutate(time_of_evening = hour(timestamp_of_evening) + minute(timestamp_of_evening)/60)

amplitude_1_sin <- -1.0
amplitude_1_cos <- -1.0

amplitude_2_sin <- 1.0
amplitude_2_cos <- 1.0

amplitude_3_sin <- 0.25
amplitude_3_cos <- 0.25

f_1 <- amplitude_1_sin*sin(2*pi*time_frame$time_of_evening/24) + amplitude_1_cos*cos(2*pi*time_frame$time_of_evening/24)
f_2 <- amplitude_2_sin*sin(2*pi*time_frame$time_of_evening/24) + amplitude_2_cos*cos(2*pi*time_frame$time_of_evening/24)
f_3 <- amplitude_3_sin*sin(2*pi*time_frame$time_of_evening/24) + amplitude_3_cos*cos(2*pi*time_frame$time_of_evening/24)

log_sum_exp <- log(exp(f_1) + exp(f_2) + exp(f_3))

log_softmax_1 <- f_1 - log_sum_exp
log_softmax_2 <- f_2 - log_sum_exp
log_softmax_3 <- f_3 - log_sum_exp
log_softmax_4 <- 0 - log_sum_exp

p_1 <- exp(log_softmax_1)
p_2 <- exp(log_softmax_2)
p_3 <- exp(log_softmax_3)
p_4 <- exp(log_softmax_4)

y <- c()
for (i in 1:nrow(time_frame)){
  y[i] <- sample(c(1, 2, 3, 4), size = 1, replace = TRUE, prob = c(p_1[i], p_2[i], p_3[i], p_4[i]))
}

I’ve fit the simulated data with the following stan model

data {
  int<lower=1> K; // number of states
  int<lower=1> N;
  array[N] int y;
  vector[N] time_of_evening;
}
parameters {
 array[2] vector[K-1] time_of_evening_trig;
 corr_matrix[K-1] time_of_evening_corr;
 vector<lower=0>[K-1] time_of_evening_scale; 
 }
transformed parameters{
  vector[N] mu1;
  vector[N] mu2;
  vector[N] mu3;
  array[N] vector[K] mu;
  
  for (n in 2:N){
    mu1[n] = time_of_evening_trig[1,1]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,1]*cos(2*pi()*time_of_evening[n]/24);
    mu2[n] = time_of_evening_trig[1,2]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,2]*cos(2*pi()*time_of_evening[n]/24);
    mu3[n] = time_of_evening_trig[1,3]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,3]*cos(2*pi()*time_of_evening[n]/24);
    mu[n] = transpose([mu1[n], mu2[n], mu3[n], 0]);
  }
}
model {
  time_of_evening_scale ~ cauchy(0, 2.5);
  time_of_evening_corr ~ lkj_corr(2);
  time_of_evening_trig ~ multi_normal(rep_vector(0, K-1), quad_form_diag(time_of_evening_corr, time_of_evening_scale));

  for (n in 2:N)
      y[n] ~ categorical_logit(mu[n]);
}

datalist <- list(K = 4, N = nrow(time_frame), y = y, time_of_evening = time_frame$time_of_evening)
mod <- cmdstan_model("sin_model.stan")
fit <- mod$sample(data = datalist, chains = 4, iter_warmup = 1000, iter_sampling = 1000, adapt_delta = 0.99)

Even after setting adapt_delta = 0.99 I still get some divergent transitions. I’m not sure how to articulate this point clearly, but I suspect that the periodicity in one state ends up being correlated with the periodicity in another state. I’ve experimented with a multivariate prior, but as I don’t understand the issue well I’m not sure that this has helped.

The issue is that the model doesn’t quite fit the generated data - the shape of the curves are close to what is simulated, but there is a small amount of bias. In particular, the estimates for the parameters relating to the probability of being in state 1 are biased away from the true value (vertical dashed line), which results in a significant departure of the predicted probability of being in state 1 early in the evening, when compared to the simulated curve.

When I sample from the prior distribution only, the sampler performs much more poorly. I wondered at this point if that is because having a periodic predictor per state over-states the amount of information at hand. We know that if the chances of being in state 1 is higher earlier in the night, then we already know that the chances of being in states 2, 3 or 4 is much lower, without having to use time of evening as a predictor.

From the above, I then tried this model

transformed parameters{
  vector[N] mu1;
  vector[N] mu2;
  vector[N] mu3;
  array[N] vector[K] mu;
  
  for (n in 2:N){
    mu1[n] = time_of_evening_trig[1,1]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,1]*cos(2*pi()*time_of_evening[n]/24);
    mu2[n] = time_of_evening_trig[1,2]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,2]*cos(2*pi()*time_of_evening[n]/24);
    mu3[n] = time_of_evening_trig[1,3]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,3]*cos(2*pi()*time_of_evening[n]/24);
    mu[n] = transpose([mu1[n], 0, 0, 0]);
  }
}
model {
  time_of_evening_scale ~ cauchy(0, 2.5);
  time_of_evening_corr ~ lkj_corr(2);
  time_of_evening_trig ~ multi_normal(rep_vector(0, K-1), quad_form_diag(time_of_evening_corr, time_of_evening_scale));

  for (n in 2:N)
      y[n] ~ categorical_logit(mu[n]);
}

Where only the first state has a time of evening predictor. The resulting posterior predictions are shown below, where the fit for state 1 is still biased, and the fit for the other states is well off.

It seems to me that the way this model is parameterised is causing an issue with sampling. I’m not sure how to reparameterise it (or maybe use a different representation for time of evening will help), or if my diagnosis of the problem is correct. Any help would be much appreciated. Thanks

torricallan · May 12, 2022, 9:05am

Quick update - if I take out the predictor for the third state, and use univariate priors the model doesn’t seem to have any sampling issues. The fit is as good as using a predictor for three states, but still slightly biased.

transformed parameters{
  vector[N] mu1;
  vector[N] mu2;
  vector[N] mu3;
  array[N] vector[K] mu;
  
  for (n in 2:N){
    mu1[n] = time_of_evening_trig[1,1]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,1]*cos(2*pi()*time_of_evening[n]/24);
    mu2[n] = time_of_evening_trig[1,2]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,2]*cos(2*pi()*time_of_evening[n]/24);
    //mu3[n] = time_of_evening_trig[1,3]*sin(2*pi()*time_of_evening[n]/24) + time_of_evening_trig[2,3]*cos(2*pi()*time_of_evening[n]/24);
    mu[n] = transpose([mu1[n], mu2[n], 0, 0]);
  }
}
model {
  time_of_evening_trig[1] ~ std_normal();
  time_of_evening_trig[2] ~ std_normal();

  for (n in 2:N)
      y[n] ~ categorical_logit(mu[n]);
}

Topic		Replies	Views
Results from playing with periodic models Modeling specification	15	2255	June 20, 2024
Poor sampling in model of periodic response including a rate parameter Modeling	24	2114	January 17, 2018
Advice requested on using categorical_lpmf() in a complex model Modeling	4	517	November 5, 2025
Categorical Model is taking really long for estimation Modeling	2	592	November 24, 2020
Fitting a sine function by specifying a circular predictor brms fitting-issues , brms	5	254	February 11, 2025

Time of evening predictor in a categorical model

Related topics