Two-armed bandit hierarchical reinforcement learning model - interpreting conflicting loo and posterior predictive check results

milena_musial · January 16, 2024, 2:13pm

Thanks for your reply! As you recommended, I am first trying to understand why PSIS is failing before switching to k-fold CV as an alternative.

As you did not mention values for kS and kV, I’m not able to do full count, but I assume that the large number of high khats is due to very flexible model.

You are right, I do have many more parameters than just 33 (kS and kV are both 1). This was a reasoning error on my part, and the larger number of parameters does make the model highly flexible. I will try to simplify the model over the next few days by dropping the random slope and keeping only a random intercept.

PPC’s you used are not useful for binary target as it is sufficient to have just one intercept parameter to get the proportions of two classes right. It would be better to use calibration or reliability plots as illustrated in Bayesian Logistic Regression with rstanarm

Thanks for the reference! I used the recommended CORP approach by Dimitriadis, Gneiting, Jordan (2021) to create a calibration plot:

rd_PH_withC

# A tibble: 1 × 5
  forecast mean_score miscalibration discrimination uncertainty
  <chr>         <dbl>          <dbl>          <dbl>       <dbl>
1 EMOS         0.0996        0.00203         0.0537       0.151
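
As far as I understand the CORP decomposition, the columns are related by mean_score = miscalibration - discrimination + uncertainty. Here, 0.00203 - 0.0537 + 0.151 ≈ 0.0993, which agrees with the reported 0.0996 up to rounding of the displayed values. Below is a minimal base-R sketch of that decomposition for the Brier score (which I assume is the score used in the summary). It uses simulated data rather than my own, `isoreg()` for the isotonic (PAV) recalibration, and variable names that are made up for illustration:

```r
# Simulated example (hypothetical data, not from the study)
set.seed(123)
n <- 2000
p_pred <- sort(runif(n))             # predicted P(choice = 1), sorted for isoreg()
choice <- rbinom(n, 1, p_pred^1.3)   # outcomes from a slightly miscalibrated "truth"

brier <- function(p, y) mean((p - y)^2)

# Isotonic (PAV) recalibration of the predictions
p_recal <- isoreg(p_pred, choice)$yf

s_model <- brier(p_pred, choice)         # mean score of the original predictions
s_recal <- brier(p_recal, choice)        # mean score after recalibration
s_ref   <- brier(mean(choice), choice)   # mean score of the constant base-rate forecast

mcb <- s_model - s_recal   # miscalibration
dsc <- s_ref - s_recal     # discrimination
unc <- s_ref               # uncertainty

print(c(mean_score = s_model, miscalibration = mcb,
        discrimination = dsc, uncertainty = unc))
```

Read this way, a small miscalibration term relative to the discrimination term would suggest that the predicted probabilities are close to the observed frequencies overall.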

I’m still having some issues interpreting the plot.

From what I understand, the x-axis shows the binned model-predicted probability of choice = 1 per trial.
The y-axis then shows the observed proportion of choice = 1 among the trials whose predicted probability falls in the respective bin on the x-axis.
If that is correct, the plot shows that for trials with a predicted choice probability of around 55% or lower, the observed proportion is lower than the predicted probability.
In less technical terms: in trials where the model predicts that choice = 1 is unlikely (<50% choice probability), is it actually even less likely?
Any feedback on my interpretation is welcome, as I could be totally off.
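
To make my reading of the axes concrete, here is a small base-R sketch with simulated data (not my data). It uses fixed-width bins instead of the data-driven bins of the CORP approach, and the "true" probabilities are deliberately set below the predicted ones, so it only illustrates the comparison I describe above:

```r
# Simulated example (hypothetical data, not from the study)
set.seed(42)
n <- 5000
p_pred <- runif(n)                   # predicted P(choice = 1) per trial
choice <- rbinom(n, 1, p_pred^1.3)   # true probability lies below the predicted one

# x-axis: bin the predicted probabilities
bins <- cut(p_pred, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)

# y-axis: observed proportion of choice = 1 within each bin
calib <- data.frame(
  bin            = levels(bins),
  mean_predicted = as.vector(tapply(p_pred, bins, mean)),
  observed_freq  = as.vector(tapply(choice, bins, mean)),
  n_trials       = as.vector(table(bins))
)
print(calib)
```

Bins where observed_freq is below mean_predicted correspond to points below the diagonal in the plot.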

Why did you get NULL values? If there are -999 values in log_lik, then LOO computation will be garbage.

My log_likelihood matrix initially contains columns filled with -999, because some participants did not make a choice in some of the 50 trials per condition. I manually exclude the columns containing -999 before computing loo with the code pasted below. Does that make sense?

library(loo)

# extract log likelihood for each choice (draws x trials matrix)
log_likelihood <- extract_log_lik(fit, parameter_name = "log_lik", merge_chains = TRUE)

# exclude missing trials (columns coded as -999, checked in the first draw)
log_likelihood <- log_likelihood[, log_likelihood[1, ] != -999]

# print and plot loo
loo1 <- loo(log_likelihood)
print(loo1)
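
The filter above only looks at the first posterior draw, so it relies on -999 being written for a missed trial in every draw. A minimal sketch of a slightly more defensive version on a toy matrix (the matrix and the names `log_lik_toy`, `is_missing` and `log_lik_clean` are made up for illustration):

```r
# Toy log-likelihood matrix (hypothetical): 4 posterior draws x 5 trials.
# Trials 2 and 5 are missed trials, coded as -999 in every draw.
log_lik_toy <- matrix(-0.7, nrow = 4, ncol = 5)
log_lik_toy[, c(2, 5)] <- -999

# Flag a column if any draw carries the sentinel value
n_sentinel <- colSums(log_lik_toy == -999)
is_missing <- n_sentinel > 0

# Sanity check: the sentinel should fill a column entirely or not at all
stopifnot(all(n_sentinel %in% c(0, nrow(log_lik_toy))))

# drop = FALSE keeps the matrix shape even if only one trial remains
log_lik_clean <- log_lik_toy[, !is_missing, drop = FALSE]
dim(log_lik_clean)
```

The cleaned matrix can then be passed to loo() in the same way as above.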

Great thanks and best,
Milena
