Hi All,
I am trying to run model recovery for a Q-learning RL model (fitting the simulated data with the true and a wrong model, and comparing them using loo). Parameter recovery looks very good, and overall the model seems to predict the simulated data well. However, I get very bad diagnostics from the loo package… I saw this was discussed a few years ago, but I didn't find how to solve it. This is really frustrating, since my aim is to move on to more complicated RL models, but I can't seem to get the basic model recovery working.
Please - any thoughts on what I might be doing wrong here?
Thank you!
Nitzan

Here is my Stan code for the Q-learning model:
data {
//General fixed parameters for the experiment/models
int<lower = 1> Nsubjects; //number of subjects
int<lower = 1> Ntrials; //maximum number of trials per subject (without missing data). Used to form the subject x trials matrices.
int<lower = 1> Ntrials_per_subject[Nsubjects]; //number of trials left for each subject after data omission
int<lower = 2> Narms; //number of overall alternatives
int<lower = 2> Nraffle; //number of offers per trial
//Behavioral data:
//each variable being a subject x trial matrix
//the data is padded in make_standata function so that all subjects will have the same number of trials
int<lower = 0> action[Nsubjects,Ntrials]; //index of which arm was pulled coded 1 to 4
int<lower = 0> reward[Nsubjects,Ntrials]; //outcome of bandit arm pull
int<lower = 0> offer1[Nsubjects,Ntrials]; //index of the first arm offered on each trial
int<lower = 0> offer2[Nsubjects,Ntrials]; //index of the second arm offered on each trial
int<lower = 0> selected_offer[Nsubjects,Ntrials]; //which of the two offers was chosen (1 or 2)
}
transformed data{
int<lower = 1> Nparameters=2; //number of parameters
vector[Narms] Qvalue_initial; // initial values for Qvalues (defined here to avoid doing this many times across iterations)
Qvalue_initial = rep_vector(0.0, Narms);
}
parameters {
// Declare parameter vectors. The suffix "aux" indicates that the values are before transformation
//population level parameters
vector[Nparameters] mu_aux; //vector with the population level mean for each model parameter
vector<lower=0>[Nparameters] sigma_aux; //vector with the random effects scale (SD) for each model parameter
//individual level parameters (standardized, non-centered)
vector[Nsubjects] alpha_individal_aux;
vector[Nsubjects] beta_indvidial_aux;
}
transformed parameters {
//declare variables and parameters
vector<lower=0, upper=1>[Nsubjects] alpha;
vector<lower=0, upper=10>[Nsubjects] beta;
for (subject in 1:Nsubjects) {
alpha[subject] = Phi_approx(mu_aux[1] + sigma_aux[1] * alpha_individal_aux[subject]);
beta[subject] = Phi_approx(mu_aux[2] + sigma_aux[2] * beta_indvidial_aux[subject]) * 10;
}
}
model {
// population level priors (hyper-parameters)
mu_aux ~ normal(0, 1);
sigma_aux ~ normal(0, 0.5);
// individual level priors (subjects' parameters)
alpha_individal_aux ~ normal(0, 1);
beta_indvidial_aux ~ normal(0, 1);
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//Likelihood function per subject per trial
for (subject in 1:Nsubjects){
vector[Narms] Qcard;
vector[Nraffle] Qoffer;
Qcard=Qvalue_initial;
for (trial in 1:Ntrials_per_subject[subject]){
Qoffer[1]=Qcard[offer1[subject,trial]];
Qoffer[2]=Qcard[offer2[subject,trial]];
//likelihood function
selected_offer[subject, trial] ~ categorical_logit(beta[subject] * Qoffer);
//Qvalues update
Qcard[action[subject,trial]] += alpha[subject] * (reward[subject,trial] - Qcard[action[subject,trial]]);
}
}
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
}
generated quantities {
//define parameters that will be saved in the final output
real<lower=0, upper=1> mu_alpha;
real<lower=0, upper=10> mu_beta;
real log_lik[Nsubjects];
real y_pred[Nsubjects, Ntrials];
// Set all posterior predictions to -1 (avoids NULL values)
for (i in 1:Nsubjects) {
for (t in 1:Ntrials) {
y_pred[i, t] = -1;
}
}
mu_alpha=Phi_approx(mu_aux[1]);
mu_beta=Phi_approx(mu_aux[2])*10;
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//Likelihood function per subject per trial (placed in the generated quantities block to save time and memory)
{ //
for (subject in 1:Nsubjects) {
vector[Narms] Qcard;
vector[Nraffle] Qoffer;
Qcard=Qvalue_initial;
log_lik[subject] = 0;
for (trial in 1:Ntrials_per_subject[subject]){
//offer values
Qoffer[1]=Qcard[offer1[subject,trial]];
Qoffer[2]=Qcard[offer2[subject,trial]];
// compute log likelihood of current trial
log_lik[subject] += categorical_logit_lpmf(selected_offer[subject, trial] | beta[subject] * Qoffer);
// posterior prediction for the current trial: store the predicted probability of the chosen offer
//y_pred[subject, trial] = categorical_rng(softmax(beta[subject] * Qoffer));
y_pred[subject, trial] = softmax(beta[subject] * Qoffer)[selected_offer[subject, trial]];
//Qvalues update
Qcard[action[subject,trial]] += alpha[subject] * (reward[subject,trial] - Qcard[action[subject,trial]]);
}
}
}
}
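For context, this is roughly how I compute and compare loo in R (a minimal sketch; fit_true and fit_wrong are placeholder names for the rstan fits of the true and wrong model):

library(rstan)
library(loo)

# "log_lik" is the per-subject log-likelihood from the generated quantities block
ll_true  <- extract_log_lik(fit_true,  parameter_name = "log_lik", merge_chains = FALSE)
ll_wrong <- extract_log_lik(fit_wrong, parameter_name = "log_lik", merge_chains = FALSE)

# relative effective sample sizes, needed for the PSIS diagnostics
r_eff_true  <- relative_eff(exp(ll_true))
r_eff_wrong <- relative_eff(exp(ll_wrong))

loo_true  <- loo(ll_true,  r_eff = r_eff_true)
loo_wrong <- loo(ll_wrong, r_eff = r_eff_wrong)

print(loo_true)
loo_compare(loo_true, loo_wrong)

This is the output I get for the true model: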
Computed from 12000 by 20 log-likelihood matrix

         Estimate    SE
elpd_loo  -1875.5  83.0
p_loo        32.6   2.0
looic      3751.0 166.0
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                        Count Pct.   Min. n_eff
(-Inf, 0.5]  (good)      0     0.0%  <NA>
 (0.5, 0.7]  (ok)        0     0.0%  <NA>
   (0.7, 1]  (bad)      16    80.0%  21
   (1, Inf)  (very bad)  4    20.0%  9

See help('pareto-k-diagnostic') for details.