Hey Stan Community!
I am fairly new to Stan but quite hooked on coding with it. I am currently building Stan implementations of Rescorla-Wagner models, similar to how hBayesDM does it. In essence, we are trying to sample the hyperparameters of a model that outputs a sequence of decisions (1 or 0, depending on whether the response was “Go” or “NoGo”).
My model runs fine during sampling, but I keep getting this warning every now and then (for most chains, a few times each):
Ey3kLfskBNPFiYnTrx8Kx3-output-4.csv: Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Ey3kLfskBNPFiYnTrx8Kx3-output-4.csv: Exception: bernoulli_logit_lpmf: Logit transformed probability parameter is -nan, but must be not nan! (in ‘/project/3017083.01/behavioral_study/scripts/matlab_scripts/RescorlaWagnerModel/stanModels/DynamicMotivationalBiasModel.stan’, line 95, column 16 to column 94)
Ey3kLfskBNPFiYnTrx8Kx3-output-4.csv: If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Ey3kLfskBNPFiYnTrx8Kx3-output-4.csv: but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
It comes up systematically, so I am unsure whether I should ignore this or not. This is my model:
data {
  int<lower=1> nTrials;
  int<lower=1> nBlocks;
  int<lower=1> nSubjects; // When doing model parameter retrieval, this is 1.
  int<lower=1, upper=4> state[nSubjects, nBlocks, nTrials]; // declared with lower and upper bounds
  int<lower=0, upper=1> action[nSubjects, nBlocks, nTrials];
  int<lower=-1, upper=1> feedback[nSubjects, nBlocks, nTrials];
}
transformed data {
  vector[4] initQ; // initial values for Q
  initQ = [0.5, -0.5, 0.5, -0.5]';
}
parameters {
  vector[4] mu_pr;          // group-level locations for ep, rho, goBias, pi
  vector<lower=0>[4] sigma; // group-level scales for ep, rho, goBias, pi
  matrix[nSubjects, nBlocks] ep_pr;  // learning rate epsilon, untransformed
  matrix[nSubjects, nBlocks] rho_pr; // feedback sensitivity rho, untransformed
  matrix[nSubjects, nBlocks] goBias_pr;
  matrix[nSubjects, nBlocks] pi_pr;
}
transformed parameters {
  // Define the actual epsilon and rho.
  matrix<lower=0, upper=1>[nSubjects, nBlocks] ep;
  matrix<lower=0>[nSubjects, nBlocks] rho;
  matrix[nSubjects, nBlocks] goBias;
  matrix[nSubjects, nBlocks] pi;
  // Map untransformed epsilon values into (0, 1).
  for (s in 1:nSubjects) {
    for (b in 1:nBlocks) {
      ep[s, b] = Phi_approx(mu_pr[1] + sigma[1] * ep_pr[s, b]);
    }
  }
  // Confine rho to a strictly positive range (exp() does that).
  rho = exp(mu_pr[2] + sigma[2] * rho_pr);
  goBias = mu_pr[3] + sigma[3] * goBias_pr;
  pi = mu_pr[4] + sigma[4] * pi_pr;
}
model {
  // Hyperpriors on the mu_pr and sigma vectors.
  mu_pr[1:2] ~ normal(0, 1.0);
  mu_pr[3:4] ~ normal(0, 10.0);
  sigma[1:2] ~ normal(0, 0.2);
  sigma[3:4] ~ cauchy(0, 1.0);
  for (s in 1:nSubjects) {
    // Individual parameters, non-centered ("Matt trick") reparameterization.
    ep_pr[s] ~ normal(0, 1.0);
    rho_pr[s] ~ normal(0, 1.0);
    goBias_pr[s] ~ normal(0, 1.0);
    pi_pr[s] ~ normal(0, 1.0);
    for (b in 1:nBlocks) {
      // Reset for every block (closest to how people might perceive it).
      vector[4] w_g;  // action weight for go
      vector[4] w_ng; // action weight for nogo
      vector[4] q_g;  // Q value for go
      vector[4] q_ng; // Q value for nogo
      vector[4] pGo;  // probability of go (press)
      vector[4] sv;   // stimulus value
      // Initialize column vectors.
      w_g  = initQ * rho[s, b];
      w_ng = initQ * rho[s, b];
      q_g  = initQ * rho[s, b];
      q_ng = initQ * rho[s, b];
      sv   = initQ * rho[s, b];
      for (t in 1:nTrials) {
        w_g[state[s, b, t]]  = q_g[state[s, b, t]] + goBias[s, b] + pi[s, b] * sv[state[s, b, t]];
        w_ng[state[s, b, t]] = q_ng[state[s, b, t]];
        // Equivalent to bernoulli(inv_logit(...)).
        action[s, b, t] ~ bernoulli_logit(w_g[state[s, b, t]] - w_ng[state[s, b, t]]);
        sv[state[s, b, t]] += ep[s, b] * (rho[s, b] * feedback[s, b, t] - sv[state[s, b, t]]);
        if (action[s, b, t]) { // update go value
          q_g[state[s, b, t]] += ep[s, b] * (rho[s, b] * feedback[s, b, t] - q_g[state[s, b, t]]);
        } else { // update no-go value
          q_ng[state[s, b, t]] += ep[s, b] * (rho[s, b] * feedback[s, b, t] - q_ng[state[s, b, t]]);
        } // end of if clause
      } // end of t loop
    } // end of b loop
  } // end of s loop
} // end of model block
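For reference, the non-centered transform used for the learning rate can be sketched in plain Python. Phi_approx is Stan's logistic approximation to the standard normal CDF, and the values of mu, sigma, and z below are purely illustrative:

```python
import math

def phi_approx(x):
    # Stan's Phi_approx: inv_logit(0.07056 * x^3 + 1.5976 * x),
    # a logistic approximation to the standard normal CDF.
    return 1.0 / (1.0 + math.exp(-(0.07056 * x**3 + 1.5976 * x)))

# Non-centered ("Matt trick") transform: a standard-normal draw z
# is shifted/scaled by (mu, sigma) and squashed into (0, 1).
mu, sigma, z = 0.0, 0.2, 1.3
ep = phi_approx(mu + sigma * z)
assert 0.0 < ep < 1.0
```

So ep is always a valid learning rate in (0, 1), whatever z is.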
I am coding in Matlab, using MatlabStan. The data I feed into the model is of the following form:
d =
struct with fields:
nTrials: 320
nBlocks: 8
nSubjects: 1
state: [1×8×320 double]
action: [1×8×320 double]
feedback: [1×8×320 double]
Since this concerns the bernoulli_logit likelihood, whose outcome should be 1 if the “coin flip” is Go and 0 if it isn’t:
“action” is originally a matrix filled with 1 for “Go” and 2 for “NoGo”. I map every 2 to 0, so that “action” contains only 1’s and 0’s, coding “Go” and “NoGo” respectively.
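In Python/NumPy terms (just a sketch; the actual preprocessing happens in Matlab), the remapping is simply:

```python
import numpy as np

# Recode responses: 1 stays "Go", 2 ("NoGo") becomes 0,
# matching the <lower=0, upper=1> bound on `action` in the Stan model.
raw = np.array([1, 2, 2, 1, 2])
action = np.where(raw == 2, 0, raw)
# -> array([1, 0, 0, 1, 0])
```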
I am happy to provide more details if necessary. I don’t fully understand why the bernoulli_logit function would produce nans. This has occurred even in my simplest models, so I guess it has something to do with the priors that feed into the likelihood for the action matrix?
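One candidate mechanism I can think of (an unconfirmed guess from the model structure): since rho = exp(mu_pr[2] + sigma[2] * rho_pr), an extreme proposal could overflow exp() to inf, making both action weights infinite so that their difference is nan:

```python
import numpy as np

# Sketch of the suspected failure mode: an extreme proposal
# overflows exp(), so rho becomes inf.
rho = np.exp(800.0)      # inf (with a RuntimeWarning)
init_q = 0.5
w_g  = init_q * rho      # inf (go weight, before trial updates)
w_ng = init_q * rho      # inf (nogo weight)
logit = w_g - w_ng       # inf - inf -> nan, fed to bernoulli_logit
```

If that is what happens, the sampler would reject such proposals, which matches the sporadic "about to be rejected" warning.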
Any help is greatly appreciated!!!
Best
Samu