MCMC approximation of posterior for a simple model of human attention (Divergent transitions)

mrnoodle · November 19, 2022, 10:44pm

Hey friends! I’m working on a probabilistic model of human attention. The model is defined by the following problem:

We want to learn the \mu and \sigma of a Gaussian random variable y via Bayesian inference. We place a normal prior on \mu and a gamma prior on \sigma, and we want to update both given observations. However, we cannot observe y directly. Instead, we can only take noisy samples z ~ N(y, noise), where the noise standard deviation itself has a gamma prior.
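For reference, the hierarchy described above can be written out as follows. This is only a restatement of the Stan program in this post, not an additional assumption about the model. The symbol e_m is shorthand introduced here for exemplar_idx[m]; the second argument of N is a standard deviation and Gamma uses the shape/rate convention, as in Stan.

```latex
\begin{aligned}
\text{noise} &\sim \mathrm{Gamma}(\alpha_{\text{noise}}, \beta_{\text{noise}}) \\
\mu_f &\sim \mathcal{N}(\mu_{\text{mean}}, \mu_{\text{sd}}), & f &= 1,\dots,F \\
\sigma_f &\sim \mathrm{Gamma}(\alpha_{\sigma}, \beta_{\sigma}), & f &= 1,\dots,F \\
y_{f,k} &\sim \mathcal{N}(\mu_f, \sigma_f), & k &= 1,\dots,K \\
z_{f,m} &\sim \mathcal{N}(y_{f,e_m}, \text{noise}), & m &= 1,\dots,M
\end{aligned}
```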

Stan Model

data {
    int<lower=1> F; // number of features
    int<lower=1> M; // total number of noisy samples
    int<lower=1> K; // number of y's (exemplars)
    matrix[F, M] z; // noisy samples (rows are features, columns are samples)

    // exemplar index (1..K) for each of the M samples;
    // array syntax required by recent Stan versions
    array[M] int<lower=1, upper=K> exemplar_idx;

    // hyper priors
    real mu_mean;
    real<lower=0> mu_sd;

    real<lower=0> sigma_alpha;
    real<lower=0> sigma_beta;

    real<lower=0> noise_alpha; // check what priors to provide
    real<lower=0> noise_beta;

}
parameters {
    vector[F] mu;
    vector<lower=0>[F] sigma;
    matrix[F, K] y;
    real<lower=0> noise;

}
model {

    noise ~ gamma(noise_alpha, noise_beta);

    // loop through features
    for (f in 1:F){

        mu[f] ~ normal(mu_mean, mu_sd);
        sigma[f] ~ gamma(sigma_alpha,sigma_beta);

        // loop through y's
        for (k in 1:K){
            y[f, k] ~ normal(mu[f], sigma[f]);
        }

        // multiple z observations
        for (m in 1:M){
            z[f, m] ~ normal(y[f, exemplar_idx[m]], noise);
        }
    } 
}
generated quantities {
    vector[F] z_rep;

    // one replicated noisy sample of the last exemplar, per feature
    for (f in 1:F){
        z_rep[f] = y[f, K] + normal_rng(0, noise);
    }
}

And an example data dictionary (where z is the observation):

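import numpy as np

data = {
    'mu_mean': 0,
    'mu_sd': 0.5,
    'sigma_alpha': 2,
    'sigma_beta': 2,
    # the next three keys are not declared in the Stan data block above
    'epsilon_alpha': 1,
    'epsilon_beta': 1,
    'noise': 0.4,
    'F': 1,
    'noise_alpha': 7.5,
    'noise_beta': 1,
    'M': 1,
    'K': 1,
    'z': np.array([[1.04543573]]),  # shape (F, M)
    'exemplar_idx': [1],            # 1-based, length M
}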
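The example above has a single observation (F = M = K = 1). As a rough sketch, a larger synthetic data set with the same keys could be drawn from the priors like this. The function name simulate_data and its seed argument are hypothetical and not part of the original post; only the Python standard library is used. Note that Stan's gamma(alpha, beta) takes a rate, while random.gammavariate takes a scale, so the rate is inverted.

```python
import random


def simulate_data(F, K, M, mu_mean, mu_sd, sigma_alpha, sigma_beta,
                  noise_alpha, noise_beta, seed=0):
    """Draw one synthetic data set from the priors (hypothetical helper)."""
    rng = random.Random(seed)

    # Stan's gamma(alpha, beta) is shape/rate; gammavariate is shape/scale.
    noise = rng.gammavariate(noise_alpha, 1.0 / noise_beta)

    # Assign each of the M noisy samples to an exemplar (1-based, as in Stan).
    exemplar_idx = [rng.randint(1, K) for _ in range(M)]

    z = []  # F rows (features), M columns (samples)
    for _ in range(F):
        mu = rng.gauss(mu_mean, mu_sd)
        sigma = rng.gammavariate(sigma_alpha, 1.0 / sigma_beta)
        y = [rng.gauss(mu, sigma) for _ in range(K)]
        z.append([rng.gauss(y[k - 1], noise) for k in exemplar_idx])

    return {
        'F': F, 'K': K, 'M': M,
        'z': z,
        'exemplar_idx': exemplar_idx,
        'mu_mean': mu_mean, 'mu_sd': mu_sd,
        'sigma_alpha': sigma_alpha, 'sigma_beta': sigma_beta,
        'noise_alpha': noise_alpha, 'noise_beta': noise_beta,
    }


# Same hyperparameters as the example dictionary, but more data.
sim = simulate_data(F=2, K=3, M=10, mu_mean=0, mu_sd=0.5,
                    sigma_alpha=2, sigma_beta=2,
                    noise_alpha=7.5, noise_beta=1)
```

The returned dictionary contains only keys declared in the model's data block, with z as a nested list of F rows and M columns.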

I’m using MCMC in CmdStanPy to draw samples from the posteriors of mu and sigma. I’m getting many divergent transitions (dozens per chain), but I’m not sure what needs to be adjusted. Is there an issue with how the Stan model is defined or with the parameter settings, or do I need to try a different MCMC implementation? Thank you!
