MAP and MCMC produce different scales

Here’s my puzzle: when I fit an IRT model with MCMC, I get a scale that ranges from about -1 to 1.5. When I fit the exact same model, with the same data and priors, with MAP, I get a scale that ranges from about -0.1 to 0.8. I can’t figure out why that is, or whether it’s something I should be worried about.

The details:
I’m fitting what is essentially a two-parameter IRT model. I have count data, so I’m using a binomial distribution and putting the IRT structure in its probability parameter. There are 72 items in the model; the variable I’m interested in is theta.

data {
  int<lower=1> J; // number of people
  int<lower=1> K; // number of items
  int<lower=1> N; // number of observations
  int<lower=1,upper=J> jj[N]; // user for observation n
  int<lower=1,upper=K> kk[N]; // item for observation n
  int X[N]; // total attempts for observation n
  int<lower=0> y[N]; // count of successful attempts for observation n
  
}

parameters {
  vector[K] delta; // item intercept
  vector[K] alpha; // discrimination/slope parameter
  vector[J] theta; // ideology/ability
}

model {
  delta ~ normal(0, 1); // item intercept,
  alpha ~ normal(0, 2); // discrimination/slope parameter
  theta ~ normal(0, 1); // ideology/ability
  
  y ~ binomial_logit(X, delta[kk] + alpha[kk] .* theta[jj]);
}

I fit the model with some weakly informative priors centered at -1 or 1 to resolve the reflection invariance. When I fit the model with MCMC it behaves well: chains converge, no divergences, ESS > 400, Rhat ~ 1, etc. As far as I can tell there is no sign that either model is misbehaving. The theta estimates from the two fits are essentially perfectly correlated.
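For concreteness, here is a minimal sketch of how the two fits could be compared side by side with CmdStanPy; the file names (irt_binomial.stan, irt_data.json) are placeholders, not something from the original post.

# Minimal sketch, assuming CmdStanPy and a JSON data file with J, K, N, jj, kk, X, y.
import numpy as np
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="irt_binomial.stan")  # the Stan program above
data = "irt_data.json"                               # placeholder path to the data

# Full Bayes: posterior draws via NUTS, summarized by posterior means.
mcmc_fit = model.sample(data=data, chains=4, seed=1234)
theta_mcmc = mcmc_fit.stan_variable("theta").mean(axis=0)  # length-J posterior means

# MAP-style point estimate: optimization with the priors acting as penalties.
map_fit = model.optimize(data=data, seed=1234)
theta_map = map_fit.stan_variable("theta")                 # length-J point estimates

# High correlation but different ranges is exactly the pattern described above.
print("correlation:", np.corrcoef(theta_mcmc, theta_map)[0, 1])
print("MCMC range:", theta_mcmc.min(), theta_mcmc.max())
print("MAP range: ", theta_map.min(), theta_map.max())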

So why are the theta scales in different locations, is it something I should worry about, and how do I solve it? Thanks for any insights.

The posterior mean from MCMC and the posterior mode from a MAP estimate are different estimators. They are only the same when the distribution is symmetric (like a normal). With a skewed distribution, they’re different. Consider a simple $\textrm{Beta}(\alpha, \beta)$ distribution. Its mean is $\frac{\alpha}{\alpha + \beta}$, whereas its mode, which is where an MLE (or MAP estimate) would land, is $\frac{\alpha - 1}{\alpha + \beta - 2}$.
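A quick numerical illustration of that point, using a skewed Beta(2, 5); this is just the general mean-versus-mode issue, not the IRT model itself.

# Mean vs. mode of Beta(2, 5): the two point summaries disagree for a skewed density.
import numpy as np
from scipy import stats

a, b = 2.0, 5.0
dist = stats.beta(a, b)

mean = a / (a + b)            # 2/7 ~= 0.286, what averaging MCMC draws targets
mode = (a - 1) / (a + b - 2)  # 1/5 = 0.2, where an MLE/MAP optimizer lands

draws = dist.rvs(size=100_000, random_state=np.random.default_rng(0))
print("analytic mean:", mean, " mean of draws:", draws.mean())
print("analytic mode:", mode)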

You usually also need to initialize the chains to try to finesse things that way.

Is this really an ideal-point model where you don’t know the orientations? The simple IRT models don’t have this problem. Gelman and Hill’s regression book discusses this, but I don’t know whether it made it into Regression and Other Stories, the revision of the first half of the book with Vehtari.

No, nothing to worry about, and no solution necessary. You’ll find, if you run calibration tests (simulating data from the model and refitting), that the MCMC solution will be calibrated and the MAP estimate will not be. Generally, it’s better to use the MCMC estimates whenever they’re tractable to compute.
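For completeness, a rough sketch of the kind of calibration check being described: simulate data from the model’s own priors, refit, and see whether the MCMC intervals cover the true values while the MAP point estimates come back shrunk. Everything here (sizes, seed, file name) is an assumption, and it presumes the reflection invariance has been resolved as described above.

# Rough single-replication calibration sketch (assumed sizes, seed, and file name).
import numpy as np
from cmdstanpy import CmdStanModel

rng = np.random.default_rng(42)
J, K = 100, 20                                   # people, items (smaller than the real data)
jj, kk = np.meshgrid(np.arange(J), np.arange(K), indexing="ij")
jj, kk = jj.ravel(), kk.ravel()
N = J * K

# Draw "true" parameters from the same priors used in the model block.
theta_true = rng.normal(0, 1, J)
delta_true = rng.normal(0, 1, K)
alpha_true = rng.normal(0, 2, K)

X = rng.integers(5, 50, N)                       # attempts per observation
eta = delta_true[kk] + alpha_true[kk] * theta_true[jj]
y = rng.binomial(X, 1 / (1 + np.exp(-eta)))

data = dict(J=J, K=K, N=N, jj=jj + 1, kk=kk + 1, X=X, y=y)  # 1-based indices for Stan
model = CmdStanModel(stan_file="irt_binomial.stan")

# MCMC: 90% posterior intervals should cover the true thetas about 90% of the time.
mcmc = model.sample(data=data, chains=4, seed=42)
theta_draws = mcmc.stan_variable("theta")        # shape (draws, J)
lo, hi = np.percentile(theta_draws, [5, 95], axis=0)
print("90% interval coverage (MCMC):", np.mean((theta_true >= lo) & (theta_true <= hi)))

# MAP: point estimates tend to come back compressed toward zero, which shows up
# as a regression slope on the true thetas well below 1 (the narrower scale above).
map_fit = model.optimize(data=data, seed=42)
theta_map = map_fit.stan_variable("theta")
print("slope of MAP estimates on true theta:", np.polyfit(theta_true, theta_map, 1)[0])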

Amazing answer, thank you so much!