Centering priors on MLEs in categorical logistic regression

I have the following Stan model:

data {
  int<lower=1> N; //number of samples = 281
  int<lower=1> numSources; // number of sources = 3 (grocery store, restaurant, sushi bar)
  int<lower=1> numStates; // number of states = 2 (cooked, raw)
  int<lower=1> numAppearances; // number of appearances = 2 (modified, plain)
  int<lower=1> numForms; // number of forms = 4 (chunk, chopped, fillet, whole)
  int<lower=1> numColours; // number of colours = 2 (light, red)

  int<lower=0, upper=1> Mislabelled[N]; // vector of mislabelled samples
  int<lower=1, upper=numSources> Source[N]; // vector of sources
  int<lower=1, upper=numStates> State[N]; // vector of states
  int<lower=1, upper=numAppearances> Appearance[N]; // vector of appearances
  int<lower=1, upper=numForms> Form[N]; // vector of forms
  int<lower=1, upper=numColours> Colour[N]; // vector of colours
}

parameters {
  real beta_0;

  vector[numSources] beta_source;
  vector[numStates] beta_state;
  vector[numAppearances] beta_appearance;
  vector[numForms] beta_form;
  vector[numColours] beta_colour;
}

model {
  vector[N] eta;

  //priors

 // need to fill in with means and standard deviations

  beta_0 ~ normal();

  beta_source ~ normal();
  beta_state ~ normal();
  beta_appearance ~ normal();
  beta_form ~ normal();
  beta_colour ~ normal();

  for (i in 1:N) {
    // pick out the coefficient for each sample's category level
    eta[i] = beta_0 + beta_source[Source[i]] +
                 beta_state[State[i]] +
                 beta_appearance[Appearance[i]] +
                 beta_form[Form[i]] +
                 beta_colour[Colour[i]];
  }

  // likelihood
  Mislabelled ~ bernoulli_logit(eta);
}

generated quantities {
  real odds_intercept;
  vector[numSources] odds_source;
  vector[numStates] odds_state;
  vector[numAppearances] odds_appearance;
  vector[numForms] odds_form;
  vector[numColours] odds_colour;

  real prob_intercept;
  vector[numSources] prob_source;
  vector[numStates] prob_state;
  vector[numAppearances] prob_appearance;
  vector[numForms] prob_form;
  vector[numColours] prob_colour;

  odds_intercept = exp(beta_0);
  odds_source = exp(beta_source);
  odds_state = exp(beta_state);
  odds_appearance = exp(beta_appearance);
  odds_form = exp(beta_form);
  odds_colour = exp(beta_colour);

  prob_intercept = odds_intercept / (1 + odds_intercept) * 100;
  prob_source = odds_source ./ (1 + odds_source) * 100;
  prob_state = odds_state ./ (1 + odds_state) * 100;
  prob_appearance = odds_appearance ./ (1 + odds_appearance) * 100;
  prob_form = odds_form ./ (1 + odds_form) * 100;
  prob_colour = odds_colour ./ (1 + odds_colour) * 100;

}

I want to be able to centre my normal priors (`normal()`) on their MLEs inferred from `R::glm()`. However, I’m unsure how to do this with the current model encoding.
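For concreteness, here is roughly what I have in mind (an untested sketch; the prior_mean_* / prior_sd_* names are just placeholders I’ve made up): the prior locations and scales would be passed in through the data block and then used in the sampling statements, e.g.

data {
  // ... existing declarations ...
  real prior_mean_0;                             // e.g. the glm() intercept estimate
  real<lower=0> prior_sd_0;
  vector[numSources] prior_mean_source;          // one prior mean per source level
  vector<lower=0>[numSources] prior_sd_source;   // one prior sd per source level
  // ... and similarly for state, appearance, form and colour
}

model {
  beta_0 ~ normal(prior_mean_0, prior_sd_0);
  beta_source ~ normal(prior_mean_source, prior_sd_source);
  // ... and similarly for the other coefficient vectors
}

That way the Stan program itself stays generic, and whatever prior means and standard deviations I settle on would be supplied from R along with the rest of the data.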

Also, since each of my predictors is categorical, I should choose a reference level. Currently the model outputs all L levels rather than L-1 of them. R::glm() selects the first level as the reference automatically (provided the levels are alphabetized, as they are here). Probably the easiest way to do this would be something like:

if (i > 1) {
     beta[i] - beta[1]
     ...
}
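Or maybe something like the following (again untested, and the beta_source_raw name is just one I’ve made up): declare only L-1 free coefficients per predictor and pin the first (reference) level’s coefficient to zero, shown here for Source only:

parameters {
  vector[numSources - 1] beta_source_raw;   // coefficients for levels 2..L
  // ... likewise for the other predictors
}

transformed parameters {
  // prepend a structural zero so that level 1 acts as the reference
  vector[numSources] beta_source = append_row(0, beta_source_raw);
}

model {
  // eta would then use beta_source[Source[i]] exactly as before
}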

Any thoughts on how to proceed with these?

That seems to be double dipping. Your usage of data (MLE) information in the prior will make the effective sample size of the posterior larger than the actual sample size. Don’t you want to use MLEs as initial values in sampling instead?

@harrelfe I’ve seen several papers by colleagues that centre their priors on MLEs… They are probably not Bayesians. Regardless, the prior should be selected independently of the data (or any statistical summary thereof). Given this, I’m curious about your suggestion to use the MLEs as starting conditions instead. Wouldn’t this have the same consequence that you bring up in your response? It would speed up convergence to the posterior distribution, though.

First of all, I have seldom needed to override the default random starting values in Stan. But to your question: no, it’s not the same, unless the sampling goes haywire so that the results after thousands of posterior draws were indeed dependent on the choice of initial values.

I also run the Stan optimizer and then feed the resulting MLE into the sampler as starting values.