Visualizing shrinkage of an infinite MLE in a simple logistic example

I have a simple mock dataset and a logistic model in which two predictor categories exhibit quasi-complete separation. What I want to do is visualize how the resulting infinite MLE is “tamed” by the prior. Below is a minimal example. Even though I use brms, the choice of interface should be irrelevant here.

library(brms)

n <- c(96, 48, 24, 12, 6, 3)                  # group sizes
x <- factor(LETTERS[1:length(n)])             # groups A-F
p <- 0.75
y <- n * c(0.5, p, 1 - p, p, 0, 1)            # successes per group; E (0/6) and F (3/3) are separated
d <- data.frame(y = unlist(sapply(1:length(y), function(i) c(rep(1, y[i]), rep(0, n[i] - y[i])))),
                x = rep(x, times = n))
Bayfit <- brm(y ~ x, family = bernoulli, data = d,
              prior = prior(normal(0, 2.5), class = Intercept) +
                      prior(normal(0, 2.5), class = b),
              seed = 3, warmup = 1000, iter = 3500)
Post <- posterior_samples(Bayfit)

With the model thus fit and the posterior samples stored, now we visualize the infinite MLE, the prior and the posterior:

# likelihood of 0 successes in 6 trials for group E, with plogis(beta_4) as the success probability
curve(dbinom(0, 6, plogis(x)), -10, 10, lwd = 2, col = "blue",
      xlab = expression(beta[4]), ylab = "P")
curve(dnorm(x, 0, 2.5), add = TRUE, col = "red", lwd = 2)   # prior
lines(density(Post$b_xE), lwd = 2)                          # posterior (kernel density of the draws)
legend(4.5, 0.9, legend = c("Prior", "Likelihood", "Posterior"),
       col = c("red", "blue", "black"), lwd = 2, cex = 0.8)

[Figure “shrinkage”: prior (red), likelihood (blue), and posterior (black) for β4]

While I’m confident that this illustration is “directionally correct”, I have an uneasy feeling about the blue (likelihood) curve. AFAIK the likelihood, unlike the prior and the posterior, is NOT a PDF and need not integrate to 1. This seems clear from the graph, where the likelihood has by far the largest area under the curve. How big a problem is this? Is there some simple constant that could or should be applied to the likelihood to make it commensurate with the prior and the posterior? I think the present graph works alright for the purpose of illustrating the action of the prior on the posterior, but it would still suck if it contained some glaring mathematical idiocy. Does it?
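To make the area point concrete, here is a quick numerical check of the areas under the curves as plotted (just a sketch, treating the likelihood as a function of β4 alone, exactly as in the code above):

# areas under the plotted curves on (-10, 10), by numerical integration
integrate(function(b) dbinom(0, 6, plogis(b)), -10, 10)   # likelihood: well above 1
integrate(function(b) dnorm(b, 0, 2.5), -10, 10)          # prior: essentially 1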

Stan works with the log-likelihood. I’m not too familiar with brms, but it could be worth checking whether b_xE is transformed or not.
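For instance, one quick check might be to compare against a plain glm() fit (a rough sketch, using the d and Bayfit objects from above):

coef(glm(y ~ x, family = binomial, data = d))   # MLEs on the logit scale; xE and xF diverge because of the separation
fixef(Bayfit)                                   # brms population-level estimates, for comparison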

AFAIK if I use the log likelihood, then I also need to use the log prior and logs of the densities of the posterior samples. Here’s the only way I can think of doing it:

# log posterior density at each draw, evaluated with density_ratio()
logdens.E <- sapply(Post$b_xE, function(x) log(density_ratio(Post$b_xE, point = x)))
# log-likelihood of 0/6 successes in group E
curve(dbinom(sum(d$y[d$x == "E"]), sum(d$x == "E"), plogis(x), log = TRUE),
      from = min(Post$b_xE), to = max(Post$b_xE), lwd = 2, col = "blue",
      xlab = expression(beta[4]), ylab = "P")
curve(dnorm(x, 0, 2.5, log = TRUE), add = TRUE, col = "red", lwd = 2)   # log prior
points(x = Post$b_xE, y = logdens.E)                                    # log posterior density at the draws
legend(-11.5, -1, legend = c("Prior", "Likelihood", "Posterior"),
       col = c("red", "blue", "black"), lwd = 2, cex = 0.8)

[Figure “shrinkage”: log prior (red), log-likelihood (blue), and log posterior density at the draws (black) for β4]

Is this more mathematically correct than the first version? Indeed the log of the likelihood no longer looks as disproportionately large as the likelihood itself did. Otherwise, however, this plot is much uglier than the previous one, and I don’t know the proper way to get a smooth curve out of the logged densities of the posterior simulations.
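The closest I can get is to log the density() estimate directly instead of evaluating density_ratio() at every draw (a sketch; I’m not sure how trustworthy this is out in the tails):

dens.E <- density(Post$b_xE)                         # kernel density estimate of the posterior draws
keep <- dens.E$y > 0                                 # avoid log(0) far out in the tails
lines(dens.E$x[keep], log(dens.E$y[keep]), lwd = 2)  # smooth log-density curve on the existing plot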

Oh, I mostly thought you might have substituted the log-likelihood for the likelihood in the original version by accident. I think the scale was fine. I do know that it is somewhat common to use a scaled version of the likelihood in prior + data → posterior plots to make things look nicer.

The original version used the likelihood, not its log. The new one has logged versions of all three distributions (I’ve now updated the legend to reflect this). I’m glad if there are no obvious mathematical stupidities present in either plot though.

Yeah, I wish I knew the scaling factor.
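The only candidate I can come up with myself is to normalize the likelihood numerically so that it integrates to 1 over the plotted range, something like this (a rough sketch; I have no idea whether this is the “right” constant):

lik <- function(b) dbinom(0, 6, plogis(b))     # likelihood of 0/6 successes as a function of beta_4
Z <- integrate(lik, -10, 10)$value             # numerical normalizing constant on (-10, 10)
curve(lik(x)/Z, -10, 10, lwd = 2, col = "blue", ylim = c(0, 0.4),   # ylim leaves room for all three curves
      xlab = expression(beta[4]), ylab = "Density")
curve(dnorm(x, 0, 2.5), add = TRUE, col = "red", lwd = 2)
lines(density(Post$b_xE), lwd = 2)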