Is there an easy non-modelling way for visualizing different prior distributions?

di4oi4 · August 1, 2021, 1:22pm

Normally distributed priors are easy to be simulated and visualized without doing any modelling (which takes some extra time). For example, with ggplot:

dist = rnorm(1000, 15, 8)
ggplot(NULL, aes(dist)) + geom_density()

Is there an easy and similar option for checking cauchy, laplace, student t and other prior distributions?

joels · August 1, 2021, 5:22pm

You can plot simulations from any available probability distribution, or alternatively, you can directly plot the density function for a given distribution. For example:

library(tidyverse)
library(VGAM)  # For laplace distribution
library(patchwork) # For plot layouts
theme_set(theme_bw())

# Sample from distribution
p1=ggplot(tibble(x=rt(1000, df=4)), aes(x)) +
  geom_density()

p2=ggplot(tibble(x=rlaplace(1000, location=0, scale=4)), aes(x)) +
  geom_density()

p3=ggplot(tibble(x=rlnorm(1000, meanlog=0, sdlog=0.5)), aes(x)) +
  geom_density()

# Plot multiple sets of random draws from a distribution
p4=ggplot() +
  replicate(10, geom_density(data=tibble(x=rlaplace(100, location=0, scale=4)), aes(x)))

list(p1,p2,p3,p4) %>% wrap_plots(nrow=2)

# Plot density function
p5 = ggplot() +
  stat_function(fun=dcauchy, args=list(location=1, scale=2), 
                xlim=c(-10,10))

p6=ggplot() +
  stat_function(fun=dnorm, args=list(mean=2, sd=3), 
                xlim=c(-10,15))

p7=ggplot() +
  stat_function(fun=dt, args=list(df=4), 
                xlim=c(-20,20), n=1000)

p8=ggplot() +
  stat_function(fun=dlaplace, args=list(location=0, scale=4), 
                xlim=c(-20,20), n=1000)

list(p5,p6,p7,p8) %>% wrap_plots(nrow=2)

^{Created on 2021-08-01 by the reprex package (v2.0.0)}

di4oi4 · August 14, 2021, 8:12am

This is very useful! Does the same approach allow plotting different lkj_corr_cholesky priors?

joels · August 14, 2021, 10:00pm

For univariate density plots, you can plot the density for an LKJ prior for 2x2 correlation matrices, or marginal plots of LKJ priors for larger correlation matrices. Below are examples.

library(tidyverse)
theme_set(theme_bw() +
            theme(plot.title=element_text(size=rel(1), hjust=0.5)))
library(ggdist)

# Plot LKJ Density
lkj_dfunc = function(K, etas=c(0.1, 0.5, 1, 1.5, 2, 5)) {
  
  etas = set_names(etas)
  cor = seq(-1,1,length=200)
  
  etas %>% 
  map_df(
    ~ dlkjcorr_marginal(cor, K=K, eta=.x)
  ) %>% 
  bind_cols(x=cor) %>% 
  pivot_longer(cols=-x, names_to='eta', values_to='density') %>% 
  ggplot(aes(x=x, y=density)) + 
    geom_area(colour=hcl(240,100,30), fill=hcl(240,100,65)) +
    facet_wrap(vars(eta)) +
    scale_y_continuous(expand=expansion(c(0,0.05))) +
    coord_cartesian(ylim=c(0,2)) +
    labs(title=paste0(ifelse(K==2, "", "Marginal "),
                      "LKJ correlation distribution for K=", K,
                      " and various values of eta"),
         x="Correlation") 
}

lkj_dfunc(2)

lkj_dfunc(5)

# Plot density of random draws from LKJ distribution
lkj_rfunc = function(K, N=1000, etas=c(0.1, 0.5, 1, 1.5, 2, 5)) {
  
  etas = set_names(etas)
  
  etas %>% 
    map_df(
      ~ rlkjcorr_marginal(N, K=K, eta=.x)
    ) %>% 
    pivot_longer(cols=everything(), names_to='eta', values_to='corr') %>% 
    ggplot(aes(x=corr)) + 
      geom_density(colour=hcl(240,100,30), fill=hcl(240,100,65)) +
      facet_wrap(vars(eta)) +
      scale_y_continuous(expand=expansion(c(0,0.05))) +
      labs(title=paste0(ifelse(K==2, "D", "Marginal d"),
                        "ensity of random draws from LKJ distribution for K=", K,
                        " and various values of eta"),
           x="Correlation")
}

lkj_rfunc(2)

lkj_rfunc(5)

^{Created on 2021-08-14 by the reprex package (v2.0.1)}

andymilne · August 15, 2021, 4:06am

A related question: Is there an easy way to plot what priors look like after an inverse link function has been applied – e.g., what does a normal(0, 1) prior on the logit scale look like on the probability scale?

joels · August 16, 2021, 6:50pm

Below is an example for normally distributed log(odds) priors centered at zero and with a range of standard deviations.

Note that as the standard deviation moves downward from one towards zero, most of the log(odds) density piles up near zero, (i.e., odds near 1:1), so most of the probability density is near 0.5.

On the other hand, when the standard deviation is greater than one, most of the probability quickly starts piling up near 0 and 1. For example, for SD=5, 69% of the density is in log(odds) less than -2 or greater than 2 (1 - diff(pnorm(c(-2,2), 0, 5))), meaning 69% of the probability density is less than 0.12 or greater than 0.88. For SD=10, 75% of the log(odds) density is less than -3 or greater than 3, meaning 75% of the probability density is less than 0.05 or greater than 0.95.

library(LaplacesDemon)  # For invlogit function
library(tidyverse)
theme_set(ggthemes::theme_tufte(base_size=13) +
            theme(axis.text.y=element_blank(),
                  axis.ticks.y=element_blank(),
                  panel.border=element_rect(fill=NA)))
library(patchwork) # For plot layout

set.seed(10)
d = c(0.1, 0.2, 0.5, 1, 2, 5, 10) %>% set_names() %>%  
  map_df(~ tibble(`log(odds)` = rnorm(1000, 0, .x), 
                  probability = invlogit(`log(odds)`)),
         .id="sd") %>% 
  mutate(sd = factor(sd, levels=unique(sd))) 

p1 = d %>% 
  ggplot() +
    geom_density(aes(x=`log(odds)`, y=..scaled..)) +
    facet_grid(rows=vars(sd), scales="free_y", switch="y") +
    scale_y_continuous(expand=expansion(c(0,0.05))) +
    labs(x="log(odds)", y="Standard Deviation") +
    theme(strip.text.y.left=element_text(angle=0))

p2 = d %>% 
  ggplot() +
    geom_density(aes(x=probability, y=..scaled..), adjust=0.5) +
    facet_grid(rows=vars(sd), scales="free_y") +
    scale_y_continuous(expand=expansion(c(0,0.05))) +
    scale_x_continuous(expand=expansion(c(0.001,0.001)), breaks=c(0,0.5,1)) +
    labs(x="probability", y=NULL) +
    theme(strip.text.y=element_text(angle=0))

p1 + p2 + 
  plot_layout(widths=c(5,1)) + 
  plot_annotation(title="Normally distributed log(odds) priors and transformation to probability scale", 
                  theme=theme(plot.title=element_text(hjust=0.5)))

^{Created on 2021-08-16 by the reprex package (v2.0.1)}

andymilne · August 16, 2021, 8:52pm

That’s beautiful! Thank you.

andymilne · August 16, 2021, 9:30pm

Part of the reason for asking was because I was interested in what prior would result in a flat probability distribution in a probit regression. Substituting this

d = c(0.1, 0.2, 0.5, 1, 2, 5, 10) %>% set_names() %>%  
    map_df(~ tibble(`log(odds)` = rnorm(1000000, 0, .x), 
                    probability = pnorm(`log(odds)`,0,1)),
           .id="sd") %>%

into your code gives this lovely plot:

So a standard normal prior it is! I expect this is well-known analytically, but I had been unable to find a reference for it. Note that for a logistic regression (logit link), the unit logistic distribution prior is flat on the probability scale, and this distribution can be approximated by student_t(7.763, 0, 1.566) (Northrup and Gerber, 2018) – also easy to demo with @joels code. I mention all this simply because it may be of general interest.

Topic		Replies	Views
Some Bayesplot/ggplot2 questions relating to plotting priors and posteriors in same panel General bayesplot , cmdstanr	5	1966	February 1, 2022
Newbie questions on how Stan weights the likelihood by the prior + implications for visualization Modeling brms	17	901	June 23, 2023
Visualize Bayesian model General	4	996	December 18, 2023
Correct to illustrate flat priors as a line from -Inf to Inf? Modeling brms	1	351	September 9, 2021
Scaling of ppc_dens_overlay for prior predictive distribution Modeling bayesplot	1	469	October 11, 2022

Is there an easy non-modelling way for visualizing different prior distributions?

Related topics