Is there an easy non-modelling way for visualizing different prior distributions?

Normally distributed priors are easy to be simulated and visualized without doing any modelling (which takes some extra time). For example, with ggplot:

dist = rnorm(1000, 15, 8)
ggplot(NULL, aes(dist)) + geom_density()

image

Is there an easy and similar option for checking cauchy, laplace, student t and other prior distributions?

2 Likes

You can plot simulations from any available probability distribution, or alternatively, you can directly plot the density function for a given distribution. For example:

library(tidyverse)
library(VGAM)  # For laplace distribution
library(patchwork) # For plot layouts
theme_set(theme_bw())

# Sample from distribution
p1=ggplot(tibble(x=rt(1000, df=4)), aes(x)) +
  geom_density()

p2=ggplot(tibble(x=rlaplace(1000, location=0, scale=4)), aes(x)) +
  geom_density()

p3=ggplot(tibble(x=rlnorm(1000, meanlog=0, sdlog=0.5)), aes(x)) +
  geom_density()

# Plot multiple sets of random draws from a distribution
p4=ggplot() +
  replicate(10, geom_density(data=tibble(x=rlaplace(100, location=0, scale=4)), aes(x)))

list(p1,p2,p3,p4) %>% wrap_plots(nrow=2)

# Plot density function
p5 = ggplot() +
  stat_function(fun=dcauchy, args=list(location=1, scale=2), 
                xlim=c(-10,10))

p6=ggplot() +
  stat_function(fun=dnorm, args=list(mean=2, sd=3), 
                xlim=c(-10,15))

p7=ggplot() +
  stat_function(fun=dt, args=list(df=4), 
                xlim=c(-20,20), n=1000)

p8=ggplot() +
  stat_function(fun=dlaplace, args=list(location=0, scale=4), 
                xlim=c(-20,20), n=1000)

list(p5,p6,p7,p8) %>% wrap_plots(nrow=2)

Created on 2021-08-01 by the reprex package (v2.0.0)

7 Likes

This is very useful! Does the same approach allow plotting different lkj_corr_cholesky priors?

For univariate density plots, you can plot the density for an LKJ prior for 2x2 correlation matrices, or marginal plots of LKJ priors for larger correlation matrices. Below are examples.

library(tidyverse)
theme_set(theme_bw() +
            theme(plot.title=element_text(size=rel(1), hjust=0.5)))
library(ggdist)

# Plot LKJ Density
lkj_dfunc = function(K, etas=c(0.1, 0.5, 1, 1.5, 2, 5)) {
  
  etas = set_names(etas)
  cor = seq(-1,1,length=200)
  
  etas %>% 
  map_df(
    ~ dlkjcorr_marginal(cor, K=K, eta=.x)
  ) %>% 
  bind_cols(x=cor) %>% 
  pivot_longer(cols=-x, names_to='eta', values_to='density') %>% 
  ggplot(aes(x=x, y=density)) + 
    geom_area(colour=hcl(240,100,30), fill=hcl(240,100,65)) +
    facet_wrap(vars(eta)) +
    scale_y_continuous(expand=expansion(c(0,0.05))) +
    coord_cartesian(ylim=c(0,2)) +
    labs(title=paste0(ifelse(K==2, "", "Marginal "),
                      "LKJ correlation distribution for K=", K,
                      " and various values of eta"),
         x="Correlation") 
}

lkj_dfunc(2)

lkj_dfunc(5)

# Plot density of random draws from LKJ distribution
lkj_rfunc = function(K, N=1000, etas=c(0.1, 0.5, 1, 1.5, 2, 5)) {
  
  etas = set_names(etas)
  
  etas %>% 
    map_df(
      ~ rlkjcorr_marginal(N, K=K, eta=.x)
    ) %>% 
    pivot_longer(cols=everything(), names_to='eta', values_to='corr') %>% 
    ggplot(aes(x=corr)) + 
      geom_density(colour=hcl(240,100,30), fill=hcl(240,100,65)) +
      facet_wrap(vars(eta)) +
      scale_y_continuous(expand=expansion(c(0,0.05))) +
      labs(title=paste0(ifelse(K==2, "D", "Marginal d"),
                        "ensity of random draws from LKJ distribution for K=", K,
                        " and various values of eta"),
           x="Correlation")
}

lkj_rfunc(2)

lkj_rfunc(5)

Created on 2021-08-14 by the reprex package (v2.0.1)

2 Likes

A related question: Is there an easy way to plot what priors look like after an inverse link function has been applied – e.g., what does a normal(0, 1) prior on the logit scale look like on the probability scale?

Below is an example for normally distributed log(odds) priors centered at zero and with a range of standard deviations.

Note that as the standard deviation moves downward from one towards zero, most of the log(odds) density piles up near zero, (i.e., odds near 1:1), so most of the probability density is near 0.5.

On the other hand, when the standard deviation is greater than one, most of the probability quickly starts piling up near 0 and 1. For example, for SD=5, 69% of the density is in log(odds) less than -2 or greater than 2 (1 - diff(pnorm(c(-2,2), 0, 5))), meaning 69% of the probability density is less than 0.12 or greater than 0.88. For SD=10, 75% of the log(odds) density is less than -3 or greater than 3, meaning 75% of the probability density is less than 0.05 or greater than 0.95.

library(LaplacesDemon)  # For invlogit function
library(tidyverse)
theme_set(ggthemes::theme_tufte(base_size=13) +
            theme(axis.text.y=element_blank(),
                  axis.ticks.y=element_blank(),
                  panel.border=element_rect(fill=NA)))
library(patchwork) # For plot layout

set.seed(10)
d = c(0.1, 0.2, 0.5, 1, 2, 5, 10) %>% set_names() %>%  
  map_df(~ tibble(`log(odds)` = rnorm(1000, 0, .x), 
                  probability = invlogit(`log(odds)`)),
         .id="sd") %>% 
  mutate(sd = factor(sd, levels=unique(sd))) 

p1 = d %>% 
  ggplot() +
    geom_density(aes(x=`log(odds)`, y=..scaled..)) +
    facet_grid(rows=vars(sd), scales="free_y", switch="y") +
    scale_y_continuous(expand=expansion(c(0,0.05))) +
    labs(x="log(odds)", y="Standard Deviation") +
    theme(strip.text.y.left=element_text(angle=0))

p2 = d %>% 
  ggplot() +
    geom_density(aes(x=probability, y=..scaled..), adjust=0.5) +
    facet_grid(rows=vars(sd), scales="free_y") +
    scale_y_continuous(expand=expansion(c(0,0.05))) +
    scale_x_continuous(expand=expansion(c(0.001,0.001)), breaks=c(0,0.5,1)) +
    labs(x="probability", y=NULL) +
    theme(strip.text.y=element_text(angle=0))

p1 + p2 + 
  plot_layout(widths=c(5,1)) + 
  plot_annotation(title="Normally distributed log(odds) priors and transformation to probability scale", 
                  theme=theme(plot.title=element_text(hjust=0.5)))

Created on 2021-08-16 by the reprex package (v2.0.1)

4 Likes

That’s beautiful! Thank you.

Part of the reason for asking was because I was interested in what prior would result in a flat probability distribution in a probit regression. Substituting this

d = c(0.1, 0.2, 0.5, 1, 2, 5, 10) %>% set_names() %>%  
    map_df(~ tibble(`log(odds)` = rnorm(1000000, 0, .x), 
                    probability = pnorm(`log(odds)`,0,1)),
           .id="sd") %>% 

into your code gives this lovely plot:

So a standard normal prior it is! I expect this is well-known analytically, but I had been unable to find a reference for it. Note that for a logistic regression (logit link), the unit logistic distribution prior is flat on the probability scale, and this distribution can be approximated by student_t(7.763, 0, 1.566) (Northrup and Gerber, 2018) – also easy to demo with @joels code. I mention all this simply because it may be of general interest.

4 Likes