Hurdle lognormal variance

Given the parameters \mu, \sigma, and hu, how does one compute the variance of the hurdle lognormal distribution? Based on this part of the brms GitHub repository, I know we can compute the mean of the hurdle lognormal as

\exp(\mu + \sigma^2 / 2) \cdot (1 - hu)

Here’s how that works in action.

library(tidyverse)

# define population parameters
mu <- 1
sigma <- 1
hu <- 0.25

# number of sample draws
n <- 1e6

# simulate
set.seed(1)

tibble(y = c(rep(0, times = n * hu),
             rlnorm(n = n * (1 - hu), meanlog = mu, sdlog = sigma))) %>% 
  # summarize
  summarise(sample_mean = mean(y),
            mean_by_formula = exp(mu + sigma^2 / 2) * (1 - hu))
# A tibble: 1 × 2
  sample_mean mean_by_formula
        <dbl>           <dbl>
1        3.36            3.36

But again, how does one compute the variance of the hurdle lognormal distribution?

Think of the hurdle lognormal as a mixture of a lognormal and a distribution whose pdf is a delta function at zero. For the derivation of the variance of a mixture of one-dimensional distributions, see the Wikipedia article on mixture distributions, under the Moments heading.

Sadly, that looks to be over my head.

I see why it looks that way, but I’ve seen you around enough to feel pretty confident that it’s not :)

Focus on the final line of that derivation, which is an expression for the variance of the mixture.
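
To make that concrete, here is a sketch of how the mixture-variance result specializes to the hurdle case. It assumes the standard identity from that Wikipedia section, where the component weights w_i, means \mu_i, and variances \sigma_i^2 are the article's notation rather than anything from brms. Component 1 is taken to be the point mass at zero and component 2 the lognormal.

```latex
\begin{align*}
\operatorname{Var}(Y) &= \sum_i w_i (\sigma_i^2 + \mu_i^2) - \Big( \sum_i w_i \mu_i \Big)^2 \\
w_1 &= hu, \quad \mu_1 = 0, \quad \sigma_1^2 = 0 \\
w_2 &= 1 - hu, \quad \mu_2 = \exp(\mu + \sigma^2 / 2), \quad
       \sigma_2^2 = (\exp(\sigma^2) - 1) \exp(2 \mu + \sigma^2) \\
\operatorname{Var}(Y) &= (1 - hu) \exp(2 \mu + 2 \sigma^2) - (1 - hu)^2 \exp(2 \mu + \sigma^2) \\
&= (\exp(\sigma^2) - (1 - hu)) \cdot (1 - hu) \cdot \exp(2 \mu + \sigma^2)
\end{align*}
```

The second-to-last line uses \sigma_2^2 + \mu_2^2 = \exp(2 \mu + 2 \sigma^2), the second raw moment of the lognormal. The point mass at zero drops out of both sums, so the first sum reduces to (1 - hu) times that moment, and the squared mean is the square of the mean formula from the original question.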


From Geebo Samuel on the bird site (link), we learn the variance for the hurdle lognormal is

(\exp(\sigma^2) - (1 - hu)) \cdot (1 - hu) \cdot \exp(2 \mu + \sigma^2)

Here’s what that looks like in code.

library(tidyverse)

# define population parameters
mu <- 1
sigma <- 1
hu <- 0.25

# number of sample draws
n <- 1e6

# simulate
set.seed(1)

tibble(y = c(rep(0, times = n * hu),
             rlnorm(n = n * (1 - hu), meanlog = mu, sdlog = sigma))) %>% 
  # summarize
  summarise(sample_var = var(y),
            var_by_formula = (exp(sigma^2) - (1 - hu)) * (1 - hu) * exp(2 * mu + sigma^2))
# A tibble: 1 × 2
  sample_var var_by_formula
       <dbl>          <dbl>
1       29.4           29.7
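
If it helps, the closed-form moments can be wrapped in a small base R function. This is only a sketch: `hurdle_lognormal_moments()` is a hypothetical helper name, not something brms provides, and it assumes hu is the probability of a zero, as in the simulation above. It computes the variance from the raw moments rather than from the factored formula, which serves as an independent check on the algebra.

```r
# Hypothetical helper (not part of brms): closed-form moments of a
# hurdle lognormal with parameters mu, sigma, and hurdle probability hu.
hurdle_lognormal_moments <- function(mu, sigma, hu) {
  # first and second raw moments of the lognormal component
  m1 <- exp(mu + sigma^2 / 2)
  m2 <- exp(2 * mu + 2 * sigma^2)

  # the point mass at zero contributes nothing to either raw moment
  mean_y <- (1 - hu) * m1
  var_y  <- (1 - hu) * m2 - mean_y^2

  c(mean = mean_y, variance = var_y, sd = sqrt(var_y))
}

# same population parameters as in the simulation
hurdle_lognormal_moments(mu = 1, sigma = 1, hu = 0.25)
```

The gap between the simulated and formula-based variance is not surprising. The lognormal has a heavy right tail, so the sample variance converges slowly even at n = 1e6 and will move around noticeably with the seed.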

For a nice scholarly reference, see Smith et al. (2014; https://doi.org/10.1002/sim.6263).
