Defining constraints in variable definiton vs programatically

Let’s say I have a random variable with a Beta distribution and i want to constrain it to be more than 0.5. There are two ways:

model {
  real<lower=0.5, upper=1> psi; 
  psi ~ beta(8.0, 2.0); 
} 

The other option is:

model {
  real psi; 
  psi_temp ~ beta(8.0, 2.0);
  psi = 1 - 0.5 * psi_temp; 

Why do these two ways yield different behavior in stan?

Because they are defining two different distributions. The first way is truncating the beta distribution, the second way is rescaling and shifting the beta distribution. You might find the following code and plots helpful:

library(ggplot2)

# histograms
samples <- rbeta(n = 100000, 8, 2)
truncated_samples <- samples[samples > 0.5]
transformed_samples <- 1 - 0.5 * samples

withr::with_par(list(mfrow = c(1, 2)), {
  hist(truncated_samples)
  hist(transformed_samples)
})

pointwise_truncated_function <- function(x, shape1, shape2) {
  if (x > 0.5) {
    dbeta(x, shape1, shape2) / (1 - pbeta(0.5, shape1, shape2))
  } else {
    0
  }
}

truncated_function <- Vectorize(pointwise_truncated_function)

transformed_function <- function(x, shape1, shape2) {
  dbeta(2 * (1 - x), shape1, shape2) * (1 / 0.5) 
}

# plot of density functions
ggplot() +
  xlim(c(0, 1)) +
  stat_function(fun = dbeta, args = list(shape1 = 8, shape2 = 2)) +
  stat_function(fun = truncated_function, args = list(shape1 = 8, shape2 = 2), col = 'red') +
  stat_function(fun = transformed_function, args = list(shape1 = 8, shape2 = 2), col = 'blue')

Created on 2021-05-27 by the reprex package (v1.0.0)
The red curve is your first way (truncation), and the blue curve is your second way (rescaling).

3 Likes

This helps a lot, thank you!