I think something strange is going on in your model. I don’t think this is quite as simple as ‘without data it should just be the prior for alpha’, but I can’t quite put my finger on why just yet.

If we boil this down to the simplest example, the Stan and naive Monte Carlo samples are identical (~~I think JAGS might be clever enough to sample from a gamma directly in this instance, so no need to double check~~ it isn't: JAGS is doing slice sampling here, as it does for everything).

```r
library(rstan)
library(ggplot2)
library(magrittr)  # for %>%

model_txt <- "parameters {real <lower = 0> alpha;} model {alpha ~ gamma(1.0, 1.0);}"
gamma_model <- stan_model(model_code = model_txt)

stan_samples <- sampling(gamma_model, iter = 5e4) %>%
  rstan::extract("alpha") %>%
  unlist()
mc_samples <- rgamma(1e5, 1.0, 1.0)

plot_df <- data.frame(
  x = c(stan_samples, mc_samples),
  grp = rep(c("stan", "mc"), c(length(stan_samples), length(mc_samples)))
)

quantiles <- seq(from = 0.001, to = 0.999, length.out = 500)
qq_df <- data.frame(
  stan_quantiles = quantile(stan_samples, quantiles),
  mc_quantiles = quantile(mc_samples, quantiles)
)

# Indistinguishable
ggplot(qq_df, aes(x = stan_quantiles, y = mc_quantiles)) +
  geom_point(col = "red") +
  geom_abline(slope = 1, intercept = 0)

ggplot(plot_df, aes(x = x, color = grp)) +
  geom_density()
```

Looking at the pairs plot of *all* the parameters from your latest `stan_vs_jags_reprex.R` suggests the prior on `p` is inappropriate (far too much mass at 0 and 1), which is breaking the adaptation in some way and making the samples of `alpha` non-representative. Indeed, if I turn up `adapt_delta` on your original example, I get warnings about divergences and hitting `max_treedepth`.
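For reference, this is roughly how I re-ran it (the particular values here are just what I tried, not a recommendation, and `your_model` stands in for the model object built from your reprex):

```r
# Hypothetical re-run of the model from stan_vs_jags_reprex.R with
# stricter adaptation settings; with the problematic prior on p this
# still produced divergence and max_treedepth warnings for me.
fit <- sampling(
  your_model,  # substitute the compiled model from stan_vs_jags_reprex.R
  iter = 5e4,
  control = list(adapt_delta = 0.99, max_treedepth = 15)
)
```

Raising `adapt_delta` forces a smaller stepsize during adaptation, so if the warnings persist the geometry problem is real rather than a tuning artefact.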

I suspect the broken adaptation leaves you with an inappropriate stepsize, so the sampler can't move far enough per iteration. The gamma prior has a lot of mass near the asymptote at zero, and on Stan's log-transformed (unconstrained) scale the chain can't travel far enough towards -∞ to sample that lower tail correctly.
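A rough sanity check on that last point, using nothing model-specific, just plain Monte Carlo: Stan samples `alpha` on the unconstrained scale as `log(alpha)`, and even for a gamma(1, 1) prior the lower quantiles sit a long way out in that space.

```r
# Draw from the gamma(1, 1) prior and look at it on Stan's
# unconstrained (log) scale: the lower tail stretches far towards -Inf,
# which a too-small stepsize will struggle to reach.
set.seed(1)
y <- log(rgamma(1e6, shape = 1.0, rate = 1.0))
print(quantile(y, c(0.001, 0.01, 0.1, 0.5)))
# The 0.1% quantile is around -6.9 while the median is around -0.37,
# so the sampler has to cover roughly 6.5 units of unconstrained space
# just to visit the lower tail of the prior.
```

A prior with even more mass at zero (shape < 1) makes this worse, since the unconstrained tail becomes heavier still.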