Hi all.
I was playing around with a binomial model with a beta prior and am interested in calculating the evidence of an observation by marginalising out the probability parameter. I was trying this out on a more complicated model but reduced it to the very simple case below:
data {
  int<lower=0> N;
  int<lower=0> y;
  real<lower=0> alpha;
  real<lower=0> beta;
}
parameters {
  real<lower=0, upper=1> p;
}
model {
  p ~ beta(alpha, beta);
}
generated quantities {
  real<upper=0> log_lik;
  {
    log_lik = binomial_lpmf(y | N, p);
  }
}
which can be run using
library(cmdstanr)

likelihood_model <- cmdstan_model(stan_file = 'stan_model.stan')

data <- list(
  N = 100,
  y = 40,
  alpha = 10,
  beta = 10
)

model_fit <- likelihood_model$sample(
  data = data,
  chains = 4,
  parallel_chains = 4,
  iter_warmup = 500,
  iter_sampling = 1000
)

model_fit
My question is about the final block:

generated quantities {
  real<upper=0> log_lik;
  {
    log_lik = binomial_lpmf(y | N, p);
  }
}
What exactly am I calculating here if I define a prior over the parameter p? My initial interpretation would be that it would marginalise over my p prior and estimate the evidence, i.e. it is calculating:
P(Y \mid N, \alpha, \beta) = \int_0^1 P(Y \mid N, p) \, P(p \mid \alpha, \beta) \, dp,

where P(Y \mid N, p) = \text{Bin}(Y \mid N, p) and P(p \mid \alpha, \beta) = \text{Beta}(p \mid \alpha, \beta).
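For reference, with a beta prior this integral has a closed form (the beta-binomial pmf), so the single number I had in mind can be computed directly in base R. A minimal sketch, using the same constants as in the data list above:

# Closed-form log evidence: log BetaBinomial(y | N, alpha, beta)
#   = log C(N, y) + log B(y + alpha, N - y + beta) - log B(alpha, beta)
N <- 100; y <- 40; alpha <- 10; beta <- 10
log_evidence <- lchoose(N, y) + lbeta(y + alpha, N - y + beta) - lbeta(alpha, beta)
log_evidence  # a single fixed number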
Given that Y, N, \alpha and \beta are all constants, I would have expected log_lik to come back as a fixed value, but looking at the model output it is clearly not a single point:
> model_fit
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
lp__ -14.40 -14.11 0.76 0.33 -15.90 -13.87 1.00 1894 2009
p 0.49 0.49 0.11 0.12 0.31 0.68 1.00 1628 1676
log_lik -6.97 -4.74 5.90 3.06 -18.68 -2.53 1.00 1704 1676
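To look beyond the summary at the per-draw values themselves, they can be pulled out with cmdstanr's $draws() method and the posterior package; a minimal sketch:

library(posterior)

# Extract the per-draw log_lik values (one value per posterior draw of p)
log_lik_draws <- as_draws_df(model_fit$draws("log_lik"))$log_lik

# Inspect the spread of these values rather than just the summary table
hist(log_lik_draws, breaks = 50)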
Can anyone enlighten me as to what distribution log_lik is the log of?