**EDIT:** Turns out I really was confused about marginalization. I think I finally figured it out. Edited to only show what I believe is the correct solution. No more answers needed, unless you believe the current solution is wrong.

So I’ve been trying to understand Bayes factors and BMA (Bayesian model averaging) better. Here’s a model that combines two candidate submodels, indexed by a single binary discrete parameter $z$ with a uniform prior:

$$
y_i \sim \text{Normal}(\mu, 1), \quad i \in \{1, \dots, K\}, \qquad
\mu = \begin{cases} 0 & \text{if } z = 1 \\ \alpha & \text{if } z = 0 \end{cases}, \qquad
\alpha \sim \text{Normal}(0, 2), \quad P(z = 1) = \frac{1}{2}
$$

The posterior probability $P(z = 1 \mid \mathbf{y})$ then corresponds to the BMA weight of the “null” submodel ($\mu = 0$) against an “intercept” submodel ($\mu = \alpha$).

I thought this would pose no problems, as Andrew and others always recommend that the best approach to model selection is simply building a supermodel that contains the smaller submodels.

We can’t have discrete parameters in Stan, but the standard workaround is to marginalize $z$ out: since $P(z = 1) = \frac{1}{2}$, the marginal likelihood is an equal-weight mixture of the two submodel likelihoods, which we can express directly in a Stan model:

```
data {
  int<lower=1> K;
  vector[K] y;
}
parameters {
  real alpha;
}
transformed parameters {
  // Log likelihood of the full data under each submodel
  real log_lik_null = normal_lpdf(y | 0, 1);
  real log_lik_intercept = normal_lpdf(y | alpha, 1);
}
model {
  // Marginalize z out: P(z = 1) = 0.5 is the weight of the null submodel
  target += log_mix(0.5, log_lik_null, log_lik_intercept);
  // Prior for alpha
  target += normal_lpdf(alpha | 0, 2);
}
```

Note that we deliberately mix the full likelihoods: either all observations come from submodel 1 or all come from submodel 2. Mixing per observation (i.e., calling `log_mix` separately for each data point) would define a different model, in which each observation could come from a different submodel.
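To make the distinction concrete, here is a small numerical check in Python (not part of the original workflow; numpy and scipy assumed, and alpha fixed at 1 purely for illustration). Mixing the full likelihoods gives a different value than mixing each observation separately:

```python
import numpy as np
from scipy.stats import norm

y = np.array([0.5, 0.7, -0.4, 0.1])
alpha = 1.0  # fixed arbitrarily for this illustration

log_lik_null = norm.logpdf(y, 0, 1).sum()
log_lik_intercept = norm.logpdf(y, alpha, 1).sum()

# Full-likelihood mixture: all observations share a single z
full_mix = np.logaddexp(np.log(0.5) + log_lik_null,
                        np.log(0.5) + log_lik_intercept)

# Per-observation mixture: each observation gets its own z_i
per_obs_mix = np.logaddexp(np.log(0.5) + norm.logpdf(y, 0, 1),
                           np.log(0.5) + norm.logpdf(y, alpha, 1)).sum()

print(full_mix, per_obs_mix)  # the two values differ
```

The two quantities only coincide in degenerate cases (e.g., when both submodels assign identical likelihoods to every observation), so the choice of where to marginalize matters.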

I can then use Bayes’ rule, and since $z$ and $\alpha$ are a priori independent, I can assume $P(z = 1 \mid \alpha) = P(z = 1) = \frac{1}{2}$. This gives:

$$
P(z = 1 \mid \alpha, \mathbf{y})
= \frac{P(\mathbf{y} \mid z = 1, \alpha)\, P(z = 1)}{P(\mathbf{y} \mid z = 1, \alpha)\, P(z = 1) + P(\mathbf{y} \mid z = 0, \alpha)\, P(z = 0)}
= \frac{\exp(\texttt{log\_lik\_null})}{\exp(\texttt{log\_lik\_null}) + \exp(\texttt{log\_lik\_intercept})}
$$

where the $\frac{1}{2}$ prior weights cancel.

So to get the desired $P(z = 1 \mid \mathbf{y})$ I need to integrate out $\alpha$, which I can approximate via $M$ posterior samples:

$$
P(z = 1 \mid \mathbf{y}) = \int P(z = 1 \mid \alpha, \mathbf{y})\, p(\alpha \mid \mathbf{y})\, \mathrm{d}\alpha
\approx \frac{1}{M} \sum_{m=1}^{M} P(z = 1 \mid \alpha^{(m)}, \mathbf{y})
$$

The result agrees with the analytical value and with other ways of computing the BMA weight and Bayes factor:

```
y <- c(0.5, 0.7, -0.4, 0.1)
full_model <- cmdstan_model("2021-bayes-factors.stan")
fit_full <- full_model$sample(data = list(K = length(y), y = y),
                              refresh = 0, iter_sampling = 10000)
full_samples <- posterior::as_draws_matrix(fit_full$draws())
rel_p_null <- exp(full_samples[, "log_lik_null"])
rel_p_intercept <- exp(full_samples[, "log_lik_intercept"])
# Average P(z = 1 | alpha, y) over the posterior draws of alpha
p_null <- mean(rel_p_null / (rel_p_null + rel_p_intercept))
p_null
# 0.7863209
# Posterior odds; equal to the Bayes factor since the prior odds are 1
p_null / (1 - p_null)
# 3.679916
```
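As a cross-check (not from the original workflow), the analytical value is available in closed form here: integrating $\alpha$ out of the intercept submodel gives $\mathbf{y} \sim \text{MVN}(\mathbf{0}, I + \tau^2 \mathbf{1}\mathbf{1}^\top)$ with $\tau = 2$, while the null submodel’s marginal likelihood is just the product of standard normal densities. A sketch in Python, assuming numpy and scipy:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

y = np.array([0.5, 0.7, -0.4, 0.1])
n = len(y)
tau2 = 2.0 ** 2  # prior variance of alpha (sd = 2)

# Null submodel: y_i ~ Normal(0, 1), no free parameters to integrate
log_m_null = norm.logpdf(y, 0, 1).sum()

# Intercept submodel: integrating alpha ~ Normal(0, 2) out gives
# y ~ MVN(0, I + tau^2 * 11')
cov = np.eye(n) + tau2 * np.ones((n, n))
log_m_intercept = multivariate_normal.logpdf(y, mean=np.zeros(n), cov=cov)

bf = np.exp(log_m_null - log_m_intercept)  # Bayes factor in favor of the null
p_null = bf / (1 + bf)                     # P(z = 1 | y), since prior odds are 1
print(bf, p_null)
```

This yields a Bayes factor of roughly 3.75 and $P(z = 1 \mid \mathbf{y}) \approx 0.79$, in close agreement with the MCMC estimate above.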

For others seeking more examples, the changepoint model from the Stan user’s guide proved useful…