Mixture priors in Stan


Can we assign a mixture prior to certain parameters in Stan? For example, if I would like to assign a mixture normal prior, lambda*Normal(0, sigma_0) + (1-lambda)*Normal(0, sigma_1), to the one-dimensional parameter mu, could I simply write:

target += log_sum_exp(log(lambda) + normal_lpdf(mu|0,sigma_0),
                          log1m(lambda) + normal_lpdf(mu|0,sigma_1));

Could we generalize this idea if mu is now a vector of parameters and each of its components follows the same mixture normal prior?

A more general question: I am a little confused about how target is used differently for priors and likelihoods. For example, in a mixture model we typically loop through every observation and sum the log-likelihoods. But if we would like to write a target statement for a prior, should we just write it once (because there is nothing to sum)? For example, are the following two expressions equivalent for the parameter mu?

mu ~ normal(0, sigma);


target += normal_lpdf(mu | 0, sigma);


There’s a blurb describing this in the manual so I’ll point you there cause it does a better job than I would: 5.4 Vectorizing mixtures | Stan User’s Guide

You gotta stare at the math a while, but currently there’s no way to vectorize mixtures in Stan.
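To make the "stare at the math" part a bit more concrete, here is a rough sketch of why a naive vectorized call doesn't give the prior asked about. It assumes the components of mu are independent a priori and uses K (my notation, not from the post) for the length of mu. The per-component mixture prior is a sum of logs of mixtures:

```latex
\log p(\mu) = \sum_{i=1}^{K} \log\Big( \lambda \,\mathrm{Normal}(\mu_i \mid 0, \sigma_0)
            + (1-\lambda)\,\mathrm{Normal}(\mu_i \mid 0, \sigma_1) \Big)
```

whereas plugging vectorized normal_lpdf calls (which return sums over the whole vector) into log_sum_exp would compute the log of a mixture of products:

```latex
\log\Big( \lambda \prod_{i=1}^{K} \mathrm{Normal}(\mu_i \mid 0, \sigma_0)
        + (1-\lambda) \prod_{i=1}^{K} \mathrm{Normal}(\mu_i \mid 0, \sigma_1) \Big)
```

The second expression says all components of mu come from the same mixture component, which is a different model.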

The difference between those two statements is that mu ~ normal(0, sigma); drops additive constants that MCMC doesn't need, meaning terms that don't depend on any parameter (for instance, the log(sigma) term if sigma is constant). target += normal_lpdf(mu | 0, sigma); accumulates the log of the fully normalized pdf (like if you computed it with dnorm(..., log = TRUE)).
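For reference, this is the standard normal log density written out for the zero-mean case from the question, so the droppable terms are visible:

```latex
\log \mathrm{Normal}(\mu \mid 0, \sigma)
  = -\tfrac{1}{2}\log(2\pi) \;-\; \log\sigma \;-\; \frac{\mu^{2}}{2\sigma^{2}}
```

The first term never depends on parameters, so the ~ form can always drop it. The second term can be dropped only when sigma is data or a constant. The last term depends on mu and is always kept.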

target += normal_lupdf(mu | 0, sigma) is equivalent to mu ~ normal(0, sigma).
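A minimal sketch of the three forms side by side, assuming a Stan version recent enough to support the _lupdf suffix and using a fixed scale of 1 purely for illustration:

```stan
parameters {
  real mu;
}
model {
  // Use exactly ONE of these three lines. Including more than one
  // would apply the prior more than once.
  mu ~ normal(0, 1);                     // may drop constant terms
  // target += normal_lupdf(mu | 0, 1);  // same behavior as the ~ statement
  // target += normal_lpdf(mu | 0, 1);   // keeps all normalizing constants
}
```

All three give the same posterior draws for mu; only the reported value of lp__ can differ by a constant.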

(Side note: this is some syntax we introduced last fall, but when I went looking for it today I realized I forgot to ever finish the docs, oops (Added chapter on proportionality constants (Issue #287) by bbbales2 · Pull Request #288 · stan-dev/docs · GitHub), so thanks for the reminder.)

Thanks so much for your kind reply!

So basically, if mu is a vector of parameters and we would like to assign a mixture prior to it, should we loop through its components like the following:

data {
  int<lower=0> N;                  // number of observations (must be an int)
  vector[N] x;
  vector[N] y;
  vector[N] z;
}
parameters {
  vector[3] mu;
  vector<lower=0>[3] sigma_0;
  vector<lower=0>[3] sigma_1;
  real<lower=0, upper=1> lambda;   // mixing proportion
}
model {
  // mixture prior, applied to each component of mu in turn
  for (i in 1:3) {
    target += log_sum_exp(log(lambda) + normal_lpdf(mu[i] | 0, sigma_0[i]),
                          log1m(lambda) + normal_lpdf(mu[i] | 0, sigma_1[i]));
  }
  x ~ normal(mu[1], 0);
  y ~ normal(mu[1], 0);
  z ~ normal(mu[1], 0);
}


There is a log_mix function to make this sort of thing easier to read: 3.14 Composed functions | Stan Functions Reference

But yeah, you gotta loop through things one by one.
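A minimal sketch of that loop using log_mix. The fixed scales 1 and 10 are placeholders I picked for illustration, not values from this thread:

```stan
parameters {
  vector[3] mu;
  real<lower=0, upper=1> lambda;  // mixing proportion
}
model {
  for (i in 1:3) {
    // log_mix(lambda, lp1, lp2) computes
    // log_sum_exp(log(lambda) + lp1, log1m(lambda) + lp2)
    target += log_mix(lambda,
                      normal_lpdf(mu[i] | 0, 1),
                      normal_lpdf(mu[i] | 0, 10));
  }
}
```

This adds the same quantity to target as the hand-written log_sum_exp version, just with less room for typos.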

x ~ normal(mu[1], 0);

You’ll want standard deviation > 0 there.

That’s very helpful!

So can I think of the ‘target +=’ expression as applying to both the prior distribution and the likelihood? Typically the parameters or hyperparameters in a prior are one-dimensional, so we don’t need to vectorize or loop over them, whereas for the data likelihood we almost always vectorize or loop. In summary, though, the usage of the ‘target +=’ expression is the same for both. Am I right?


Stan doesn’t differentiate in code between likelihood and priors, so yeah the target += does the same thing always.
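As a rough illustration, here is a sketch where the priors and the likelihood are written the same way. The model and the prior scales are made up for the example:

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  target += normal_lpdf(mu | 0, 5);      // prior on mu
  target += normal_lpdf(sigma | 0, 2);   // prior on sigma (half-normal via the lower=0 constraint)
  target += normal_lpdf(y | mu, sigma);  // likelihood, summed over all N observations
}
```

Every statement just adds a log density term to the same accumulator; which terms you call "prior" and which "likelihood" is a matter of interpretation.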

Similarly we try to make things as vectorized as we can (it can make stuff go faster) regardless of prior or likelihood, but if it’s not possible, loops it is.
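A small sketch of the vectorized and looped forms for a simple normal likelihood (a hypothetical model, not the one from this thread):

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // Vectorized: one statement contributes all N terms.
  y ~ normal(mu, sigma);

  // Equivalent loop (use it INSTEAD of the line above, not in addition):
  // for (n in 1:N) {
  //   y[n] ~ normal(mu, sigma);
  // }
}
```

Both define the same model; the vectorized form is usually faster because shared terms are computed once.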