Hello, I have used reduce_sum before by following the wonderful case study tutorial. In that example, the likelihood can be vectorized. Is it still possible to use reduce_sum when the likelihood cannot be vectorized? Specifically, I can use log_mix to fit a 2-component finite mixture model, but that likelihood cannot be vectorized (as far as I know). Thanks.
Not being able to vectorize does not mean you can't use reduce_sum. Can you write your log likelihood as one big sum? If yes, then things are promising; if not, can you write it as the sum of a few big sub-sums? reduce_sum is about big and costly sums.
Thank you for your reply. The finite mixture log-likelihood for 2 classes is like this:
for (n in 1:N) {
  target += log_mix(lambda,
                    normal_lpdf(y[n] | mu[1], sigma[1]),
                    normal_lpdf(y[n] | mu[2], sigma[2]));
}
A for-loop is needed here and cannot be vectorized (according to the Stan manual). With more than 2 classes, log_mix is replaced by yet another for-loop over the components.
So I don't see how this log likelihood can be expressed as one big sum or a few big sub-sums.
This for-loop already is a big sum over N terms. That's what the += is doing: each iteration of the loop adds one more term to the target.
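Written out, the loop computes

$$\log p(y \mid \lambda, \mu, \sigma) = \sum_{n=1}^{N} \log\big( \lambda \, \mathcal{N}(y_n \mid \mu_1, \sigma_1) + (1 - \lambda) \, \mathcal{N}(y_n \mid \mu_2, \sigma_2) \big),$$

which is exactly the kind of big sum over independent terms that reduce_sum can break into partial sums and evaluate in parallel.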
Oh. Thanks for pointing me in the right direction. Would this work?
functions {
  real partial_sum_ll(real[] y_slice,
                      int start, int end,
                      vector mu,
                      vector sigma,
                      real lambda) {
    real total = 0;
    for (i in start:end) {
      total += log_mix(lambda,
                       normal_lpdf(y_slice[i] | mu[1], sigma[1]),
                       normal_lpdf(y_slice[i] | mu[2], sigma[2]));
    }
    return total;
  }
}
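And then in the model block I guess I would call it with something like this (assuming y is declared as real y[N] in the data block; a grainsize of 1 leaves the size of the slices up to the scheduler):

model {
  // priors on mu, sigma, lambda go here
  target += reduce_sum(partial_sum_ll, y, 1, mu, sigma, lambda);
}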
Almost. The indexing is wrong. Maybe have another look at the within-chain parallelization case study?
Could you tell me where I went wrong?
I read the tutorial again.
It seems to me I should just loop from start to end.
Not quite: y_slice only contains the sliced-out elements, so it runs from 1 to end - start + 1. Inside the loop you need to index it with i - start + 1, or loop over 1:(end - start + 1) instead.
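So the corrected partial sum function would look something like this:

real partial_sum_ll(real[] y_slice,
                    int start, int end,
                    vector mu,
                    vector sigma,
                    real lambda) {
  real total = 0;
  // y_slice holds end - start + 1 elements, indexed from 1
  for (i in 1:(end - start + 1)) {
    total += log_mix(lambda,
                     normal_lpdf(y_slice[i] | mu[1], sigma[1]),
                     normal_lpdf(y_slice[i] | mu[2], sigma[2]));
  }
  return total;
}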
Oh, I get it now. Thank you very much!