How to use reduce_sum when lpmf not directly added to target

Hi all,

I need some help understanding how I can apply within-chain parallelization to the way I am currently calculating the log-likelihood. Specifically, what I am doing is this: I have N observation-pairs (y, option) where y is the number of people who chose that option in an ordered multiple-choice question.

The data covers multiple questions, so let’s say the first six of those N observation-pairs could look like this: (15, 1); (25, 2); (65, 3); and then moving on to the next question (35, 1); (45, 2); (25, 3).

Without parallelization, my code for the likelihood looks like this:

  for(n in 1:N){
    target += y[n] * ordered_logistic_lpmf( options[n] | ... );
  }

The logic being that each of these y persons all had a probability of choosing that option denoted by ordered_logistic_lpmf. So we can just add the result of ordered_logistic_lpmf to the target y times.

Looking at the reduce_sum tutorial, I can see that the parallelized version of this could look something like this in the function block:

real partial_sum_lpmf(int[] slice_y,
                        int start, int end,
                        int[] option, ...) {
    return ordered_logistic_lpmf( options[start:end] | ... );
  }

and this in the model block:

int grainsize = 1;
target += reduce_sum(partial_sum_lupmf, y, grainsize, option, ...);

But what do I do with the multiplication by y? Thanks!

You just need to specify your partial_sum_lpmf as a loop over slice_y:

real partial_sum_lpmf(int[] slice_y,
                        int start, int end,
                        int[] option, ...) {
  for (i in start:end) {
    return slice_y[i - start + 1] * ordered_logistic_lpmf( options[i] | ... );
  }
}

Right, I understand that logic, thank you. And I suppose you are indexing slice_y by [i - start + 1] whereas options is indexed by i, because:

slice_y will have indices from 1 to (end-start+1), whereas we need to select the relevant slice of options and any other argument by hand, by using the indices start through end.

Is that correct?

And: I can still use partial_sum_lupmf in the target += increment statement as normal, right?

slice_y will have indices from 1 to (end-start+1), whereas we need to select the relevant slice of options and any other argument by hand, by using the indices start through end.

Is that correct?

Yep

And: I can still use partial_sum_lupmf in the target += increment statement as normal, right?

Yes, though I’ve just realised that there’s an error in the code snippet I posted. The partial_sum_lpmf needs to accumulate the lpmf into a single real and then return that.

The correct snippet would be:

real partial_sum_lpmf(int[] slice_y,
                        int start, int end,
                        int[] option, ...) {
  vector[end - start + 1] result;
  for (i in start:end) {
    result[i - start + 1] = slice_y[i - start + 1] * ordered_logistic_lpmf( options[i] | ... );
  }
  return sum(result);
}
2 Likes