# Using _lupmf for multivariate likelihood in reduce_sum

Hello all. I see that Stan doesn’t allow `_lupmf` functions to be used outside of the `model` block or a user-defined probability function.

My issue is that I am trying to use `multinomial_lupmf` inside `reduce_sum`. I need to pass an `int[,]` as the first argument (to be able to loop over the integer observables), but any user-defined `_lpmf` function needs type `int[]` there.

Is there a way around this?

My example follows:

``````
real partial_sum_lpmf(int[ , ] selected_slice, int start, int end,
                      matrix eta) {
  real ret_val = 0;
  for (n in start:end) {
    ret_val += multinomial_lupmf(selected_slice[n - start + 1] | softmax(eta[n]'));
  }
  return ret_val;
}
``````

and I would call it in the `model` block as
`target += reduce_sum_lpmf(partial_sum, observed, grainsize, eta)`

Hey,

you need to call it as

``````
target += reduce_sum(partial_sum_lupmf, observed, grainsize, eta);
``````

and I would just use a dummy `int[]` for the slice argument and pass the `int[,]` in as one of the other arguments.

Sorry yeah that was a typo (how I called it).

Thanks for the tip! Also, I can pass it in as `data`. Does that help speed anything up (in terms of the autodiff tape)?

Do you mean for the dummy variable? For any unused args it makes no difference in terms of the AD tape. It also makes no difference for `int`, `int[]`, `int[,]`, … as integers are never autodiff variables.

Could you clarify what this code would look like? I don’t fully understand the example. Thanks.
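
The dummy-slice suggestion above could look roughly like the sketch below. It is not taken from the original replies: the names `dummy`, `counts`, `N`, `K` and the standard-normal prior on `eta` are assumptions made for illustration. It uses the `int[]` array syntax from this thread; newer Stan releases spell these types `array[] int` and `array[,] int`.

```stan
functions {
  // The sliced argument is a dummy array; reduce_sum only uses it to
  // decide how to split 1:N. The real observations travel as the shared
  // argument `counts` (hypothetical name), so they are indexed by n.
  real partial_sum_lpmf(int[] dummy_slice, int start, int end,
                        int[ , ] counts, matrix eta) {
    real ret_val = 0;
    for (n in start:end) {
      ret_val += multinomial_lupmf(counts[n] | softmax(eta[n]'));
    }
    return ret_val;
  }
}
data {
  int<lower=1> N;
  int<lower=1> K;
  int<lower=0> counts[N, K];
  int<lower=1> grainsize;
}
transformed data {
  int dummy[N] = rep_array(0, N);  // contents are never read
}
parameters {
  matrix[N, K] eta;
}
model {
  to_vector(eta) ~ std_normal();  // assumed prior, for illustration only
  target += reduce_sum(partial_sum_lupmf, dummy, grainsize, counts, eta);
}
```

Because `counts` is a shared argument rather than the sliced one, it is indexed with `n` directly instead of `n - start + 1`.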

Yes, the more variables you make primitives (aka data), the faster the autodiff will be. The general rule is that any quantity that depends only on data and transformed data should be defined in the `transformed data` block rather than as a transformed parameter.
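
As a small illustration of that rule (a sketch with made-up names, not code from this thread), a centered design matrix depends only on data, so it can be computed once in `transformed data` instead of being rebuilt on every gradient evaluation. The regression model and its priors are assumptions chosen only to make the program complete.

```stan
data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] X;
  vector[N] y;
}
transformed data {
  // Depends only on data: computed once and stored as plain doubles,
  // so it never enters the autodiff tape.
  matrix[N, K] X_centered;
  for (k in 1:K) {
    X_centered[, k] = X[, k] - mean(X[, k]);
  }
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 1);
  sigma ~ exponential(1);
  y ~ normal(alpha + X_centered * beta, sigma);
}
```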

I’m trying to implement this example and I’m getting a weird error telling me I am missing a parenthesis:

``````
real partial_sum_lpmf(int[,] selected_slice, int start, int end, matrix ptilde ){
  real ret_val = 0 ;
  for (n in start:end) {
    ret_val += multinomial_lupmf( selected_slice[n-start+1] | ptilde[n] ) ;
  }
  return (ret_val) ;
}
``````
``````
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in 'model213066584665_space_partitioning_SDM_v4' at line 5, column 34
-------------------------------------------------
3:     real ret_val = 0;
4:     for (n in start:end) {
5:       ret_val += multinomial_lupmf(selected_slice[n-start+1] | ptilde[n]) ;
^
6:     }
-------------------------------------------------

PARSER EXPECTED: "("
Error in stanc(file = file, model_code = model_code, model_name = model_name,  :
failed to parse Stan model 'space partitioning SDM v4' due to the above error.
``````

Using the latest CmdStan (2.30) via cmdstanr, I get this more informative error message:

``````
3:      real ret_val = 0 ;
4:      for (n in start:end) {
5:        ret_val += multinomial_lupmf(selected_slice[n-start+1] | ptilde[n] ) ;
^
6:      }
7:      return (ret_val) ;
-------------------------------------------------

Ill-typed arguments supplied to function 'multinomial_lupmf':
(array[] int, row_vector)
Available signatures:
(array[] int, vector) => real
The second argument must be vector but got row_vector
``````

The fix is to transpose (I’ve also converted to our standard formatting).

``````
functions {
  real partial_sum_lpmf(int[ , ] slice, int start, int end, matrix ptilde) {
    real ret_val = 0;
    for (n in start:end) {
      ret_val += multinomial_lupmf(slice[n - start + 1] | ptilde[n]');
    }
    return ret_val;
  }
}
``````

If possible, declaring `ptilde` as an array of vectors rather than a matrix would be more efficient because it doesn’t require the transposition.
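
A rough sketch of that alternative, assuming `ptilde` can be declared directly as an array of simplexes; the names `counts`, `N`, `K`, and `grainsize` and the surrounding blocks are illustrative assumptions, and the array syntax matches the older style used in this thread:

```stan
functions {
  real partial_sum_lpmf(int[ , ] slice, int start, int end, vector[] ptilde) {
    real ret_val = 0;
    for (n in start:end) {
      // ptilde[n] is already a vector, so no transpose is needed.
      ret_val += multinomial_lupmf(slice[n - start + 1] | ptilde[n]);
    }
    return ret_val;
  }
}
data {
  int<lower=1> N;
  int<lower=1> K;
  int<lower=0> counts[N, K];
  int<lower=1> grainsize;
}
parameters {
  simplex[K] ptilde[N];  // one probability vector per observation
}
model {
  target += reduce_sum(partial_sum_lupmf, counts, grainsize, ptilde);
}
```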

It’s too bad our `multinomial_lupmf` isn’t vectorized yet, or this could be coded as a one-liner.
