# Custom Discrete Probability Mass Function

I have a fairly basic modelling question.
I am fitting a mixture model to some ordinal data reflecting confidence ratings pertaining to a judgment. One component of the mixture is a typical ordinal regression model. However, the other component is meant to reflect trials on which participants always respond with maximal confidence (in my case, 6). The particular details are unimportant, as my question is limited to implementing the probability mass function for the latter component. I believe I have done so correctly, but as this is my first time doing this, I wanted to check. I defined that function as:

    real rec_dist_lpmf(int y, real mu) {
      if (y == 6) {
        return(log(.9995));
      } else {
        return(log(.0001));
      }
    }

(This is being used within brms, which is why `mu` is included as an argument even though the function does not use it.)

In effect, if y = 6 the function returns approximately log(1), and if y is 1 to 5 it returns approximately log(0). I could not return exactly log(0), as that would be -Inf. Is this correct, or is there a more efficient way to do this? My concern is that assigning a probability of .0001 to each of the values 1 through 5 feels arbitrary.

(Minor edit to add: this pertains to the model discussed in this post)

Cheers!
Jon

I suppose that if there is a 0.0001 probability of each of y = 1, 2, 3, 4, 5, then this function defines the logarithm of a valid PMF. However, I don’t see how it would make sense to use something like this to estimate the parameters of a generative model.
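To spell out the validity check: writing the small probability as ε (a symbol introduced here for illustration, with ε = 0.0001 in the function above), the mass over the six response categories sums to one, and the distribution approaches a point mass at 6 as ε shrinks:

```latex
p_\varepsilon(y) =
\begin{cases}
  1 - 5\varepsilon, & y = 6, \\
  \varepsilon,      & y \in \{1, \dots, 5\},
\end{cases}
\qquad
\sum_{y=1}^{6} p_\varepsilon(y) = (1 - 5\varepsilon) + 5\varepsilon = 1 .
```

With ε = 0.0001 this gives 1 − 5ε = 0.9995, matching the value returned for y = 6.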

Thanks for the response, Ben!
Is there a more principled way to model a mixture where one component entails always making the highest response (chosen from an ordinal set)?

I am not sure whether it is helpful, but I am using this lpmf as part of an attempt to fit a variant of the “hierarchical dual-process model” described on p. 39 and p. 45 of this article. The model assumes that some trials derive from a process approximated by an ordinal regression and for other trials participants will simply respond with the highest confidence rating available. The provided function is meant to reflect the latter.
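One way to avoid the arbitrary .0001 might be to write the whole mixture as a single log-PMF, in the same spirit as zero-inflated count models: the top rating can come from either process, while every other rating can only come from the ordinal component. The following is a minimal sketch in plain Stan, not a tested brms custom family. The function name, the mixing weight `theta`, the linear predictor `eta`, and the cutpoints `c` are all hypothetical names introduced here, and whether this matches the article's model exactly is an assumption.

```stan
functions {
  // Hypothetical helper (not from the original post): log PMF of a mixture of
  // a point mass at the top category K and an ordered-logistic component.
  //   y     : observed rating in 1..K
  //   theta : assumed mixing weight of the "always respond K" process, in (0, 1)
  //   eta   : linear predictor of the ordinal component
  //   c     : ordered cutpoints of the ordinal component (length K - 1)
  real top_inflated_ordinal_lpmf(int y, real theta, real eta, vector c) {
    int K = num_elements(c) + 1;
    real lp_ord = ordered_logistic_lpmf(y | eta, c);
    if (y == K) {
      // Either process can produce the top rating; the point mass
      // contributes log(1) = 0 on the log scale.
      return log_mix(theta, 0.0, lp_ord);
    }
    // Ratings below K can only come from the ordinal component,
    // so no log(0) term is ever evaluated.
    return log1m(theta) + lp_ord;
  }
}
```

In a model block this could be used as `target += top_inflated_ordinal_lpmf(y[n] | theta, eta[n], c);`, assuming `y`, `theta`, `eta`, and `c` are declared elsewhere. Because the point mass is handled by the branch on `y == K`, no small probability has to be chosen for ratings 1 through 5.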