Writing an "Explainable" Bernoulli model?

mathDR · November 19, 2024, 2:00pm

Hello Stanimals! I have a generic question I am trying to solve.

I have a retail dataset that consists of client-item pairs and whether or not the client purchased the item (Bernoulli). Furthermore, I have k types of feedback from each purchase:

price loved, price okay, price too high
fit too small, fit okay, fit perfect
not my style, style okay, loved style
etc.

The typical feedback levels are: poor, okay, great.

So my ask is: I would like to build an “explainable” model that tells me not only the client-item Bernoulli probability of sale, but also why the client might or might not have purchased the item.

I was thinking something like:

target += Bernoulli(sold| \theta[client-item_s])
target += Bernoulli(not_sold|1-\theta[client-item_ns])
target += Dirichlet(\theta[client-item_s] | \prob_price_s, \prob_fit_s, \prob_style_s,...)
target += Dirichlet(1-\theta[client-item_ns] | \prob_price_ns, \prob_fit_ns, \prob_style_ns,...)
target += Bernoulli(feedback_price[client-item_s] | \prob_price_s)
target += Bernoulli(feedback_price[client-item_ns] | \prob_price_ns)
...

where the _s and _ns suffixes refer to “sold” and “not sold”, respectively. Here, I decomposed the data into sold and not-sold events and partitioned the client-item pairings accordingly.

I am modeling the feedback above separately for each group: that is, for a non-purchase we expect the price feedback to be either “poor” or “good”, and we would bucket it accordingly. The same goes for the other feedback types.

The full model will also learn embeddings for the client and items so as to allow for unseen items and clients – this will help estimate if a client would buy an item in the future.

My question is: does this look right? I can’t remember ever seeing a Dirichlet used like this before, but it kinda looks like a multivariate Beta-Binomial but trying to “explain” things.

Of course, there will be priors on the prob_price, prob_fit, ...

Any ideas would be super helpful. This will be a HUGE model to fit, so getting this generative part completed will go a long way to helping implement it.

Thanks!

Bob_Carpenter · November 19, 2024, 7:53pm

The pseudocode is inconsistent, so I’m not sure what you intend. For example, the Dirichlet distribution is over simplexes, but the Bernoulli distribution requires a probability, so I can’t figure out what you mean by these two lines:

target += Bernoulli(sold| \theta[client-item_s])
target += Dirichlet(\theta[client-item_s] | \prob_price_s, \prob_fit_s, \prob_style_s,...)

Also, the Dirichlet takes a single vector as the parameter, but you’re providing variadic arguments of unclear length.
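To make the mismatch concrete, here is a minimal Python sketch (not Stan, and not the poster's model). It draws from a Dirichlet by normalizing Gamma draws, which is a standard construction. The concentration vector `alpha` and the helper name `dirichlet_draw` are made up for illustration. The point is that the draw is a whole vector summing to one, whereas a Bernoulli needs a single number in [0, 1].

```python
import random

def dirichlet_draw(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing independent Gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(0)

# Hypothetical concentration vector, one entry per feedback type
# (say price, fit, style). These values are assumptions, not from the thread.
alpha = [2.0, 1.0, 3.0]

theta = dirichlet_draw(alpha, rng)

# theta is a simplex: a vector of positive entries that sum to one.
print("draw:", theta)
print("sum :", sum(theta))

# A Bernoulli parameter, by contrast, is one scalar probability.
# Passing the whole vector theta to a Bernoulli is a type error in spirit:
# at most a single component, e.g. theta[0], could play that role.
```

So a Dirichlet on the left-hand side only makes sense if the thing being modeled is itself a vector of shares, not the scalar sale probability.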

I assume client-item_ns is not intended to be a subtraction even though that’s what you wrote?

Why isn’t the probability of not selling equal to one minus the probability of selling? I don’t get what you mean by these two lines:

target += Bernoulli(sold| \theta[client-item_s])
target += Bernoulli(not_sold|1-\theta[client-item_ns])
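For comparison, a minimal Python sketch of the usual setup, assuming one probability per client-item pair: a single `theta[n]` covers both outcomes, with a sale contributing log(theta[n]) and a non-sale contributing log(1 - theta[n]). The data and probabilities below are invented for illustration.

```python
import math

def bernoulli_log_lik(sold, theta):
    """Sum of Bernoulli log-likelihood terms, one probability per observation.

    sold[n] is 1 if pair n resulted in a sale and 0 otherwise;
    theta[n] is the sale probability for pair n.
    """
    total = 0.0
    for y, p in zip(sold, theta):
        # The same p is used for both outcomes: P(not sold) = 1 - P(sold).
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

# Hypothetical outcomes and sale probabilities for five client-item pairs.
sold = [1, 0, 1, 1, 0]
theta = [0.7, 0.2, 0.6, 0.9, 0.4]

log_lik = bernoulli_log_lik(sold, theta)
print(log_lik)
```

With this form there is no need to partition the data into sold and not-sold subsets or to give them separate parameters.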

If you only have the feedback for purchased items, I don’t see how you can fit the model. Do you have any feedback on items that were not purchased?

Also, rather than treating each purchase decision as Bernoulli, don’t you want to link them somehow? If I’m buying a new phone, I don’t make a bunch of independent Bernoulli decisions, I make one categorical decision.
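As a rough sketch of what "one categorical decision" could look like, the Python below turns per-item utilities into choice probabilities with a softmax, so the options compete with each other and the probabilities sum to one. The utilities and the inclusion of a "buy nothing" option are assumptions for illustration only.

```python
import math

def softmax(utilities):
    """Map real-valued utilities to probabilities that sum to one."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical utilities for three candidate items plus a "buy nothing" option.
options = ["item_a", "item_b", "item_c", "buy_nothing"]
utilities = [1.2, 0.3, -0.5, 0.0]

probs = softmax(utilities)
for name, p in zip(options, probs):
    print(f"{name}: {p:.3f}")
```

Under independent Bernoullis, raising one item's appeal leaves the others untouched. Here it necessarily lowers the probability of every other option.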

mathDR · November 19, 2024, 10:19pm

Apologies, I realize I did not explain this very well. Let me try again:

I have data for each client-item pair:
whether it was sold (0 or 1)
regardless of whether it was sold, feedback on whether

the price was too high or not
it fit correctly or not
the quality of the clothing was okay or not
the size was correct or not
the style was appropriate or not

I have a set of client features for each client and a set of features for each item.

I would like to determine both:
the probability that a client would purchase an item
an “explanation” of why (or why not) they would purchase it.

For the first part I will use a Bernoulli (since I want granular probabilities at the level of the client-item pairing). For the latter I want to determine “why” they buy it or not based on a “mixture” of probabilities.

Like: the probability that client c will purchase item i is 62%, and that is made up of

price = 34%
size = 12%
quality = 40%
fit = 8%
style = 6%

From that I surmise that it is a high-quality item that is appropriately priced.

Upon fitting the model, I should be able to estimate the probability of sale, and the explanation for it, for client-item pairings that weren’t in the original dataset (provided the client and the item themselves were in the training set).

Does this seem reasonable?

Bob_Carpenter · December 10, 2024, 9:58pm

Sorry for not responding earlier—I just saw this now.

Did you mean something like a logistic regression based on the covariates you listed?

I don’t know how you’re going to get the contributions in the form of probabilities like this. If you use a logistic regression, as would be standard here, this isn’t quite the right interpretation because inv_logit(a + b) != inv_logit(a) + inv_logit(b).
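A quick numeric check of that inequality in Python, with arbitrary values for `a` and `b` chosen only for illustration:

```python
import math

def inv_logit(x):
    """Logistic function: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Arbitrary log-odds contributions, e.g. from two covariates (assumed values).
a, b = 0.5, 1.0

joint = inv_logit(a + b)                # probability from the summed log-odds
separate = inv_logit(a) + inv_logit(b)  # "adding probabilities" instead

print("inv_logit(a + b)            =", joint)
print("inv_logit(a) + inv_logit(b) =", separate)

# The two disagree, and the second is not even a valid probability here
# because it exceeds 1.
print("separate > 1:", separate > 1.0)
```

The contributions are additive on the log-odds scale, not on the probability scale, so any breakdown into per-factor percentages would have to be defined on the log-odds scale or by some other convention.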

mathDR · December 11, 2024, 9:16pm

Yeah, the more I think about it, the more I need to rethink it!
