Fitting a multivariate Bernoulli distribution

A_B · August 7, 2022, 5:02pm

Hi,
I am new to Stan and have the following question.

I have n realizations of 8 Bernoulli random variables X_1,...,X_8. I want to estimate the pointwise mutual information, pmi_{ij} = \log\frac{P(X_i=1, \ X_j=1)}{P(X_i = 1)P(X_j=1)} for each pair, i.e. with i,j \in 1,...,8 and i \ne j. So I need to estimate the marginal and pairwise joint probabilities.

So far, I have two ideas:

Create a matrix with 8 columns for the marginal responses, and a matrix with 28 columns for the joint responses (28 possible pairs). Then, estimate a the probability of response for each column using likelihood x ~ bernoulli(p), with some prior on p.

Create a matrix with 256 columns, where each column is a indicator variable for one specific combination of X_1,...,X_8. The proceed as above, and sum the appropriate p’s to get the marginal and pairwise joint probabilities.

The first approach seems preferable (since I don’t care about the joint probability of more than 2 variables). Is there any reason to prefer the second approach? Or is there a better approach alltogether?

Thank you.

Bob_Carpenter · August 26, 2022, 9:16pm

Sorry your query got missed earlier, A_B.

If what you want to compute is pmi[i, j] then you only need the 28 pairs. But what you will get in Stan if you do this in the generated quantities block is a posterior distribution of pmi[i, j] values.

An alternative would be to run Stan and get point estimates of the probabilities, then plug those in on the outside. They won’t be the same as they’re not linear functions.

It doesn’t matter what your model is for \textrm{Pr}[X_i = 1] in this scenario. You can just set them up as simple independent Bernoullis or you could build a more sophisticated model that deals with correlation among them.

Topic		Replies	Views
Multivariate bernoulli model Modeling specification	6	1086	February 26, 2021
Truncated DP for p-dimensional Bernoulli mixtures Modeling	0	344	February 3, 2019
Mixture of Multivariate Bernouilli Modeling	5	879	September 29, 2019
Integer array in parameters block Modeling discrete-parameters	8	657	April 26, 2023
Modelling the indicator that a sum of Bernoulli random variables is positive Modeling	2	384	January 3, 2021

Fitting a multivariate Bernoulli distribution

Related topics