How to code correlations between Bernoulli trials?

JohnDoe · September 5, 2021, 10:59pm

Let’s say that P persons are throwing a basketball into a set of B baskets which have different radii, and the outcome S[p,b] is binary, indicating whether person p scored on basket b. Let a[p] be person p’s throwing accuracy and r[b] the radius of the b-th basket. The model could be formulated in Stan as such:

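model {
  // P persons and B baskets are assumed to be declared in the data block
  for (p in 1:P)
    for (b in 1:B)
      S[p, b] ~ bernoulli_logit(a[p] + r[b]);
}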

However, what if we wanted to include correlations between trials? Let’s say that scoring a basket gives you a confidence boost, which in turn increases your probability of scoring on the next shots. Likewise, failing to score makes you more anxious and decreases your chances of scoring on subsequent shots. So if S[p,1] = 1, the chances that S[p,2] or S[p,3] will also be 1 are increased.

How could we code these correlations in Stan? If I was dealing with continuous variables, I guess I would use the multi_normal_lpdf function and pass the covariances between trials via the Sigma argument, but I don’t see a way to do that with binary variables since the bernoulli_logit or other related functions don’t have arguments related to the covariance structure.

Furthermore, I don’t have a feeling that simply using the binomial distribution would estimate these correlations. I may be incorrect with this, though. Any help would be appreciated.

sonicking · September 6, 2021, 12:29am

The easiest approach is to fit a multilevel model with a random intercept across people. This will introduce correlation among each person’s shots.
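A minimal sketch of that random-intercept idea, not a definitive implementation. Everything beyond S, a and r is an assumption added for illustration: the names P, B, mu, sigma_a, a_raw and beta, the priors, and the slope on the radius (which follows the suggestion made later in this thread). The array[...] syntax assumes a reasonably recent Stan version.

```stan
data {
  int<lower=1> P;                        // number of persons (assumed name)
  int<lower=1> B;                        // number of baskets (assumed name)
  array[P, B] int<lower=0, upper=1> S;   // S[p, b] = 1 if person p scored on basket b
  vector[B] r;                           // basket radii, ideally standardized
}
parameters {
  real mu;                 // population-average accuracy on the logit scale
  real<lower=0> sigma_a;   // between-person standard deviation
  vector[P] a_raw;         // standardized person effects (non-centered)
  real beta;               // slope on the radius (assumption, see lead-in)
}
transformed parameters {
  vector[P] a = mu + sigma_a * a_raw;    // person-specific accuracy
}
model {
  // Illustrative weakly informative priors; these are assumptions, not from the thread
  mu ~ normal(0, 1.5);
  sigma_a ~ normal(0, 1);
  a_raw ~ std_normal();
  beta ~ normal(0, 1);
  for (p in 1:P) {
    // All B shots of person p share a[p], which makes them marginally correlated
    S[p] ~ bernoulli_logit(a[p] + beta * r);
  }
}
```

Note that this kind of correlation comes from shared skill: a person's shots are alike because they all depend on a[p], not because one shot influences the next.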

jsocolar · September 6, 2021, 12:48am

It sounds like you want some sort of autoregressive structure on the outcome. In most GLMs, we can introduce an autoregressive structure on the link scale by estimating observation-specific random effects that are modeled with an autoregressive structure. The challenge is that in Bernoulli models, the standard deviation of the observation-specific residuals (where the autoregressive structure would show up) is not identified. There are at least two options:

One option is to fix the standard deviation a priori, but this choice will likely seem quite arbitrary.
Another option is to fit an auto-logistic model, where the outcome of the previous shot is included as a covariate on the outcome of the current shot.
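The second option could be sketched roughly as follows. This is only an illustration under stated assumptions: each person is assumed to attempt the baskets in the order b = 1, ..., B, so that column b - 1 of S is the previous shot, and the names P, B, beta and gamma_prev as well as the priors are made up for this example.

```stan
data {
  int<lower=1> P;                        // number of persons (assumed name)
  int<lower=2> B;                        // number of baskets, attempted in order 1..B (assumption)
  array[P, B] int<lower=0, upper=1> S;   // S[p, b] = 1 if person p scored on basket b
  vector[B] r;                           // basket radii, ideally standardized
}
parameters {
  vector[P] a;       // person accuracy on the logit scale
  real beta;         // slope on the radius (assumption)
  real gamma_prev;   // shift in log-odds after a made shot, relative to after a miss
}
model {
  // Illustrative priors; these are assumptions, not from the thread
  a ~ normal(0, 1.5);
  beta ~ normal(0, 1);
  gamma_prev ~ normal(0, 1);
  for (p in 1:P) {
    // The first shot has no predecessor, so it gets no previous-outcome term
    S[p, 1] ~ bernoulli_logit(a[p] + beta * r[1]);
    // Shots 2..B: the previous outcome enters as an ordinary covariate
    S[p, 2:B] ~ bernoulli_logit(a[p] + beta * r[2:B]
                                + gamma_prev * to_vector(S[p, 1:(B - 1)]));
  }
}
```

With the 0/1 coding used here, a positive gamma_prev means a made shot raises the log-odds of the next shot relative to a miss. Recoding the previous outcome as -1/+1 would instead express a symmetric boost and penalty around a[p].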

Also note that in your formulation of the model you presumably want to include a coefficient (i.e. a slope term) that multiplies the radius covariate r[b].
