Hierarchical logistic regression: correlation between regression weights

Dear all,

I have fitted a hierarchical regression model with brms.

My data structure is: ~380 human participants, each of whom provides ~600 measurements (one of two button presses). I have different factors (e.g. ‘Cost’ or ‘OfferValue’) to predict those choices. I have z-scored the regressors across the pooled data from all participants (rather than z-scoring within each participant).
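For readers who want to see the difference between the two scaling options, here is a minimal sketch in base R. The data values are made up and the helper `zscore` is a hypothetical name; only the column names `ID`, `OfferValue` and `Cost` come from the post.

```r
# Toy data; column names follow the post, the values are made up.
set.seed(1)
df <- data.frame(
  ID         = rep(1:3, each = 4),
  OfferValue = rnorm(12, mean = 50, sd = 10),
  Cost       = rnorm(12, mean = 5, sd = 2)
)

# Hypothetical helper: centre to mean 0 and scale to sd 1
zscore <- function(x) (x - mean(x)) / sd(x)

# z-score across the pooled data from all participants (as in the post)
df$OfferValue_z <- zscore(df$OfferValue)
df$Cost_z       <- zscore(df$Cost)

# Alternative, not used in the post: z-score within each participant
df$OfferValue_z_within <- ave(df$OfferValue, df$ID, FUN = zscore)
```

With pooled scaling, a regression weight refers to the same physical change in the regressor for every participant, which is probably what you want when comparing weights across people.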

Here is an example of the model (I’m new to brms/lme4 formulas…):

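```r
library(brms)

# Bernoulli (logistic) model with correlated random intercept and slopes per participant
fit <- brm(
  formula = ForageChoice ~ 1 + OfferValue + Cost + (1 + OfferValue + Cost | ID),
  data = df, family = 'bernoulli',
  warmup = 1000, iter = 2000, chains = 4, cores = 4
)
```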

The model appears to converge (all Rhat < 1.1; I have not yet checked other diagnostics).

But looking at the correlations, the regression weights are very correlated (some > 0.9). The regressors that I put in are not correlated. I think this might be because some participants basically don’t do the task well and then don’t take any of the factors that should affect their choices into account. So in a way, each person should have a factor that says how well they do the task (their own ‘noisiness’), independent from how much (relatively) they use each factor.
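The intuition above can be checked with a small simulation. This is only a sketch under assumed values: the population weights, the lognormal spread of the per-participant scaling and the 0.1 standard deviation of the independent variation are all made up for illustration, not estimated from the data.

```r
set.seed(42)
n_subj <- 380

# Assumed population-level weights on the logit scale (made up)
w_offer <- 1.0
w_cost  <- -0.8

# Hypothetical per-participant scaling: values near 0 mean near-random responding
noise <- rlnorm(n_subj, meanlog = 0, sdlog = 0.6)

# Each participant's slopes: shared scaling times weights that vary
# only a little, and independently, between participants
b_offer <- noise * (w_offer + rnorm(n_subj, sd = 0.1))
b_cost  <- noise * (w_cost  + rnorm(n_subj, sd = 0.1))

# Correlation between the per-participant slopes
r_slopes <- cor(b_offer, b_cost)
print(r_slopes)
```

Under these assumptions the slopes should come out strongly (here negatively) correlated, even though the regressors are uncorrelated and the relative use of each factor varies independently. The shared scaling alone is enough to produce the pattern.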

I don’t know whether there is any way to code this in a regression (in brms)?

It is not very clear which correlation you mean. What exactly did you correlate with what?

I don’t think this is straightforward in a Bernoulli model, but I think you might get better results by having the response be not the actual choice they made, but whether the participant chose the “optimal” or “better” or “correct” option (assuming this makes sense in your context). Then those who do well would be those whose predicted probability of the “optimal” choice is high, and you don’t need any concept of noisiness.
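As a rough illustration of that recoding, here is a base R sketch. The rule used to decide which option is “better” (foraging is better when `OfferValue` exceeds `Cost`) is purely an assumption for the example, as are the column name `ChoseBetter` and the data values; the real criterion depends on the task.

```r
# Toy data; column names follow the post, the values are made up.
df <- data.frame(
  ID           = c(1, 1, 2, 2),
  OfferValue   = c(0.9, -0.4, 1.2, -1.0),
  Cost         = c(0.1, 0.3, 1.5, -1.6),
  ForageChoice = c(1, 1, 0, 0)
)

# Hypothetical rule, an assumption for illustration only:
# foraging is the "better" option when the offer exceeds the cost.
better_is_forage <- df$OfferValue > df$Cost

# 1 if the participant picked the better option on that trial, 0 otherwise
df$ChoseBetter <- as.integer((df$ForageChoice == 1) == better_is_forage)
```

`ChoseBetter` could then replace `ForageChoice` as the response in the model formula, with the per-participant intercept capturing how often each person picks the better option.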

> the regression weights are very correlated
>
> It is not very clear which correlation you mean. What exactly did you correlate with what?

Sorry, what I meant was that in the output of brms, it tells me the group level effects (e.g. sd(OfferValue)), including the correlations between those, e.g. cor(OfferValue,Cost). And these correlations are very high.

> I don’t think this is straightforward in a Bernoulli model, but I think you might get better results by having the response be not the actual choice they made, but whether the participant chose the “optimal” or “better” or “correct” option (assuming this makes sense in your context). Then those who do well would be those whose predicted probability of the “optimal” choice is high, and you don’t need any concept of noisiness.

This is a clever suggestion! I’ll try this. It does change the ‘meaning’ of the regression weights, but maybe this is ok if it solves the other problem. I need to think this through a bit more.

Just to update this. I have now coded the model in Stan and use a decision model like this:

Forage choice ~ inv_logit(decision_noise * sum(myRegressors*regressionWeights))

[One regression weight is now fixed to 1 so that the model is not over-parameterised by the added ‘decision noise’ parameter.]
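To make the over-parameterisation concrete, here is a small base R sketch of the decision rule written above. It is not the Stan code itself; the function name `p_forage`, the regressor values and the parameter values are assumptions for illustration.

```r
# Choice probability under the decision-noise model described above.
# x: matrix of regressors (trials x factors); w: regression weights;
# decision_noise: per-participant scaling of the whole decision variable.
p_forage <- function(x, w, decision_noise) {
  plogis(decision_noise * as.vector(x %*% w))
}

set.seed(7)
x <- cbind(OfferValue = rnorm(5), Cost = rnorm(5))  # made-up z-scored regressors

# Two different parameter settings ...
p_a <- p_forage(x, w = c(1.0, -0.8), decision_noise = 2)
p_b <- p_forage(x, w = c(2.0, -1.6), decision_noise = 1)

# ... give the same predictions, so the data cannot tell them apart
same_predictions <- isTRUE(all.equal(p_a, p_b))
print(same_predictions)
```

Doubling all weights while halving `decision_noise` leaves every choice probability unchanged. Fixing one weight (here the first, as in `p_a`) pins down the scale, so `decision_noise` carries the overall sensitivity and the remaining weights are relative to the fixed one.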

Now neither the decision_noise nor any of the regression weights are nearly as highly correlated as before (all r<0.5 and many <0.3, when before they were >0.8).