Switch from categorical_logit to bernoulli_logit model

I have a reinforcement learning model with 2 choices per trial (coded as 1 or 2). On each trial I calculate expectancy values (one value for each of the two choices) to predict the choice.
I have implemented this (extending a model from the hBayesDM package) using categorical_logit(), to which I can pass my 2 expectancy values (stored in vector[2] ev;), and I also calculate the log likelihood and the predicted choice:

model {
  ...
  choice[i, t, j] ~ categorical_logit(tau[i, j] * ev);
  ...
} 

generated quantities {
  real y_pred[nS, nC, nT_max];
  real PE_pred[nS, nC, nT_max];
  real ev_pred[nS, nC, nT_max, 2];
  ...
  // compute log likelihood of current trial
  log_lik[i, j] += categorical_logit_lpmf(choice[i, t, j] | tau[i, j] * ev);
  
  // generate posterior prediction for current trial
  y_pred[i, j, t] = categorical_rng(softmax(tau[i, j] * ev));
  ...
}

However, one of the choices is always associated with a higher chance to obtain a reward (“correct choice”), thus, I thought to change this to a bernoulli_logit() model.
I recoded my choice to be 1 for correct choice and 0 for incorrect choice.
My issue, however, is which value to pass to bernoulli_logit(). I understand that categorical_logit() takes a vector with one real number for each of N categories (is this the correct interpretation of \beta \in \mathbb{R}^{N}?), which is transformed using softmax(\beta). So I can pass my vector[2] ev of expectancy values, multiplied by an inverse temperature value tau, to predict the choice.
The documentation (Stan function reference 12.2) states that bernoulli_logit() takes a “chance-of-success parameter” \alpha where \alpha \in \mathbb{R}. Thus, I understand that this is a single value which is not necessarily a probability (i.e. not necessarily \in [0, 1]). But I cannot just use the expectancy value for the “correct choice” (i.e. ev[1]), since I must set it in relation to the expectancy value of the “incorrect choice” (ev[2]).
So one option would be to calculate the vector[2] softmax_ev = softmax(tau*ev); and use real alpha = softmax_ev[1] as input for bernoulli_logit().
Another option would be to calculate the difference between the 2 expectancy values (real diff_ev = ev[1]-ev[2];) and then use this as input to bernoulli_logit(). With this option I’m not sure how to include my inverse temperature value tau…

I hope I could explain my question intelligibly! I am happy to provide more information if necessary!
Any help with how to solve this and/or any explanation of how best to model such a choice prediction is highly appreciated!
Thanks in advance!

softmax(tau*ev)[1] is equal to inv_logit(tau*ev[1] - tau*ev[2]).
You can just multiply diff_ev by the inverse temperature.

real diff_ev = ev[1] - ev[2];
choice ~ bernoulli_logit(tau*diff_ev);
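
To spell out why the two expressions agree, here is the short derivation, writing ev_1 and ev_2 for ev[1] and ev[2] and \tau for tau. It assumes only the standard definitions of softmax and the inverse logit; divide the numerator and denominator by e^{\tau ev_1} to get from the first fraction to the second:

```latex
\operatorname{softmax}(\tau \cdot ev)_1
  = \frac{e^{\tau\, ev_1}}{e^{\tau\, ev_1} + e^{\tau\, ev_2}}
  = \frac{1}{1 + e^{-\tau (ev_1 - ev_2)}}
  = \operatorname{logit}^{-1}\!\bigl(\tau\,(ev_1 - ev_2)\bigr)
```

So the probability of choosing option 1 depends on the two expectancy values only through their difference, scaled by tau.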

Thank you, @nhuurre!
So does that mean that softmax(tau*ev)[1] is also equivalent to tau*diff_ev?

It’s equal to inv_logit(tau*diff_ev). This code is the same as above:

vector[2] softmax_ev = softmax(tau*ev);
choice ~ bernoulli(softmax_ev[1]); // no logit here, softmax did it already

Ah, ok, thanks for clarification! That makes sense to me!

Dear @nhuurre,
I have one more question, just to be sure:
In the case where my categorical variable choice has only 2 levels, does

real diff_ev = ev[1] - ev[2];
choice ~ bernoulli_logit(tau*diff_ev);

model the same as

choice ~ categorical_logit(tau * ev);

with the only difference being that the choice variable is coded as 1 and 0 in the former and as 1 and 2 in the latter case?

Yes, that’s correct.
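
For completeness, the log likelihood and posterior predictions from the original generated quantities block would change in the same way, presumably to bernoulli_logit_lpmf and bernoulli_logit_rng. Below is a minimal, self-contained sketch rather than the full hBayesDM-style model. It makes several assumptions for illustration: the expectancy values are passed in as data instead of being updated trial by trial, there is a single subject and condition, N is a hypothetical name for the number of trials, the prior on tau is only a placeholder, and ev[n][1] belongs to the “correct” option. It uses the array[...] syntax of recent Stan versions; older versions, as in the snippets above, write int choice[N].

```stan
data {
  int<lower=1> N;                         // number of trials (hypothetical name)
  array[N] int<lower=0, upper=1> choice;  // 1 = correct choice, 0 = incorrect choice
  array[N] vector[2] ev;                  // assumed given here; ev[n][1] is the correct option
}
parameters {
  real<lower=0> tau;                      // inverse temperature
}
model {
  tau ~ normal(0, 5);                     // placeholder prior, not from the original model
  for (n in 1:N) {
    // log-odds of the correct choice: scaled difference of expectancy values
    choice[n] ~ bernoulli_logit(tau * (ev[n][1] - ev[n][2]));
  }
}
generated quantities {
  real log_lik = 0;
  array[N] int y_pred;
  for (n in 1:N) {
    real alpha = tau * (ev[n][1] - ev[n][2]);
    // log likelihood of the observed choice on this trial
    log_lik += bernoulli_logit_lpmf(choice[n] | alpha);
    // posterior prediction on the 0/1 coding (not 1/2 as with categorical_rng)
    y_pred[n] = bernoulli_logit_rng(alpha);
  }
}
```

Note that y_pred is now coded 0/1 like the recoded choice variable, so any downstream code that expects 1/2 would need adjusting.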

Great! Thank you very much!!