# Model with two linked outcome variables

Hey everyone!

I would appreciate your help in a somewhat conceptual question.
I am planning to model the data of a (psychological) reinforcement learning experiment.
Participants can always choose between three options (left, middle, or right), which either pay them nothing or a pre-specified reward (e.g. 5 points if you choose left and win, 10 points if you choose the middle and win, etc.).
Through their choices they gradually learn the winning probability of each option.

Now, usually one would model the “choice value” of each option and derive the choice probabilities from that.
However, in each round we also asked our participants to estimate the winning probability of each option.
That means as outcome variables we have the participants’ probability estimates as well as their choices.
I would like to incorporate both of these datapoints into a model, and this is what I had in mind:

Use reinforcement learning to model their reported probability estimates Q, estimating a learning rate \eta:

Q \sim RL(\eta)

In parallel I would fit their choices.
In the simplest case they would just calculate the expected value, EV, of each option by multiplying the winning value of the option with the corresponding estimated winning probability Q.
Using a softmax function I translate EV into actual choice probabilities (for now leaving out any sensitivity parameter for simplicity's sake).
So their choice would be modeled as

X \sim \text{softmax}(EV)
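To make this concrete (an assumption on my part, since I haven't committed to a specific RL rule yet): with a simple Rescorla–Wagner delta rule, after choosing option c_t and observing win feedback r_t \in \{0, 1\}, the update and choice rule would be

Q_{t+1, c_t} = Q_{t, c_t} + \eta \, (r_t - Q_{t, c_t})

EV_{t,k} = v_k \, Q_{t,k}

P(X_t = k) = \frac{\exp(EV_{t,k})}{\sum_{j=1}^{3} \exp(EV_{t,j})}

where v_k is the pre-specified winning value of option k.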

Now it would be easy to just plop two target += statements (one for the modeled probability estimates and one for the modeled choices) into my model block in Stan and call it a day.
In my head this also makes sense, since adding the log densities means multiplying the densities, which seems akin to asking “given these parameters, how probable are this win-probability estimate and this choice together?”.
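For illustration, here is a minimal Stan sketch of what I mean. The data names, the Rescorla–Wagner update, and the Beta observation model for the reported estimates are all placeholders I made up; they are not meant as the definitive model:

```stan
data {
  int<lower=1> T;                                  // number of trials
  array[T] int<lower=1, upper=3> choice;           // chosen option per trial
  array[T] int<lower=0, upper=1> win;              // 1 if the chosen option paid out
  vector<lower=0>[3] reward;                       // pre-specified winning value per option
  array[T] vector<lower=0, upper=1>[3] q_report;   // reported win-probability estimates
}
parameters {
  real<lower=0, upper=1> eta;   // learning rate
  real<lower=0> kappa;          // precision of the reports (assumed Beta model)
}
model {
  vector[3] Q = rep_vector(0.5, 3);   // initial win-probability belief

  eta ~ beta(1, 1);
  kappa ~ gamma(2, 0.1);

  for (t in 1:T) {
    vector[3] EV = reward .* Q;

    // target += #1: reported estimates, Beta with mean Q and precision kappa
    // (note: reports of exactly 0 or 1 would need special handling here)
    for (k in 1:3)
      target += beta_lpdf(q_report[t, k] | Q[k] * kappa, (1 - Q[k]) * kappa);

    // target += #2: choice via softmax over expected value
    target += categorical_logit_lpmf(choice[t] | EV);

    // Rescorla–Wagner update on the chosen option only
    Q[choice[t]] += eta * (win[t] - Q[choice[t]]);
  }
}
```

The Beta mean-precision parameterisation is just one way to turn Q into a likelihood for the reports; the main point is the two target += lines living in the same model block, with the same Q feeding both.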
However, I’m not sure whether there is something I’m not seeing.
The two outcome variables are clearly linked, since the predicted choices would be based on the predicted probability estimates.
Would I have to link the two outcomes via a mixed-outcome copula?
I must admit, I’d have to wrap my head around that topic first, which is why I’m asking.

I’m thankful for any comments or help!
Cheers

---
I’m afraid reinforcement learning is so general as to not mean a whole lot on its own. Is there a specific density you have in mind here?

Yes, that’s right, because

\log p(b) + \log p(a \mid b) = \log p(a, b).

The unknown unknowns are what get you :-).