How can I simulate data for my model?(Reinforcement Learning Model)

xhb120633 · November 15, 2019, 12:44pm

my intended stan code is like below:

data {
  int<lower=1,upper=39> nSubjects;
  int<lower=1,upper=100> nTrials;
  int<lower=1,upper=4> choice[nSubjects, nTrials];     
  real<lower=-1150, upper=100> reward[nSubjects, nTrials]; 
  matrix [nSubjects, nTrials] NumA[nSubjects, nTrials];
  matrix [nSubjects, nTrials] NumB[nSubjects, nTrials];
  matrix [nSubjects, nTrials] NumC[nSubjects, nTrials];
  matrix [nSubjects, nTrials] NumD[nSubjects, nTrials];
}

transformed data {
  vector[4] initV;  // initial values for V
  vector[4] initu;
  vector[1] initA;
  vector[1] initB;
  vector[1] initC;
  vector[1] initD;
  initV = rep_vector(0.0, 4);
  initu = rep_vector(0.0, 4);
  initA = rep_vector(0.0, 1);
  initB = rep_vector(0.0, 1);
  initC = rep_vector(0.0, 1);
  initD = rep_vector(0.0, 1);
}

parameters {
  // real<lower=0,upper=1>lr_mu;
  // real<lower=0,upper=1>tau_mu;
  
  // real<lower=0>lr_sd;
  // real<lower=0>tau_sd;
  
  real<lower=0,upper=1>   lr[nSubjects];
  real<lower=0,upper=2>   c[nSubjects];
  real<lower=0,upper=1>   A[nSubjects];
  real<lower=0,upper=5>   loss_aversion[nSubjects];
}

model {
  // lr_sd  ~  cauchy(0,1);
  /// tau_sd ~  cauchy(0,3);
  // lr     ~  normal(lr_mu,lr_sd);
  // tau    ~  normal(tau_mu,tau_sd);
 
}

generated quantities {
  matrix [nSubjects,nTrials]choice_sim[nSubjects,nTrials];
  matrix [nSubjects,nTrials]reward_sim[nSubjects,nTrials];
  for (s in 1:nSubjects) {
    vector[4] v; 
    vector[4] U;
    real pe;
    real theta;
    v     = initV;
    U     = initu;
    
    for (t in 1:nTrials) {
      // complete the foo-loop here
      theta = pow(3,c[s])-1;
      choice[s,t] = softmax(theta*v);
      
      if(choice[s,t] == 1){
        reward[s,t] = IGT_netincome[1,NumA];
        NumA = NumA+1;
      }else if(choice[s,t] == 2){
        reward[s,t] = IGT_netincome[2,NumB];
        NumB = NumB+1;
      }else if(choice[s,t] == 3){
        reward[s,t] = IGT_netincome[3,NumC];
        NumC = NumC+1;
      }else{
        reward[s,t] = IGT_netincome[4,NumD];
        NumD = NumD+1;
      }
         
      if (reward[s,t]>=0) {
        U[choice[s,t]] = pow(reward[s,t],A[s]);
        }else {
        U[choice[s,t]] = -loss_aversion[s]*(pow(-reward[s,t],A[s]));}
      v[choice[s,t]]=v[choice[s,t]]+lr[s]*pe;
      pe = U[choice[s,t]]-v[choice[s,t]];
        
  }
 }
}

But I found it not applicable. Though it’s a model based on Rescolar-wagnar reinforcement learning model, it’s under the context of Iowa Gambling Task. In brief, I would like to produce choice behaviour based on RL model, however the choice should be interger from 1 to 4 because they can only select among the four given decks. The reward is hidden beneath the card and you will get a reward after picking a deck. Their relationship is stored in a matrix which means certain choices produce certain reward.
And as you know, the reward may be high or low, positive or negative, and it impacts the next trial of choice probability. The probability is described by softmax rules.

So How can I simulate such a dataset because I am in a mess between model fitting and parameter estimation and model simulation. If you have specifc suggestions on code above, I would really appreciate it! Thank you in advcance!

charlesm93 · November 15, 2019, 5:28pm

Can you run your model using “algorithm=fixed” and then extract the generated quantities for one iteration?

What you do you mean your model is “not applicable”?

Guido_Biele · November 15, 2019, 7:24pm

It is not 100% clear for me what this model is doing.
For this to be a reinforcement learning model, the model sections needs code to update reward expectations and to make a choice given the reward expectations. The model has neither.

Even if your goal is to ultimately fit IGT data, you could start by writing a model/simulating data for a simpler situation with only two options.

Also note that other’s have fit IGT data with Stan before. You could search for that and take it as a starting point. (Though these models are oftentimes more complex than a simple Rescorla Wagner model.

msspektor · November 16, 2019, 9:46pm

You should take a look at the IGT models of hBayesDM: http://ccs-lab.github.io/hBayesDM/reference/index.html

You can use them as starting points for working hierarchical models of the IGT. You should be able to get your intended model by simplifying them.

xhb120633 · November 17, 2019, 3:14am

unfortunately, on Github, the codes relevant to IGT are not availabe due to the 404 not found error.

msspektor · November 17, 2019, 9:05am

Alternatively, download the package in R and then look through the package contents.

frank_hezemans · November 17, 2019, 11:49pm

Have you checked out the hBayesDM core files directory on GitHub? All Stan code is in the stan_files folder, and documentation is in the models folder (e.g. this IGT model is showing up fine for me).

Guido_Biele · November 18, 2019, 2:35pm

OK, my first answer was a bit fast, because you actually have lots of what you need in the generated quantities.

What you seem to be missing is to update the thetas. You define the prediction error here:

Bu you are not updating theta after that. You need something like

theta[choice[s,t]] = theta[choice[s,t]] + alpha * pe;

where alpha is the learning rate, which should have a upper and lower bound of 0 and 1, and which can use the beta distribution as a prior. (I can’t see the learning rate in your parameter block, but maybe that’s my unfamiliarity with your code.)

Lastly, It looks like you are trying to implement something along the lines of prospect theory for the utility function. I would first try a simpler model without a complex utility function, and move only on to something more complicated after you successfully recovered parameters of the simpler model.

Topic		Replies	Views
Simulating fake data for regression in Stan Modeling techniques , specification , performance	2	1330	October 9, 2021
Asking for advice on making a more efficient RL code Modeling	4	515	July 17, 2021
Stan code for fitting simple RL model Modeling cognitive-science	2	1557	December 12, 2018
Problem Simulating Data with Generated Quantities (Dimension mismatch in assignment; type = real; right-hand side type = real[ ]) RStan rstan , techniques , specification	3	860	April 29, 2021
Bugs from A dynamic reinforcement learning model Modeling cognitive-science	7	658	January 6, 2020

How can I simulate data for my model?(Reinforcement Learning Model)

Related topics