Valid contrasts of 2x2 categorical variables in linear regression of reinforcement learning parameters

lsands · January 20, 2023, 4:37pm

Hi all,

I am trying to figure out the best way to code and perform inference on a linear regression model that includes four categorical variables: two groups (i.e., disease groups) and two conditions (i.e., on and off medication). More specifically, I am working with a reinforcement learning model where each RL model parameter is the dependent variable of this linear regression. I’m attempting to model the effect of each group and condition on the RL model parameters without splitting the data out into 4 group/condition combination-specific subsets.

There are 60 datasets corresponding to 30 participants, each of whom completed a task twice (once per medication condition); there are 18 participants in group 1 and 12 participants in group 2. My current draft of this model uses dummy coding and I’m estimating varying slopes for each group/condition combination. The columns of the variable ‘ICD_Med_idx’ correspond to (i) group1/on Meds (slope captured by beta1), (ii) group1/off Meds (beta2), (iii) group2/on Meds (beta3), and (iv) group2/off Meds (beta4).

My main questions are:

Is this a valid way of modeling the influence of each group/condition on the RL model parameters?

Embedding the linear regressions within each RL model parameter’s non-centered parameterization is stretching the limits of my understanding of the best practices for coding this model.

My intuition about performing inference on the beta coefficients (separately for each RL model parameter) would be:

contrast (beta1-beta2) to get the effect of medication state within group1
contrast (beta3-beta4) to get the effect of medication state within group2
contrast (beta1-beta3) to get the effect of diagnosis while on medication
contrast (beta3-beta4) to get the effect of diagnosis while off medication
Is this intuition on-target? or is there a more proper way to perform these contrasts?

What would be the best way of contrasting the beta coefficients to get the total effect of medication (i.e., ‘ignoring’ diagnosis) or of diagnosis (i.e., ‘ignoring’ medication state)?

Here, my intuition about isolating the effect of medication state across groups (for each RL parameter separately) would be (beta1+beta2)-(beta3+beta4), and isolating the effect of group diagnosis across medication states would be (beta1+beta3)-(beta2+beta4).

Is there a more proper way to code the categorical variables to model their interactions?

Fitting this model as-is gives fine chain convergence/effective sample size of all model parameters (via 3 chains with 12k samples each (2k warmup)). I’d greatly appreciate any clarification on something I’ve missed/gotten wrong in the model specification or the inferential intuition, or confirmation that this modeling approach makes sense.

data {
  int<lower=1> N;
  int<lower=1> T_max;
  int<lower=1, upper=T_max> T_subjs[N];
  int<lower=-1, upper=8> option1[N, T_max];
  int<lower=-1, upper=8> option2[N, T_max];
  int<lower=-1, upper=2> choice[N, T_max];
  real outcome[N, T_max];

  int<lower=0, upper=1> ICD_Med_idx[N,4];
}

transformed data {
  row_vector[3] initV;
  initV = rep_row_vector(0.0, 3);
}

parameters {

  vector[3] mu_pr;
  vector<lower=0>[3] sigma_pr;

  vector[N] learnrate_pr;
  vector[N] discount_pr;
  vector[N] tau_pr;

  vector[3] beta1;
  vector[3] beta2;
  vector[3] beta3;
  vector[3] beta4;
}

transformed parameters {
  // subject-level parameters
  vector<lower=0, upper=1>[N] learnrate;
  vector<lower=0, upper=1>[N] discount;
  vector<lower=0, upper=20>[N] tau;
  
  for (i in 1:N) {
    learnrate[i] = Phi_approx(mu_pr[1] + beta1[1]*ICD_Med_idx[i,1] + beta2[1]*ICD_Med_idx[i,2] + beta3[1]*ICD_Med_idx[i,3] + beta4[1]*ICD_Med_idx[i,4] + sigma_pr[1]*learnrate_pr[i]);
    discount[i]  = Phi_approx(mu_pr[2] + beta1[2]*ICD_Med_idx[i,1] + beta2[2]*ICD_Med_idx[i,2] + beta3[2]*ICD_Med_idx[i,3] + beta4[2]*ICD_Med_idx[i,4] + sigma_pr[2]*discount_pr[i]);
    tau[i]       = Phi_approx(mu_pr[3] + beta1[3]*ICD_Med_idx[i,1] + beta2[3]*ICD_Med_idx[i,2] + beta3[3]*ICD_Med_idx[i,4] + beta4[3]*ICD_Med_idx[i,4] + sigma_pr[3]*tau_pr[i])*20;
  }
}

model {

  mu_pr ~ normal(0,1);
  sigma_pr ~ normal(0,1);

  learnrate_pr ~ normal(0,1);
  discount_pr  ~ normal(0,1);
  tau_pr       ~ normal(0,1);

  to_vector(beta1) ~ std_normal();
  to_vector(beta2) ~ std_normal();
  to_vector(beta3) ~ std_normal();
  to_vector(beta4) ~ std_normal();

  // subject loop and trial loop
  for (i in 1:N) {
    matrix[6,3] ev;
    int decision;
    real PE;
    vector[2] option_values = [ 0, 0 ]'; 			

    for (idx in 1:6) { ev[idx] = initV; }

    for (t in 1:T_subjs[i]) {

      decision = (choice[i, t] > 1) ? option2[i, t] : option1[i, t];

      option_values = [ ev[option1[i, t], 1] , ev[option2[i, t], 1] ]';
   
      // compute action probabilities
      target += categorical_lpmf(choice[i, t] | softmax( option_values*tau[i] ));

      for (e in 1:3) {
	if (e < 3) {
	  PE = discount[i] * ev[decision, e+1] - ev[decision, e];
          ev[decision, e] += learnrate[i] * PE;
	} else {
	  PE = outcome[i, t] - ev[decision, e];
          ev[decision, e] += learnrate[i] * PE;
	}
      }
    }
  }
}

Vanessa_Brown · January 20, 2023, 5:13pm

To think through these models, can be helpful to separately sketch out what you’re trying to do with the model, how you would set up the linear regression, and then combine them.

For each parameter you have a group mean and variance. If you’re using non-centered parameterization, then you have the mean, variance, and the N-length vector of individual variations.
for the regression, you have a within-subjects dummy coded variable (on/off meds) and a between-subjects dummy coded variable (group 1/2), plus the interaction of these two variables. The assumption is that each of these variables and their interaction changes the mean.

Taking these together, to form the non-centered parameter for each person you would add: the group mean for this parameter (mu_pr[param] in your model), beta1 (representing the effects of meds) times the dummy coded variable for med status, beta2 (representing group membership) times that dummy coded variable, beta3 (interaction) times both dummy coded variables, and the overall variance times the individual variation parameter.

Also, you may be able to run fewer than 10K samples and to have a faster-running model with some model tweaks. Right now you have no priors on your variances, and you could drop the phi_approx*20 transformation for tau and instead constrain the mean for that parameter to be >0.

lsands · January 20, 2023, 6:30pm

Thank you for the response, Vanessa, that all makes sense to me

Could you clarify what you mean by not having priors over the variances for the non-centered parameterization? Do you mean something like more informative priors than standard normals for the group-level ‘sigma_pr’ variables that are multiplied by the individual variations (and more informative priors on the individual variations?)?

Vanessa_Brown · January 20, 2023, 6:50pm

ah sorry, I missed the sigma_pr, nevermind!

Topic		Replies	Views
Extend hierarchical reinforcement learning model for multiple conditions Modeling hierarchical-model , reparametrization	8	1610	April 21, 2021
Two-armed bandit hierarchical reinforcement learning model - simplify to random intercept only Modeling rstan , specification , mixed-model , reinforcement-learning	3	474	February 12, 2024
Interpret parameters after inv_logit transform in hierarchical non-centered parameterization Modeling interpret-results	1	568	May 11, 2021
Repeated measure hierarchical reinforcement learning with groups (2 x 2 x 2 design) Modeling rstan , fitting-issues , hierarchical-model , model-comparison , cognitive-science	7	2015	July 28, 2020
The 'right' way to convert a specific group into a population-level parameter Modeling brms	12	619	June 27, 2023

Valid contrasts of 2x2 categorical variables in linear regression of reinforcement learning parameters

Related topics