Hi all,
I have a basic question. I was playing with the [RA prospect theory model](https://github.com/CCS-Lab/hBayesDM/blob/master/commons/stan_files/ra_prospect.stan) (code and data follow) and I'm getting a correlation between my fitted parameters (see the attached figure). I have little to no experience with this, so I have several questions and would be very grateful for advice and explanations:
- I have been reading other posts (e.g. "Infinite location parameter warning and correlated parameters", "Reducing correlation between parameters of the posterior distribution", …), but I think I need a bit more explanation and help.
What I am looking at and plotting here are the correlations between the posterior means of the estimated parameters, computed across subjects. A naive question: is this the same correlation as the one between the parameters themselves (i.e. if I took the posterior distribution of the parameters for each subject and looked at the correlation within a subject)? Why/why not? Where would the difference come from?
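To make the first question concrete, this is roughly how I would compute the two quantities (a sketch, using the fit_ra_cleared object from the pySTAN code further down):

# Sketch: two different "correlations between rho and tau" from the same fit.
import numpy as np
import scipy.stats as stats

post = fit_ra_cleared.extract(permuted=True)   # arrays of shape (n_draws, N)
rho_draws, tau_draws = post['rho'], post['tau']

# (a) what I am plotting: the correlation of the per-subject posterior means,
#     computed across subjects
r_means, p_means = stats.pearsonr(rho_draws.mean(axis=0), tau_draws.mean(axis=0))

# (b) the "within a subject" version: for each subject, correlate the posterior
#     draws of rho[i] with the posterior draws of tau[i]
r_within = [stats.pearsonr(rho_draws[:, i], tau_draws[:, i])[0]
            for i in range(rho_draws.shape[1])]

print(r_means, p_means)
print(np.mean(r_within))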
My second question would be: what does it mean if the correlation becomes more significant when I include more samples? See the second attached figure.
By the way, I'm seeing that e.g. \rho and \tau have a nice and clear exponential-looking dependency; see the points below. The plots are just illustrative.
- Several people have proposed three things:
A) Reparametrise the equation so that the correlation is removed (some mathematical transformation, as in the first post linked above). This I am unable to do, since the equation I'm fitting is fairly complex:
sv = \text{gain}^\rho - \lambda \cdot \text{loss}^\rho,
followed by:
y = \frac{e^{\tau(sv-c^\rho)}}{1+e^{\tau(sv-c^\rho)}},
where the parameters are \tau (inverse temperature), \lambda (loss-aversion coefficient) and \rho (risk-aversion coefficient); c is a given constant, and the gains and losses are the inputs.
B) Try to estimate the correlation within Stan and then account for it. Here I'd have two questions: how do I do that (in Stan/pySTAN), and once I have the estimate, how do I use it – what will it tell me and how should I interpret it? (A rough sketch of what I imagine doing with such an estimate follows after this list.)
C) Try to use different priors, e.g. a multivariate normal/Gaussian. Here I'd also have two questions: how do I select such priors, and how should I construct them? And then, how and why does it work? If some correlation between the parameters is left after this, does that mean the parameters really are correlated, or could it still mean that my model is wrong?
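Regarding the second half of B ("if I have it, how to use it"): I assume that in such an extended model the group-level correlation would show up as an extra parameter, say a 3x3 correlation matrix Omega over (\rho, \lambda, \tau). The name Omega, its shape, and the fit object fit_mv below are purely hypothetical, not part of the model posted further down; this is just how I imagine inspecting such an estimate:

# Hypothetical sketch: "Omega" and "fit_mv" are my assumptions about an
# extended model, not parameters of the model posted below.
import numpy as np

post_mv = fit_mv.extract(permuted=True)
omega = post_mv['Omega']                      # shape (n_draws, 3, 3)

# posterior summary of the rho-tau correlation (entry [0, 2] of the matrix)
rho_tau = omega[:, 0, 2]
print(rho_tau.mean(), np.percentile(rho_tau, [2.5, 97.5]))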
Note: I'm checking the convergence diagnostics (n_eff, Rhat, divergences, tree depth) and they all look fine.
The Stan code I'm using is the following (taken from the page linked above):
data {
  int<lower=1> N;
  int<lower=1> T;
  int<lower=1, upper=T> Tsubj[N];
  real<lower=0> gain[N, T];
  real<lower=0> loss[N, T]; // absolute loss amount
  real cert[N, T];
  int<lower=-1, upper=1> gamble[N, T];
}

transformed data {
}

parameters {
  vector[3] mu_pr;
  vector<lower=0>[3] sigma;
  vector[N] rho_pr;
  vector[N] lambda_pr;
  vector[N] tau_pr;
}

transformed parameters {
  vector<lower=0, upper=2>[N] rho;
  vector<lower=0, upper=5>[N] lambda;
  vector<lower=0, upper=30>[N] tau;

  for (i in 1:N) {
    rho[i]    = Phi_approx(mu_pr[1] + sigma[1] * rho_pr[i]) * 2;
    lambda[i] = Phi_approx(mu_pr[2] + sigma[2] * lambda_pr[i]) * 5;
    tau[i]    = Phi_approx(mu_pr[3] + sigma[3] * tau_pr[i]) * 30;
  }
}

model {
  // ra_prospect: Original model in Sokol-Hessner et al 2009 PNAS
  // hyper parameters
  mu_pr ~ normal(0, 1.0);
  sigma ~ normal(0, 0.2);

  // individual parameters w/ Matt trick
  rho_pr ~ normal(0, 1.0);
  lambda_pr ~ normal(0, 1.0);
  tau_pr ~ normal(0, 1.0);

  for (i in 1:N) {
    for (t in 1:Tsubj[i]) {
      real evSafe;   // evSafe, evGamble, pGamble can be a scalar to save memory and increase speed.
      real evGamble; // they are left as arrays as an example for RL models.
      real pGamble;

      // loss[i, t] = absolute amount of loss (pre-converted in R)
      evSafe   = pow(cert[i, t], rho[i]);
      evGamble = 0.5 * (pow(gain[i, t], rho[i]) - lambda[i] * pow(loss[i, t], rho[i]));
      pGamble  = inv_logit(tau[i] * (evGamble - evSafe));
      gamble[i, t] ~ bernoulli(pGamble);
    }
  }
}

generated quantities {
  real<lower=0, upper=2> mu_rho;
  real<lower=0, upper=5> mu_lambda;
  real<lower=0, upper=30> mu_tau;

  real log_lik[N];

  // For posterior predictive check
  real y_pred[N, T];

  // Set all posterior predictions to 0 (avoids NULL values)
  for (i in 1:N) {
    for (t in 1:T) {
      y_pred[i, t] = -1;
    }
  }

  mu_rho    = Phi_approx(mu_pr[1]) * 2;
  mu_lambda = Phi_approx(mu_pr[2]) * 5;
  mu_tau    = Phi_approx(mu_pr[3]) * 30;

  { // local section, this saves time and space
    for (i in 1:N) {
      log_lik[i] = 0;
      for (t in 1:Tsubj[i]) {
        real evSafe;   // evSafe, evGamble, pGamble can be a scalar to save memory and increase speed.
        real evGamble; // they are left as arrays as an example for RL models.
        real pGamble;

        evSafe   = pow(cert[i, t], rho[i]);
        evGamble = 0.5 * (pow(gain[i, t], rho[i]) - lambda[i] * pow(fabs(loss[i, t]), rho[i]));
        pGamble  = inv_logit(tau[i] * (evGamble - evSafe));

        log_lik[i] += bernoulli_lpmf(gamble[i, t] | pGamble);

        // generate posterior prediction for current trial
        y_pred[i, t] = bernoulli_rng(pGamble);
      }
    }
  }
}
The code I'm using for the fitting in Python is the following:
# define conditions and paths
import os, sys
import pystan
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import time
from datetime import timedelta
import scipy.stats as stats

data_path = '/home/user/LossAversion/'
model_folder = "/home/user/LossAversion/stan_files/"

# prepare model
model = 'ra_prospect.stan'
model_path = os.path.join(model_folder, model)
summary_dict = None
start = time.time()

# compile the model
start_compile = time.time()
stan_model_RA = pystan.StanModel(file=model_path)
end_compile = time.time()
print('Duration of compilation: ', timedelta(seconds=end_compile - start_compile))

# run the model (RA_prep_test is the data dictionary built from the attached
# data file; its keys match the data block of the Stan model above)
start_model = time.time()
fit_ra_cleared = stan_model_RA.sampling(data=RA_prep_test,
                                        chains=4, iter=2000, warmup=1000, thin=1,
                                        init='random', verbose=True,
                                        control={"adapt_delta": 0.95, "stepsize": 1, "max_treedepth": 10},
                                        n_jobs=-1)
end_model = time.time()
print('Duration of the model run is: ', timedelta(seconds=end_model - start_model))

# create an ordered dictionary with the fit summary
summary_dict = fit_ra_cleared.summary()

# create a data frame out of the output
df_cleared = pd.DataFrame(summary_dict['summary'],
                          columns=summary_dict['summary_colnames'],
                          index=summary_dict['summary_rownames'])

end = time.time()
print('Duration of the entire process is: ', timedelta(seconds=end - start))
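For completeness, RA_prep_test is just a plain dictionary whose keys and shapes match the data block of the Stan model. I build it roughly along these lines (a sketch: the real preprocessing has more cleaning, and the column names 'subjID', 'gain', 'loss', 'cert', 'gamble' are how I refer to them here; the attached file may label them slightly differently):

# Sketch of building the data dictionary (keys must match the Stan data block).
import numpy as np
import pandas as pd

raw = pd.read_csv('data_LA_exp1_STAN_correlation.txt', sep='\t')
subjects = raw['subjID'].unique()
N = len(subjects)
Tsubj = np.array([(raw['subjID'] == s).sum() for s in subjects])
T = int(Tsubj.max())

def padded(col):
    # pad each subject's trials with zeros up to T; only the first Tsubj[i] entries are used
    out = np.zeros((N, T))
    for i, s in enumerate(subjects):
        vals = raw.loc[raw['subjID'] == s, col].to_numpy()
        out[i, :len(vals)] = vals
    return out

RA_prep_test = {
    'N': N, 'T': T, 'Tsubj': Tsubj.astype(int),
    'gain': padded('gain'),
    'loss': padded('loss'),              # absolute loss amounts
    'cert': padded('cert'),
    'gamble': padded('gamble').astype(int),
}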
The subsequent analysis just reshapes the results, creates a new DataFrame with the parameters as columns and the subjects as rows, and plots it with seaborn's PairGrid. The statistics are done with scipy.stats, using the Pearson and Spearman correlations. I can post that code as well if needed; I'm not doing so now because the post is already quite heavy.
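In short, though, it is along these lines (a sketch, assuming the fit_ra_cleared object from the code above; the parameter names are those of the model):

# Sketch of the analysis: per-subject posterior means -> DataFrame -> pair plot
# and Pearson/Spearman correlations.
import pandas as pd
import seaborn as sns
import scipy.stats as stats

post = fit_ra_cleared.extract(permuted=True)
params = pd.DataFrame({p: post[p].mean(axis=0) for p in ('rho', 'lambda', 'tau')})

g = sns.PairGrid(params)
g.map_diag(sns.histplot)
g.map_offdiag(sns.scatterplot)

for a, b in (('rho', 'lambda'), ('rho', 'tau'), ('lambda', 'tau')):
    r_p, p_p = stats.pearsonr(params[a], params[b])
    r_s, p_s = stats.spearmanr(params[a], params[b])
    print(a, b, 'Pearson:', r_p, p_p, 'Spearman:', r_s, p_s)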
The data are accessible here, data_LA_exp1_STAN_correlation.txt (160.5 KB), and contain a subset of my data.
Thank you all in advance.