Covariance matrix priors

levis · December 3, 2023, 3:17pm

i am trying to study whether the age of participants covaries with other hyperparameters with the following model below:

I have a prior on L ~eta(23) and then separately set L(3,2)~beta(9,1) to see how a strong correlation between age and b1 effects the other parameters.

But for some reason, this had no effect on the model fits (parameters fits were the same with or without setting L).

any insight into what im missing?

thanks in advance

data {

    // Metadata
    int<lower=1>  N; 
    int<lower=1> NJ; // number of total subjects; 
// Number of total observations
    array[N] int<lower=1>  J;               // Subject-indicator per observation
    array[N] int<lower=1>  K;               // Bandit-indicator per observation
    vector[NJ] age_zscored;  // Z-scored age
    vector[NJ] age_squared_zscored;  // Z-scored age

    // Data
    array[N] int<lower=0, upper=1>  Y;      // Response (risky = 1, certain = 0)
    array[N] int<lower=0, upper=1>  R;      // Outcome (reward = 1, nonreward = 0)

}
transformed data {

   // int  NJ = max(J);                       // Number of total subjects
    int  NK = max(K);                       // Number of total bandits
}
parameters {

    // Participant parameters
    vector[2]     theta_mu; //subject means ; 
    matrix[2,NJ]  theta_pr_minus2;  
    vector<lower=0>[2] sigma;  
    cholesky_factor_corr[3] L;  


}
transformed parameters {
    matrix[3,NJ]  theta_pr;
    vector<lower=0>[3] sigma_pr;
    sigma_pr[3]=1;
    sigma_pr[1:2]=sigma;
    theta_pr[1:2,]=theta_pr_minus2;
    theta_pr[3,] = to_row_vector(age_zscored);
    vector[NJ]  b1;                         // Inverse temperature
    vector[NJ]  a1;                         // Learning rate (pos)


    // Construction block
    {
    
    // Rotate random effects
    matrix[NJ,3] theta = transpose(diag_pre_multiply(sigma_pr, L) * theta_pr);

    
    // Construct random effects
    b1 = (theta_mu[1] + theta[,1]) * 10;
    a1 = Phi_approx(theta_mu[2] + theta[,2]);



    }

}
model {

    // Initialize Q-values

    array[NJ, NK] real Q = rep_array(0.5, NJ, NK);

    // Construct linear predictor
    vector[N] mu;
    for (n in 1:N) {
    
        // Compute (scaled) difference in expected values
        mu[n] = b1[J[n]] * (Q[J[n],K[n]] - 0.5);
        
        // Compute prediction error
        real delta = R[n] - Q[J[n],K[n]];
        
        // Assign learning rate
        real eta = a1[J[n]];
        
        // Update state-action values
        Q[J[n],K[n]] += eta * delta;
        
    }
    
    // Likelihood
    target += bernoulli_logit_lpmf(Y | mu); 
    
    // Priors
    target += normal_lpdf(theta_mu | 0, 2);
   // target += std_normal_lpdf(to_vector(theta_pr_minus2));
    target += std_normal_lpdf(to_vector(theta_pr_minus2));

    target += student_t_lpdf(sigma | 3, 0, 1);
   // age_zscored ~ normal(predicted_age,.05);
  //  age_squared_zscored ~ normal(predicted_age_squared,.05);
     target += lkj_corr_cholesky_lpdf(L | 23);
  //   L[2,1] ~ beta(9,1);
     L[3,2] ~ beta(9,1);

          // prior correlation


}

generated quantities {
  
  matrix[3,3] Omega = multiply_lower_tri_self_transpose(L);
}

Corey.Plate · December 4, 2023, 5:29pm

Statements in the model block are executed sequentially and you have the sampling statements for L after the lkj_corr_cholesky_lpdf() function that calls them.

jsocolar · December 4, 2023, 5:39pm

@Corey.Plate the order of the below statements makes no difference in this Stan program. L[3,2] ~ beta(9,1); is just syntactic sugar for target += beta_lpdf(L[3,2] | 9, 1). Sampling statements in Stan are not like sampling statements in BUGS/JAGS, where the variables are not populated with values prior to being sampled.

Corey.Plate · December 4, 2023, 5:47pm

Sorry about that, I should have linked to my source. I did not realize that doesn’t apply to sampling statements

levis · December 4, 2023, 6:00pm

Thanks so much Corey - you’re saying just changing the order so that L(3,1)~ beta(9,1) is sampled prior to lkj_corr_cholesky_lpdf()
will allow both to be considered?

best,
Levi

jsocolar · December 4, 2023, 6:08pm

It’s true that the statements are executed in order, but since addition is commutative it does not matter the order in which we increment the target.

This will not affect inference

Topic		Replies	Views
Underestimating correlation coefficients with LKJ prior Modeling	24	6653	September 15, 2020
Specifying Correlation in Stan Modeling	10	1063	October 9, 2021
Remove/estimate correlation of parameters (or get to know if correlation is real) Modeling	34	2888	April 16, 2020
Including data in covariance matrix Modeling specification	1	227	May 28, 2024
Learning a correlation matrix Modeling	8	3288	December 13, 2017

Covariance matrix priors

Related topics