Issue With Transformed Parameter


Hi everyone, I have an issue in modelling a transformed parameter correctly. The model I am currently running is the following:

data{
int<lower=0> K;
vector[K] y;
matrix[J,K] R;
}
parameters{
vector[K] theta;
vector[J] lambda;
}
transformed parameters{
vector[J] tau = R*theta
}
model {
  y ~ normal(theta, 1);
tau ~ normal(lambda, 1)
lambda ~ normal(0, 1);
} 

What I am really interested in is the vector lambda, however, in order to obtain it, I have to transform the parameter theta into tau, using a simple linear transformation. Unfortunately, the matrix R'R is not invertible, so I cannot express theta as a function of tau. Moreover, the following code is not really working, because both lambda and tau are just vectors of zeros, as it doesn’t correctly update tau from the values of theta. Do you have any idea on how I could fix the issue?

What is the motivation for this model? When you say the code isn’t working, what do you mean? The variables lambda and theta are parameters so they should have different values every draw.

Instead of writing a prior for tau, could you put a prior on theta? Then you don’t run afoul of impossible to adjust changes of variables like mapping a K-vector to a J-vector (the Jacobian isn’t square, so we can’t take a determinant).

If J \leq K, you can pad out the transform and everything should be OK as long as you assign proper priors. For example, if we have a two-vector \alpha = [\alpha_1 \quad \alpha_2] and we map \alpha to [\alpha_1^2 \cdot \alpha_2] then we are going from \mathbb{R}^2 to \mathbb{R}. But if we instead map \alpha to the 2-vector [\alpha_1^2 \cdot \alpha_2 \quad \alpha_2], then we can calculate the Jacobian of the absolute value of the determinant. Just be careful to give both \alpha_1^2 \cdot \alpha_2 and \alpha_2 a prior if you want a proper prior induced on \alpha.

If you’re not interested in tau, you can move the definition down to a local variable in the model block. Then it won’t get printed out every iteration.

1 Like

Hi Bob, thank you for your answer. We are working for a meta-analysis with a dataset of several RCTs where treatment comprised a combination of several aspects or features (e.g. length of treatment, frequency of treatment, type of treatment, et cetera), the objective of the model is to understand whether certain features of treatment combinations are more effective than others.

The first layer models treatment combinations averages as:

y \sim N(\theta, 1)

Where \theta represents average combination treatment effects, which aren’t of direct interest. The transformation \tau = R \theta transforms the vector \theta into average feature treatment effects, by taking weighted averages across elements of \theta where the certain feature is present in the combination. Now, this is done across several RCTs and the parameter \lambda represents the common mean of all the \tau.

So what really interests us is \lambda, but we need to go through the vectors \tau to obtain it. As for the relationship between J and K, unfortunately it varies from RCT to RCT, sometimes one is greater than the other, sometimes the opposite and sometimes they are equal. I am going to post the complete code below for clarity. The issue is that the model, as I have currently coded it, does not update the values of \tau and \lambda, from the values of \theta, and so they are just zeros. I was wondering if it’s just poor coding on my part, or this procedure cannot be done on Stan (or in general).

data {
  int<lower=0> C;                 //number of RCTs
  int<lower=0> K1;                 //number of treatment combinations in RCT 1
  int<lower=0> K2;                 //number of treatment combinations in RCT 2
  int<lower=0> K3;                 //number of treatment combinations in RCT 3
  int<lower=0> K4;                 //number of treatment combinations in RCT 4
  int<lower=0> K5;                 //number of treatment combinations in RCT 5
  int<lower=0> K6;                 //number of treatment combinations in RCT 6
  int<lower=0> K7;                 //number of treatment combinations in RCT 7
  int<lower=0> K8;                 //number of treatment combinations in RCT 8
  int<lower=0> K9;                 //number of treatment combinations in RCT 9                           
  vector[K1] y1;                   //treatment combinations in RCT 1
  vector[K2] y2;                   //treatment combinations in RCT 2
  vector[K3] y3;                   //treatment combinations in RCT 3
  vector[K4] y4;                   //treatment combinations in RCT 4
  vector[K5] y5;                   //treatment combinations in RCT 5
  vector[K6] y6;                   //treatment combinations in RCT 6
  vector[K7] y7;                   //treatment combinations in RCT 7
  vector[K8] y8;                   //treatment combinations in RCT 8
  vector[K9] y9;                   //treatment combinations in RCT 9
  int<lower=0> J1;                 //number of treatment features in RCT 1
  int<lower=0> J2;                 //number of treatment features in RCT 2
  int<lower=0> J3;                 //number of treatment features in RCT 3
  int<lower=0> J4;                 //number of treatment features in RCT 4
  int<lower=0> J5;                 //number of treatment features in RCT 5
  int<lower=0> J6;                 //number of treatment features in RCT 6
  int<lower=0> J7;                 //number of treatment features in RCT 7
  int<lower=0> J8;                 //number of treatment features in RCT 8
  int<lower=0> J9;                 //number of treatment features in RCT 9
  int<lower=0> Jmax;
  int feature1[J1];               //feature indexes in RCT 1
  int feature2[J2];               //feature indexes in RCT 2             
  int feature3[J3];               //feature indexes in RCT 3
  int feature4[J4];               //feature indexes in RCT 4
  int feature5[J5];               //feature indexes in RCT 5
  int feature6[J6];               //feature indexes in RCT 6
  int feature7[J7];               //feature indexes in RCT 7
  int feature8[J8];               //feature indexes in RCT 8
  int feature9[J9];               //feature indexes in RCT 9
  matrix[J1, K1] R1;               //transformation matrix in RCT 1
  matrix[J2, K2] R2;               //transformation matrix in RCT 2
  matrix[J3, K3] R3;               //transformation matrix in RCT 3
  matrix[J4, K4] R4;               //transformation matrix in RCT 4
  matrix[J5, K5] R5;               //transformation matrix in RCT 5
  matrix[J6, K6] R6;               //transformation matrix in RCT 6
  matrix[J7, K7] R7;               //transformation matrix in RCT 7
  matrix[J8, K8] R8;               //transformation matrix in RCT 8
  matrix[J9, K9] R9;               //transformation matrix in RCT 9
}

parameters {
  vector[K1] theta1;              //expectation of treatment combinations in RCT 1
  vector[K2] theta2;              //expectation of treatment combinations in RCT 2             
  vector[K3] theta3;              //expectation of treatment combinations in RCT 3
  vector[K4] theta4;              //expectation of treatment combinations in RCT 4
  vector[K5] theta5;              //expectation of treatment combinations in RCT 5
  vector[K6] theta6;              //expectation of treatment combinations in RCT 6
  vector[K7] theta7;              //expectation of treatment combinations in RCT 7
  vector[K8] theta8;              //expectation of treatment combinations in RCT 8
  vector[K9] theta9;              //expectation of treatment combinations in RCT 9
  real<lower=0> sigma_y;                     // global scale parameter for the ys
  real<lower=0> sigma_tau;                   // global scale parameter for the thetas
  real<lower=0> sigma_lambda;                // global scale parameter for the lambdas
  vector[Jmax] lambda;
}

transformed parameters{
  vector[J1] tau1 = R1*theta1;              //feature treatment effects in RCT 1
  vector[J2] tau2 = R2*theta2;              //feature treatment effects in RCT 2             
  vector[J3] tau3 = R3*theta3;              //feature treatment effects in RCT 3
  vector[J4] tau4 = R4*theta4;              //feature treatment effects in RCT 4
  vector[J5] tau5 = R5*theta5;              //feature treatment effects in RCT 5
  vector[J6] tau6 = R6*theta6;              //feature treatment effects in RCT 6
  vector[J7] tau7 = R7*theta7;              //feature treatment effects in RCT 7
  vector[J8] tau8 = R8*theta8;              //feature treatment effects in RCT 8
  vector[J9] tau9 = R9*theta9;              //feature treatment effects in RCT 9
}

model {
  vector[J1] lambda1;              //common feature treatment effect expectiation in RCT 1
  vector[J2] lambda2;              //common feature treatment effect expectiation in RCT 2             
  vector[J3] lambda3;              //common feature treatment effect expectiation in RCT 3
  vector[J4] lambda4;              //common feature treatment effect expectiation in RCT 4
  vector[J5] lambda5;              //common feature treatment effect expectiation in RCT 5
  vector[J6] lambda6;              //common feature treatment effect expectiation in RCT 6
  vector[J7] lambda7;              //common feature treatment effect expectiation in RCT 7
  vector[J8] lambda8;              //common feature treatment effect expectiation in RCT 8
  vector[J9] lambda9;              //common feature treatment effect expectiation in RCT 9
  
  for (n in 1:J1){
    lambda1[n] = lambda[feature1[n]];
  }
  for (n in 1:J2){
    lambda2[n] = lambda[feature2[n]];
  }
  for (n in 1:J3){
    lambda3[n] = lambda[feature3[n]];
  }
  for (n in 1:J4){
    lambda4[n] = lambda[feature4[n]];
  }
  for (n in 1:J5){
    lambda5[n] = lambda[feature5[n]];
  }
  for (n in 1:J6){
    lambda6[n] = lambda[feature6[n]];
  }
  for (n in 1:J7){
    lambda7[n] = lambda[feature7[n]];
  }
  for (n in 1:J8){
    lambda8[n] = lambda[feature8[n]];
  }
  for (n in 1:J9){
    lambda9[n] = lambda[feature9[n]];
  }
  
  // Likelihood of the treatment combination means
  y1 ~ normal(theta1, sigma_y);
  y2 ~ normal(theta2, sigma_y);
  y3 ~ normal(theta3, sigma_y);
  y4 ~ normal(theta4, sigma_y);
  y5 ~ normal(theta5, sigma_y);
  y6 ~ normal(theta6, sigma_y);
  y7 ~ normal(theta7, sigma_y);
  y8 ~ normal(theta8, sigma_y);
  y9 ~ normal(theta9, sigma_y);
  
  // Prior for the treatment feature ATEs by country (2nd Hierarchical layer)
   
  tau1 ~ normal(lambda1, sigma_tau);
  tau2 ~ normal(lambda2, sigma_tau);
  tau3 ~ normal(lambda3, sigma_tau);
  tau4 ~ normal(lambda4, sigma_tau);
  tau5 ~ normal(lambda5, sigma_tau);
  tau6 ~ normal(lambda6, sigma_tau);
  tau7 ~ normal(lambda7, sigma_tau);
  tau8 ~ normal(lambda8, sigma_tau);
  tau9 ~ normal(lambda9, sigma_tau);
  
  // Prior for the common treatment feature ATEs (3nd Hierarchical layer)
  lambda ~ normal(0, sigma_lambda);
}

Hi @Bob_Carpenter, I was wondering if you had a chance to look at the model. I just wanted to know if it’s a coding error or it’s something not feasible on Stan.