Putting Priors on Transformed Parameters

Hi everyone, I am trying to reash an old post, since I was unable to find a solution to a modelling problem. Together with some collegues, I am working on a meta-analysis with a dataset of several RCTs where treatment comprised a combination of several aspects or features (e.g. length of treatment, frequency of treatment, type of treatment, et cetera); the objective of the model is to understand whether certain features of treatment combinations are more effective than others.

The first layer models treatment combinations averages as:

y \sim N(\theta, 1)

Where \theta represents average combination treatment effects, which aren’t of direct interest. What we are interested in, is the transformation \tau = R \theta, which transforms the vector \theta into average feature treatment effects, by taking weighted averages across elements of \theta where the certain feature is present in the combination. This is done across several RCTs and the last hierarchical layer is the parameter \lambda, which represents the common mean of all the \tau, i.e.:

\tau \sim N(\lambda, \sigma_{\tau})
\lambda \sim N(0, \sigma_{\lambda})

So what really interests us is \lambda, but we need to go through the vectors \tau to obtain it. As for the relationship between J and K, unfortunately it varies from RCT to RCT, sometimes one is greater than the other, sometimes the opposite and sometimes they are equal. I am going to post the complete code below for clarity. The issue is that the model, as I have currently coded it, does not update the values of \tau and \lambda, from the values of \theta, and so they are just zeros. I was wondering if it’s just poor coding on my part, or this procedure cannot be done on Stan (or in general).

data {
  int<lower=0> C;                 //number of RCTs
  int<lower=0> K1;                 //number of treatment combinations in RCT 1
  int<lower=0> K2;                 //number of treatment combinations in RCT 2                         
  vector[K1] y1;                   //treatment combinations in RCT 1
  vector[K2] y2;                   //treatment combinations in RCT 2
  int<lower=0> J1;                 //number of treatment features in RCT 1
  int<lower=0> J2;                 //number of treatment features in RCT 2
  int<lower=0> Jmax;
  int feature1[J1];               //feature indexes in RCT 1
  int feature2[J2];               //feature indexes in RCT 2             
  matrix[J1, K1] R1;               //transformation matrix in RCT 1
  matrix[J2, K2] R2;               //transformation matrix in RCT 2 

parameters {
  vector[K1] theta1;              //expectation of treatment combinations in RCT 1
  vector[K2] theta2;              //expectation of treatment combinations in RCT 2             
  real<lower=0> sigma_y;                       // global scale parameter for the ys
  real<lower=0> sigma_tau;                    // global scale parameter for the thetas
  real<lower=0> sigma_lambda;             // global scale parameter for the lambdas
  vector[Jmax] lambda;

transformed parameters{
  vector[J1] tau1 = R1*theta1;              //feature treatment effects in RCT 1
  vector[J2] tau2 = R2*theta2;              //feature treatment effects in RCT 2             

model {
  vector[J1] lambda1;              //common feature treatment effect expectiation in RCT 1
  vector[J2] lambda2;              //common feature treatment effect expectiation in RCT 2             
  for (n in 1:J1){
    lambda1[n] = lambda[feature1[n]];
  for (n in 1:J2){
    lambda2[n] = lambda[feature2[n]];
  // Likelihood of the treatment combination means
  y1 ~ normal(theta1, sigma_y);
  y2 ~ normal(theta2, sigma_y);
  // Prior for the treatment feature ATEs by country (2nd Hierarchical layer)
  tau1 ~ normal(lambda1, sigma_tau);
  tau2 ~ normal(lambda2, sigma_tau);
  // Prior for the common treatment feature ATEs (3nd Hierarchical layer)
  lambda ~ normal(0, sigma_lambda);

Looks like your model is missing a priors for sigma_lambda, sigma_tau, and sigma_y.

1 Like

That is not very important: Stan puts default flat priors on those parameters.

You’ve double checked that R_1 and R_2 are not zero matrices? Because then it seems very weird if \theta_1 and \theta_2 are non-zero but \tau_1 and \tau_2 are.
Maybe add print statements in a few places to make sure various things are what you expect?

More specifically, for bounded parameters, Stan will place a uniform flat prior. But these parameters are not bounded (upper bound is infinity) and, thus, they will be improper.

Regardless for those coming to learn in this thread, just because a parameter has a (default) prior, does not mean it will not have an effect. Having not reviewed your data and model, I do not know whether that’s the case here.