Averaging out parameter?

With the following model, is it possible to recover the parameter a[2] from the first linear predictor using only the data and the b parameters from the second linear predictor? If so, could you please provide an outline of how you would go about doing so?

data{
  int N;
  real y[N];
  real trt[N];
  real sex[N];
}
parameters{
  vector[2] a;
  vector[3] b;
  real<lower=0> sd_e1;
  real<lower=0> sd_e2;
}
transformed parameters{
}
model{
  vector[N] mu1;
  vector[N] mu2;
  target += exponential_lpdf(sd_e1 | 1);
  target += exponential_lpdf(sd_e2 | 1);
  target += normal_lpdf(a | 0, 5);
  target += normal_lpdf(b | 0, 5);
  for(i in 1:N){
    mu1[i] = a[1] + a[2] * trt[i];
    mu2[i] = b[1] + b[2] * trt[i] + b[3] * sex[i];
  }
  target += normal_lpdf(y | mu1, sd_e1);
  target += normal_lpdf(y | mu2, sd_e2);
}

Data generation

set.seed(1)
N <- 100
b <- c(0, 2, -3)
trt <- rbinom(N, 1, 0.5)
sex <- rbinom(N, 1, 0.5)
mu <- b[1] + b[2]*trt + b[3]*sex
y <- rnorm(N, mu, 1)
y

Running the model:

mod1 <- rstan::stan_model(
  file = "linreg1.stan",
  auto_write = TRUE)
ld <- list(N = N, y = y, trt = trt, sex = sex)
l1 <- rstan::sampling(mod1,
                      data    = ld,
                      chains  = 1,
                      thin    = 1,
                      iter    = 2000,
                      warmup  = 1000,
                      refresh = 1000)
print(l1)

Output:

         mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
a[1]    -1.59    0.01 0.28   -2.12   -1.78   -1.58   -1.40   -1.06   822    1
a[2]     1.87    0.01 0.40    1.08    1.60    1.87    2.15    2.66   809    1
b[1]     0.15    0.01 0.17   -0.18    0.04    0.14    0.26    0.49   612    1
b[2]     1.74    0.01 0.18    1.37    1.62    1.75    1.88    2.09   852    1
b[3]    -3.11    0.01 0.19   -3.51   -3.24   -3.10   -2.98   -2.73   681    1
sd_e1    1.84    0.00 0.13    1.62    1.74    1.84    1.93    2.11   990    1
sd_e2    0.97    0.00 0.07    0.84    0.92    0.96    1.01    1.12   904    1
lp__  -356.64    0.10 2.03 -361.56 -357.72 -356.23 -355.15 -353.80   387    1


You have y on the left of a | twice in your model, which is very unusual; what are you trying to achieve by that?

Hi Mike, it just represents two linear models and associated sets of parameters for the same outcome variable y. The intention was purely to serve as an example setting for the question I posed. Specifically, given parameters and the data from the second model, is it possible to compute the unconditional treatment effect from the first model?

For clarity, the two separate models are shown below. The first model has a covariate for treatment and the second has covariates for treatment and sex. The single model I provided above is equivalent to fitting the two separate models.

data{
  int N;
  real y[N];
  real trt[N];
}
parameters{
  vector[2] a;
  real<lower=0> sd_e1;
}
transformed parameters{
}
model{
  vector[N] mu1;
  target += exponential_lpdf(sd_e1 | 1);
  target += normal_lpdf(a | 0, 5);
  for(i in 1:N){
    mu1[i] = a[1] + a[2] * trt[i];
  }
  target += normal_lpdf(y | mu1, sd_e1);
}

and

data{
  int N;
  real y[N];
  real trt[N];
  int sex[N];
}
parameters{
  vector[3] b;
  real<lower=0> sd_e2;
}
transformed parameters{
}
model{
  vector[N] mu2;
  target += exponential_lpdf(sd_e2 | 1);
  target += normal_lpdf(b | 0, 5);
  for(i in 1:N){
    mu2[i] = b[1] + b[2] * trt[i] + b[3] * sex[i];
  }
  target += normal_lpdf(y | mu2, sd_e2);
}

which, after sampling from the respective posteriors gives

         mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
a[1]    -1.59    0.01 0.25   -2.08   -1.76   -1.59   -1.41   -1.12   519    1
a[2]     1.86    0.02 0.36    1.16    1.62    1.85    2.10    2.57   501    1
sd_e1    1.84    0.01 0.14    1.61    1.75    1.83    1.93    2.13   676    1

and

         mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
b[1]     0.13    0.01 0.17   -0.18    0.02    0.13    0.26    0.47   379    1
b[2]     1.76    0.01 0.19    1.41    1.63    1.76    1.89    2.14   547    1
b[3]    -3.11    0.01 0.19   -3.47   -3.25   -3.11   -2.98   -2.76   478    1
sd_e2    0.97    0.00 0.07    0.83    0.91    0.97    1.01    1.11   803    1

Basically the same as I posted originally.

Anyway, in the second model the treatment effect is estimated more precisely than in the first, because some of the residual variation is explained by sex. I just want to know whether it is possible to obtain an unconditional estimate of the treatment effect from the second model (i.e. equivalent to a[2] from the first model) without having to fit model 1. If so, how? Conceptually, I thought I would just need to average out b[3], but perhaps that is not so.
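One possible sketch (my own suggestion, not from the thread, and valid only under the linear, no-interaction model above): the group-mean contrast that model 1's a[2] estimates decomposes as b[2] + b[3] * (mean(sex | trt = 1) - mean(sex | trt = 0)), the usual omitted-variable relation, so you can push the model-2 posterior draws through that formula together with the observed sex-by-arm composition. The draws below are simulated stand-ins centred at the posterior summaries reported above; in practice you would use the actual draws, e.g. `as.matrix(fit2, pars = "b")` from a stanfit.

```r
set.seed(1)
N <- 100
trt <- rbinom(N, 1, 0.5)   # same design as the simulated data above
sex <- rbinom(N, 1, 0.5)
# Stand-in posterior draws of b; in practice: as.matrix(fit2, pars = "b")
draws <- cbind(b2 = rnorm(4000, 1.76, 0.19),
               b3 = rnorm(4000, -3.11, 0.19))
# Omitted-variable adjustment: difference in sex composition across arms
sex_gap <- mean(sex[trt == 1]) - mean(sex[trt == 0])
a2_approx <- draws[, "b2"] + draws[, "b3"] * sex_gap
c(mean = mean(a2_approx), quantile(a2_approx, c(0.025, 0.975)))
```

The resulting distribution should land near the a[2] summary from model 1, though it is an approximation: the priors and the two residual variances mean the two posteriors are not exactly related by this formula.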

What confuses me about the question is that it’s possible to compute a[2] from just the data (i.e. by fitting model 1). You’re asking about whether you can compute a[2] from just the data and the b parameters from the second model, but you don’t need any parameters from the second model.
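A quick check of that point, reusing the simulation code from the top of the thread: with the nearly flat normal(0, 5) prior, the posterior mean of a[2] is essentially the raw difference in group means, which requires nothing from model 2.

```r
# Reproduce the simulated data from the top of the thread
set.seed(1)
N <- 100
b <- c(0, 2, -3)
trt <- rbinom(N, 1, 0.5)
sex <- rbinom(N, 1, 0.5)
y <- rnorm(N, b[1] + b[2] * trt + b[3] * sex, 1)
# Raw difference in group means: the quantity model 1's a[2] estimates
mean(y[trt == 1]) - mean(y[trt == 0])
```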


To give a bit more context to this, imagine that you have a model that is being used to monitor the probability of adverse events associated with some ongoing treatment in a large population. The treatment is not randomised and the model has several covariates, such as decadal age, sex, chronic medical conditions, ethnicity, locality of residence, etc., although none of these are assumed to interact. Now consider a setting where you want to report results for a particular set of characteristics; say, adverse event rates for each age group. What is the most common way to compute these?

One option might be to compute the adverse event rates for each age group conditional on all the other characteristics being at their sample means. This doesn't average out the parameters, but is it reasonable? And how would you do this when you have a variable on a nominal scale, such as the chronic medical conditions mentioned above?
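For a nominal covariate there is no meaningful mean to plug in, so a common alternative is marginal standardization: set every row of the data to the age group of interest, predict, and average over the observed distribution of the remaining covariates. A self-contained sketch with a simulated dataset and a plain `glm` standing in for the monitoring model (all variable names hypothetical):

```r
set.seed(2)
n <- 500
dat <- data.frame(
  age_group = factor(sample(c("40s", "50s", "60s"), n, replace = TRUE)),
  sex       = rbinom(n, 1, 0.5),
  condition = factor(sample(c("none", "diabetes", "cardiac"), n, replace = TRUE))
)
# Simulated adverse-event outcome from a made-up logistic model
eta <- -2 + 0.5 * (dat$age_group == "60s") + 0.3 * dat$sex +
       1.0 * (dat$condition == "cardiac")
dat$ae <- rbinom(n, 1, plogis(eta))
fit <- glm(ae ~ age_group + sex + condition, family = binomial, data = dat)

# Marginal (standardized) rate for one age group: set everyone's age to
# that level, predict, and average over the observed covariate mix
p_age <- function(level) {
  d <- dat
  d$age_group <- factor(level, levels = levels(dat$age_group))
  mean(predict(fit, newdata = d, type = "response"))
}
rates <- sapply(levels(dat$age_group), p_age)
rates
```

This sidesteps the "mean of a nominal variable" problem entirely, because the nominal covariate is averaged over as a distribution rather than collapsed to a single value.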

What results? The expectation for that group conditional on:

  • The group having the characteristics observed in the sample for that group?
  • The characteristics observed in the sample across all groups?
  • An independent expectation for the population distribution of covariate values for that group?
  • An independent expectation for the population distribution of covariate values across all groups?

Depending on your goals:

  • Under strong assumptions about the correctness of the model, you can understand how an age group compares to other age groups, conditional on individuals having the same other characteristics by just looking at the coefficients associated with the different age groups.
  • You can report on the average rate for a given age group conditional on your (nonrandom) sampled distribution of the other covariates for that age group by simply predicting from your model.
  • You can report a prediction for a given age group given some arbitrary set of covariate values for the remaining covariates, which could for example be the sample mean or some independent expectation for the population mean. Under strong assumptions about the independence of covariates and the linearity of all terms, this will give you the expected prediction for the age group if you can choose mean covariate values that correspond to the true mean values for that age group.
  • More generally, you can post-stratify to try to get a better sense of what the true predicted value for a given age group should be.
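The post-stratification option in the last bullet can be sketched as follows, with entirely made-up population weights and a made-up predictor standing in for the fitted model:

```r
# Post-stratification sketch (hypothetical numbers throughout): predict a
# rate for every cell of the covariate table, then average using known
# population cell proportions instead of the sample's composition.
cells <- expand.grid(sex = 0:1,
                     condition = c("none", "diabetes", "cardiac"))
cells$pop_w <- c(0.30, 0.28, 0.10, 0.12, 0.09, 0.11)  # assumed census weights
# In practice p_hat would be (posterior) predictions from the fitted
# model for each cell; here a made-up logistic predictor for illustration
p_hat <- with(cells, plogis(-2 + 0.3 * sex + 1.0 * (condition == "cardiac")))
ps_rate <- sum(cells$pop_w * p_hat) / sum(cells$pop_w)
ps_rate  # post-stratified overall adverse-event rate
```

With posterior draws rather than point predictions, the same weighted average is taken draw by draw, giving a full posterior for the post-stratified rate.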