Models on models within Stan

Hi!
Hope this message finds you well.

To propagate uncertainty, I am looking for examples where, within a single Stan program, the output of one model's generated quantities is used as the response variable for a second model.

I was originally doing this as separate steps:

  1. I take my observed data x (at the site level) and fit a model with likelihood x[i] ~ N(mu[i], sigma). Then, in generated quantities, I use the estimated parameters and the data x to calculate a site-level metric, x2 (which has uncertainty).
  2. Outside of Stan, I calculate a weighted mean of x2 per country and classify each country as 1 or 0 depending on whether that weighted mean passes a threshold, so I end up with a country-level presence/absence variable, x3.
  3. Then I run a logistic model to understand which country-level factors predict x3. I get a distribution for the slopes of those country-level factors (i.e., uncertainty in the estimated slopes across all iterations).

However, someone pointed out that I should do it all together in Stan to propagate uncertainty, and I would like to learn how. The workflow I described above does not propagate the uncertainty in x2 to step 3 (i.e., the slope uncertainty in step 3 does not include the uncertainty from x2).

So my question is: are there any examples out there that do this sort of thing (models on models within Stan)? Alternatively, is there a way to include two model components in the same Stan program? For example:
data{
observed data at a site level
site-level predictors of first model
country-level predictors of second model
}
parameters{
parameters for model 1
parameters for model 2
}
transformed parameters{
transformed parameters for model 1
transformed parameters for model 2
}
model1{
likelihood and priors for site-level response variable
}
generated quantities{
new variable based on observed data and parameters
calculate weighted mean per country and calculate presence/absence depending on threshold: second response variable
}
model 2{
likelihood and priors for second response variable (country-level)
}

I hope my message is clear, and sorry for the trouble. I do not have a reproducible example because I just want to know whether this can be done and whether there are any examples out there that I can learn from. Thank you in advance!

Best regards
Jessica

Hi @Jessica_ZM, I’ll take a stab at this, although you might be better off waiting for someone with more expertise to weigh in.

Unless your generated quantities do something prohibited in a transformed parameters block, I think you can just put those calculations into the transformed parameters block and then put model 1 & model 2 into just one model block, so:

data {
  // observed data at a site level
  // site-level predictors of first model
  // country-level predictors of second model
}
parameters {
  // params for model 1
  // params for model 2
}
transformed parameters {
  // transformed parameters for model 1
  // transformed parameters for model 2

  // calculate weighted mean per country
}
model {
  // likelihood and priors for site-level response variable
  // likelihood and priors for second response variable (country-level)
}

The only issue I can see has to do with the thresholding/binarizing that produces x3: a 0/1 value computed from parameters would effectively be a discrete unknown, and I don’t think Stan can model discrete parameters directly.

Instead of logistic regression on 0/1, could you model the un-thresholded value (the weighted mean of x2)?
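
To make that suggestion concrete, here is a minimal sketch of what a single joint program might look like. Nearly everything specific in it is an assumption for illustration, not something from your description: the single site-level predictor site_pred, the weights w, the placeholder metric x2 = x - mu, the normal likelihood for the weighted mean, and all the priors. Swap in your actual metric and predictors.

```stan
data {
  int<lower=1> N;                          // number of sites
  int<lower=1> J;                          // number of countries
  int<lower=1> K;                          // number of country-level predictors
  array[N] int<lower=1, upper=J> country;  // country index of each site
  vector[N] x;                             // observed site-level data
  vector[N] site_pred;                     // site-level predictor (hypothetical: just one)
  vector<lower=0>[N] w;                    // site weights (hypothetical)
  matrix[J, K] Z;                          // country-level predictors
  real threshold;                          // threshold used to define x3
}
parameters {
  // model 1 (site level)
  real a;
  real b;
  real<lower=0> sigma;
  // model 2 (country level)
  real alpha;
  vector[K] beta;
  real<lower=0> tau;
}
transformed parameters {
  vector[N] mu = a + b * site_pred;  // site-level mean
  vector[N] x2 = x - mu;             // placeholder metric, replace with your own
  vector[J] wmean;                   // weighted mean of x2 per country
  {
    vector[J] num = rep_vector(0, J);
    vector[J] den = rep_vector(0, J);
    for (n in 1:N) {
      num[country[n]] += w[n] * x2[n];
      den[country[n]] += w[n];
    }
    // assumes every country has at least one site with positive weight
    wmean = num ./ den;
  }
}
model {
  // placeholder priors
  a ~ normal(0, 5);
  b ~ normal(0, 5);
  sigma ~ exponential(1);
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  tau ~ exponential(1);

  // model 1: site-level likelihood
  x ~ normal(mu, sigma);
  // model 2: un-thresholded weighted mean regressed on country-level predictors
  wmean ~ normal(alpha + Z * beta, tau);
}
generated quantities {
  // 0/1 indicator per draw; its posterior mean is the probability
  // that the country's weighted mean exceeds the threshold
  array[J] int<lower=0, upper=1> x3;
  for (j in 1:J) {
    x3[j] = wmean[j] > threshold;
  }
}
```

Note that the thresholded x3 appears only in generated quantities, as a derived summary rather than something being modelled. Also keep in mind that in a joint program like this the country-level statement informs a, b and sigma as well, which is different from the two-step workflow, so it is worth checking that this is what you want.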