Computing ICC-like reliability (& within-unit variance) for hierarchical models?

Just a quick update on this topic in case anyone is interested in this area. I think I worked out how to obtain the single-session reliability of a measure with Gaussian-distributed measurement noise through the equations posted above, but I was having trouble working out a similar solution for the binomial-outcome case, and I also had some other issues extracting quantities of interest from hierarchical data like this. So I came up with a much more brute-force solution that is also far more flexible.

The approach I’m now working on is:

  • Sample the hierarchical model with whatever contrast scheme makes the most sense for appropriate regularization, etc. (so there is no need for complicated contrasts that split multi-session data in order to sample test-retest correlations; that quantity will be derivable later), and for each sample in the posterior:
    • extract the parameters characterizing the group-level multivariate structure (means, SDs, correlation matrix) and use this structure to generate many simulated subjects (simSubjs), then within each:
      • compute any derived quantity of interest (DQI; \Psi_{true})
      • generate lots of simulated observation-level datasets with the same structure as the original real data, and in each:
        • use Stan to estimate the simSubj’s “true” parameters, using the group-level multivariate structure as the prior, then for each sample in the resulting posterior:
          • compute each DQI (\Psi_{est})
        • collapse the resulting posterior on each \Psi_{est} to some point estimate (e.g. mean/median). (This step may need to be avoided, in which case later steps would instead have to sample from each posterior, but I’m going with it for now to minimize compute and storage.)
      • Each DQI for this simSubj now has a distribution of values from many simulated sessions, so for each DQI, compute a mean and an SD across sessions (\mu_{\Psi_{est}}, \sigma_{\Psi_{est}})
    • Each simSubj:DQI combo now has a true value, a mean estimated value, and the SD of the estimated values, so now we can compute:
      • core quantities:
        • group-level correlations among \Psi_{true} (\rho_{\Psi_{true}})
        • SD of \Psi_{true} (\sigma_{\Psi_{true}}, true between-Ss variability in the DQI)
        • Mean of \sigma_{\Psi_{est}} (\mu_{\sigma_{\Psi_{est}}}; mean within-Ss variability in the DQI)
        • “reliability” of \Psi (\Phi_{\Psi}) via \frac{\sigma_{\Psi_{true}}}{\sigma_{\Psi_{true}} + \mu_{\sigma_{\Psi_{est}}}}
        • (if interested, one could also compute here each simSubj’s individual reliability, which wouldn’t be of interest by itself given they’re not real subjects, but the variability of the by-subject reliabilities might be of interest)
      • extras (not sure these are actually useful)
        • group-level correlations among \Psi_{est} (\rho_{\Psi_{est}})
        • SD of \Psi_{est}
        • SD of \sigma_{\Psi_{est}} (\sigma_{\sigma_{\Psi_{est}}}; variability of within-Ss variability in the DQI)
        • “reliability” of \Psi_{est} (\Phi_{\Psi_{est}}) via \frac{\sigma_{\Psi_{est}}}{\sigma_{\Psi_{est}} + \mu_{\sigma_{\Psi_{est}}}}
  • Now we have a posterior distribution on each of the core and extra quantities!
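
For anyone who wants to see the shape of the computation, here is a minimal NumPy sketch of the inner loop for a single DQI in the Gaussian case. It rests on several assumptions that are not in the description above: the DQI is just the subject's mean, the observation noise SD is known, and the per-session Stan fit is replaced by the closed-form conjugate-normal posterior mean (a stand-in so the example runs without Stan). The function name `core_quantities`, its arguments, and the `draws` values are all hypothetical; in practice `draws` would come from the posterior of the fitted hierarchical model.

```python
import numpy as np

def core_quantities(mu_group, sd_group, sd_noise,
                    n_subj=200, n_sess=100, n_obs=20, rng=None):
    """Core quantities for ONE posterior draw of the group-level parameters.

    Assumption: single Gaussian DQI (the subject mean) with known noise SD.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Psi_true: one true DQI value per simulated subject
    psi_true = rng.normal(mu_group, sd_group, size=n_subj)

    # Simulated observation-level data: subjects x sessions x observations
    y = rng.normal(psi_true[:, None, None], sd_noise,
                   size=(n_subj, n_sess, n_obs))

    # Stand-in for the per-session Stan fit (assumption): conjugate-normal
    # posterior mean, using the group-level structure as the prior
    prior_prec = 1.0 / sd_group**2
    data_prec = n_obs / sd_noise**2
    psi_est = (prior_prec * mu_group + data_prec * y.mean(axis=2)) \
        / (prior_prec + data_prec)

    # Per-subject SD of the estimates across simulated sessions
    sd_psi_est = psi_est.std(axis=1, ddof=1)

    sigma_true = psi_true.std(ddof=1)   # between-subject SD of Psi_true
    mean_within = sd_psi_est.mean()     # mean within-subject SD of Psi_est
    return {
        "sigma_psi_true": sigma_true,
        "mu_sigma_psi_est": mean_within,
        # "reliability" using the SD ratio given in the list above
        "phi": sigma_true / (sigma_true + mean_within),
    }

rng = np.random.default_rng(1)
# Hypothetical posterior draws of (mu_group, sd_group, sd_noise)
draws = [(0.0, 1.0, 2.0), (0.1, 0.9, 2.1), (-0.1, 1.1, 1.9)]
phi_draws = np.array([core_quantities(m, s, e, rng=rng)["phi"]
                      for m, s, e in draws])
```

Here `phi_draws` holds one reliability value per posterior draw, so summarizing it (quantiles, etc.) gives the posterior on \Phi_{\Psi}. With several DQIs, `psi_true` would become a subjects-by-DQI matrix drawn from the multivariate group-level structure, and `np.corrcoef` on its columns would give \rho_{\Psi_{true}}.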