Please also provide the following information in addition to your question:
- Operating System: Windows 10
- rstanarm Version: 2.15.3
I’m using Rstanarm with some imputed datasets, and previously I combined multiple imputed datasets by stacking the posteriors (I believe that has been discussed on this forum previously).
However, I’m thinking of fitting a varying slopes model where I allow the slopes of predictors to vary across imputed datasets. Would the mean of the varying slopes be a valid estimate of the parameter net of imputation uncertainty?
I know I can test this doing a big simulation, but I wanted to reach out first and see if anyone as A) tried this, B) knows resources/papers or C) thinks it’s a really bad idea.
thanks much!
Bob
What’s the hierarchical model? Is the mean the location parameter of something like a normal hierarchical prior?
Is the proper approach to integrate out the individual-level effect (e.g., the random effect in most people’s usage)? That can be the same result as the mean only in certain cases that I’m not a good enough mathematical statistician to identify (linear, maybe?).
Do I understand it correctly that you want to use a hierarchical model to estimate parameters from multiple imputation data sets simultaneously?
As you already alluded to, I would be worried about underestimating the uncertainty from missing data. Regarding the valid (unbiased?) estimation of the average over imputed data sets, I think this would depend the distribution of the parameters (for the multiple imputations) when they are estimated separately. If those parameters are distributed normally, I think the mean could be un-biased (i.e. the same as when estimating parameters in separate models). If the parameters are skewed, this might be different (assuming one uses a normal hierarchical model)
As an aside, brms has a nice infrastructure to use multiply imputed data sets.
Yes, I’m thinking of a hierarchical model. If there are M imputed datasets from the imputation model, then index \theta by m:
\theta_m \sim N(\mu,\sigma)
The question is whether \mu could give you an estimate of \theta that adequately reflects uncertainty in the M datasets.
As Guido pointed out, the key question is whether the normal hyperprior would be making a strong assumption about how the \theta_m are related to each other given non-normality in the data. I was just looking at formulas from Rubin’s book and BDA III, and I would have to do more work to figure out if there is a connection.
The interesting thing is that Rubin gives this formula for combining variance of imputed estimates:
T_M = W_M + \frac{M + 1}{M} B_M
Where W_M is the within-dataset variance of an individual \theta_m and B_M is the between-dataset variance. It looks similar to how BDA III defines the hierarchical variance parameter (5.20):
V^{-1}_{\mu} = \sum^M_{m=1} \frac{1}{\sigma_m^2 + \tau^2}
Where again \sigma_m^2 is the within-imputation dataset variance and \tau^2 is the between-imputation variance.
It would seem the clear difference is for \mu. In Rubin’s book, you simply average the \theta_m to get \theta or \mu, while in a hierarchical model it is a precision or variance-weighted average.
Anyhow, I might pursue this further when I have time, but it doesn’t seem like a terrible idea at first blush, especially if the number of imputed datasets is large and the posterior distribution would then tend toward normality. On the other hand, perhaps a uniform prior on \mu would suffice.
Performing inference based on all posterior samples together should yield valid estimates in general, if I am not mistaken. At least, this is what I do in brms::brm_multiple()
.
1 Like