Yes, I’m thinking of a hierarchical model. If there are M imputed datasets from the imputation model, then index \theta by m:

\theta_m \sim N(\mu, \tau^2)

The question is whether \mu could give you an estimate of \theta that adequately reflects uncertainty in the M datasets.

As Guido pointed out, the key question is whether the normal hyperprior would be making a strong assumption about how the \theta_m are related to each other given non-normality in the data. I was just looking at formulas from Rubin’s book and BDA III, and I would have to do more work to figure out if there is a connection.

The interesting thing is that Rubin gives this formula for the total variance of the combined estimate:

T_M = W_M + \frac{M + 1}{M} B_M

where W_M is the within-imputation variance (the average of the M variance estimates of the individual \theta_m) and B_M is the between-imputation variance of the \theta_m. It looks similar to how BDA III defines the posterior variance of \mu conditional on \tau (5.20):

V^{-1}_{\mu} = \sum^M_{m=1} \frac{1}{\sigma_m^2 + \tau^2}

where \sigma_m^2 is the within-imputation variance of \theta_m and \tau^2 is the between-imputation variance.
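As a quick check on how the two expressions relate, take the special case (an assumption made only for illustration) where every imputed dataset has the same within variance, \sigma_m^2 = \sigma^2. If I have the algebra right, (5.20) then collapses to

V_{\mu} = \frac{\sigma^2 + \tau^2}{M}

whereas Rubin's total variance, with W_M = \sigma^2 and \tau^2 standing in for B_M, is

T_M = \sigma^2 + \frac{M + 1}{M} \tau^2

So the two use the same within and between ingredients, but V_{\mu} shrinks toward zero as M grows, like the variance of a mean of M separate estimates, while T_M levels off at \sigma^2 + \tau^2.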

The clear difference seems to be in the point estimate. In Rubin’s book, you simply average the \theta_m to get \theta (the analogue of \mu), while in a hierarchical model \mu is estimated by a precision-weighted (inverse-variance-weighted) average of the \theta_m.
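To make the comparison concrete, here is a rough Python sketch of both ways of combining the \theta_m. The function names and the numbers are made up for illustration, and plugging B_M in for \tau^2 is my own shortcut rather than anything from Rubin or BDA III; a full hierarchical model would put a prior on \tau and average over it.

```python
from statistics import mean, variance


def rubin_pool(estimates, within_vars):
    """Combine M imputed-data estimates with Rubin's rules."""
    M = len(estimates)
    theta_bar = mean(estimates)   # simple (unweighted) average of the theta_m
    W = mean(within_vars)         # W_M: average within-imputation variance
    B = variance(estimates)       # B_M: sample variance of the theta_m (M - 1 denominator)
    T = W + (M + 1) / M * B       # T_M: total variance
    return theta_bar, T


def hierarchical_pool(estimates, within_vars, tau2):
    """Mean and variance of mu given tau^2, assuming a flat prior on mu (BDA III eq. 5.20)."""
    precisions = [1.0 / (s2 + tau2) for s2 in within_vars]
    V_mu = 1.0 / sum(precisions)  # inverse of the summed precisions
    mu_hat = V_mu * sum(p * th for p, th in zip(precisions, estimates))
    return mu_hat, V_mu


# Hypothetical estimates and within-imputation variances from M = 3 datasets
estimates = [1.0, 2.0, 3.0]
within_vars = [0.5, 1.0, 1.5]

theta_bar, T = rubin_pool(estimates, within_vars)

# Assumption: use B_M as a stand-in for tau^2
mu_hat, V_mu = hierarchical_pool(estimates, within_vars, tau2=variance(estimates))

print("Rubin:        estimate", theta_bar, "variance", T)
print("Hierarchical: estimate", mu_hat, "variance", V_mu)
```

With unequal within variances the hierarchical estimate is pulled toward the more precisely estimated \theta_m, while Rubin's estimate is not. When all the within variances are equal, the two point estimates coincide and only the variances differ.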

Anyhow, I might pursue this further when I have time, but it doesn’t seem like a terrible idea at first blush, especially if the number of imputed datasets is large and the posterior distribution would then tend toward normality. On the other hand, perhaps a uniform prior on \mu would suffice.