I am not able to find the right coverage probability for estimates from models involving interaction terms

puzzler · March 28, 2018, 8:39pm

Hi all, I am so glad to find this helpful community online!
I am using cox model in Rstan (leuk.stan) to test simulated survival data.
I only use one chain and 5000 burns, because I need to calculate thousands of models at once. After some sample tests they work well at small scales.
Then, my model for Y~Trt+X_1+…+X_30(all Xs are iid) works very well. I ran 500 simulations, within each simulation 10,000 MCMC samples were drawn, which were later translated into causal estimates I need. Then I check the bias and coverage probability of the credible intervals: I found almost 95% of the intervals, ranging in (0.025 quantile ,0.975 quantile) of the 10,000 estimates from each simulation, covers the true value.
However, when I turn to model for Y~Trt + X_1+…+X_15+Trt:X_1+…+Trt:X_15, the bias of estimates are still fine, but they have an average of 91%, instead of 95%, of the intervals mentioned above covers the true value.
Because my treatment is binary, I treated Trt:X_i as a first order term in the model, just like X_i. I haven’t yet figured how to write a stan model with interaction terms. I wonder if this is the problem and if yes, what should I do.
I am new to stan and Bayesian methods. Is there anything I miss? Thank you in advance for any comments.

betanalpha · March 28, 2018, 9:28pm

There are no coverage guarantees for Bayesian intervals, in particular the width of a Bayesian posterior interval will not in general equal any definition of coverage. For more discussion you might find https://arxiv.org/abs/1803.08393 helpful.

puzzler · March 29, 2018, 8:11pm

Thank you for your quick reply. I am reading it. Could you give me a brief hint for why, by following the above process with non-informative priors for the constant parameters, it has 95% of the chances those intervals cover the true value? Are all the tests are coincidences, or do I miss something to interpret my simulation in a correct way?

I have checked these links, and I am still a little bit confused. Do some of the answers imply there is a way to find the coverage probability right? Thank you for the help!

betanalpha · March 29, 2018, 9:41pm

This is a result of asymptotics. As your data become more informative relative to the complexity of your model the posterior becomes more and more Gaussian and posterior intervals converge towards maximum likelihood intervals. The fact that you see some posterior intervals with matching frequentist calibration but not others indicates that your model isn’t fully in the asymptotic regime.

Topic		Replies	Views
Precise but wrong results for a drug interaction model Modeling	12	1245	November 28, 2018
How do I evaluate my model, in terms of bias, coverage, etc.? General simulation-based-calibration	17	2866	January 4, 2023
Rstanarm: Posterior standard deviations and credible intervals for odds ratios and probabilities Modeling	2	363	November 9, 2023
Simulation-based calibration: Some challenges and directions for future research Modeling simulation-based-calibration	1	611	September 19, 2021
Explaining how Rstan estimates parameter using ODE and data.! General	25	1366	April 1, 2021

I am not able to find the right coverage probability for estimates from models involving interaction terms

Related topics