Heckman selection model code + simulation

Hi everyone! I found a small bug in the model posted by @rtrangucci. After checking the likelihood against both the Stata documentation (https://www.stata.com/manuals15/rheckman.pdf) and the R package documentation (https://cran.r-project.org/web/packages/sampleSelection/vignettes/selection.pdf), I think there was an unnecessary second division by sqrt(1 - rho^2) in the model code above.
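
For reference, here is a sketch of the standard Heckman log-likelihood as it appears in the Stata manual linked above. The notation is an assumption made for this sketch and is not taken from the model code: s_i is the selection indicator, w_i and gamma are the selection covariates and coefficients, x_i and beta are the outcome covariates and coefficients, sigma is the outcome error scale, and rho is the error correlation.

$$
\ell = \sum_{i:\, s_i = 0} \log \Phi\left(-w_i^\top \gamma\right)
+ \sum_{i:\, s_i = 1} \left[ \log \Phi\left( \frac{w_i^\top \gamma + \rho\,(y_i - x_i^\top \beta)/\sigma}{\sqrt{1 - \rho^2}} \right)
+ \log \phi\left( \frac{y_i - x_i^\top \beta}{\sigma} \right) - \log \sigma \right]
$$

Here \Phi and \phi are the standard normal CDF and density. The factor \sqrt{1 - \rho^2} divides the argument of \Phi exactly once.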

Of course, please do reply if you disagree or if you think I’ve misunderstood!

It took me a while to find the bug because it somehow does not hurt performance much: the model above is still well calibrated! I am not sure why. But I do know that if you generalize the model so that the outcome is also observed for the unselected units (with betas that differ across the two types of units), you start getting bad behaviour.
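
To get a feel for how much an extra division can matter, here is a minimal Python sketch of one selected observation's log-likelihood contribution with and without it. It is not the Stan code from this thread: the function names, the argument values, and the exact place where the extra division enters (`extra_division=True`) are all hypothetical stand-ins chosen for illustration.

```python
import math

def log_std_normal_cdf(x):
    # log Phi(x) via erfc; fine for moderate x, not tuned for extreme tails
    return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))

def normal_logpdf(y, mu, sigma):
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def heckman_loglik_obs(selected, y, mu_y, mu_z, sigma, rho, extra_division=False):
    """Log-likelihood contribution of a single unit.

    mu_y and mu_z are the linear predictors of the outcome and selection
    equations. extra_division=True divides the probit argument by
    sqrt(1 - rho^2) a second time (a hypothetical stand-in for the bug).
    """
    if not selected:
        # Outcome not observed: only the selection probability contributes
        return log_std_normal_cdf(-mu_z)
    scale = math.sqrt(1.0 - rho * rho)
    arg = (mu_z + rho * (y - mu_y) / sigma) / scale
    if extra_division:
        arg = arg / scale
    return log_std_normal_cdf(arg) + normal_logpdf(y, mu_y, sigma)

# Arbitrary illustrative values, not taken from the attached simulations
for rho in (0.0, 0.3, 0.9):
    good = heckman_loglik_obs(True, 1.2, 0.5, 0.3, 1.5, rho)
    bad = heckman_loglik_obs(True, 1.2, 0.5, 0.3, 1.5, rho, extra_division=True)
    print(f"rho={rho:.1f}  correct={good:.4f}  extra={bad:.4f}  diff={bad - good:.4f}")
```

The two versions coincide at rho = 0, where sqrt(1 - rho^2) = 1, and drift apart as |rho| grows. That alone does not explain the calibration result described above.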

Below are the fixed model and the generalization (which is sometimes, but alas not always, called a tobit-5), plus R scripts for some simulation-based calibration tests that show they do well :)
fake_data_generalized_heck_montecarlo_calibration.R (4.0 KB)
fake_data_heck_montecarlo_calibration_check.R (2.8 KB)
generalized_heck.stan (1.8 KB)
heck.stan (1.1 KB)
