I am trying to predict the population-level prevalence of what are called ‘nursing-sensitive outcomes’ from individual-level hospital discharge data. These are things like hospital-acquired pressure ulcers, delirium, urinary tract infection and pneumonia. They are known to be poorly recorded in routine discharge data (such as the UK HES).
I have:
- A detailed chart review of 1,000 people from one hospital, which gives me the gold standard ‘truth’.
- For those 1,000 patients in that hospital, the routine data for those discharges, on which I train my prediction model.
- The corresponding individual-level national discharge data, for 120,000 episodes, for which I want to predict the prevalence of adverse outcomes.
All this works fine when fitting logistic models in rstanarm. (It also works fine with ranger models in mlr3.)
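To be concrete about the prediction step, it amounts to something like the following, sketched in Python rather than R purely for illustration. The covariates, the coefficient values and the names `national_rows` and `posterior_draws` are made up; in practice the draws would come from the fitted model and the rows from the national data.

```python
import math
import random
from statistics import mean, quantiles


def inv_logit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def prevalence_per_draw(draws, rows):
    """For each posterior draw (intercept, slope_1, slope_2, ...), average the
    predicted probabilities over all rows to get one draw of the prevalence."""
    out = []
    for intercept, *slopes in draws:
        probs = [
            inv_logit(intercept + sum(b * x for b, x in zip(slopes, row)))
            for row in rows
        ]
        out.append(mean(probs))
    return out


# Hypothetical stand-ins: two covariates per episode, 200 posterior draws.
rng = random.Random(1)
national_rows = [(rng.gauss(0, 1), float(rng.random() < 0.3)) for _ in range(2000)]
posterior_draws = [
    (rng.gauss(-2.5, 0.2), rng.gauss(0.6, 0.1), rng.gauss(0.9, 0.2))
    for _ in range(200)
]

prev = prevalence_per_draw(posterior_draws, national_rows)
cuts = quantiles(prev, n=40)  # 39 cut points: cuts[0] is 2.5%, cuts[-1] is 97.5%
print(f"prevalence: mean {mean(prev):.3f}, 95% interval ({cuts[0]:.3f}, {cuts[-1]:.3f})")
```

Averaging within each draw, rather than averaging the coefficients first, is what carries the model uncertainty through to the prevalence estimate. Note that nothing in this sketch varies by hospital.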
I would like to explore further the effect of hospitals and hospital types (there are two). To do this, I have other adverse outcomes in the national data that are believed to be well recorded, for example death in hospital, transfer to long-term care, and others.
Is there a systematic way to do this? I have thought of using the other adverse outcomes to set up some comparison between the hospitals. The logic, which I cannot test but which does fit with the literature, is that the well-recorded adverse outcomes (death etc.) will be correlated with the poorly recorded adverse outcomes (delirium etc.).
I am well aware that I lack the data to directly estimate the impact of, say, Hospital Type 2 or Hospital 12, but I am interested in capturing, even roughly, the between-hospital variation. It is almost 100% certain that such variation exists.
One way would be to use the national data to set informative priors on the hospital-level effects, but I’m not sure that this makes sense, nor do I see how to do it. I’ve looked for examples of similar problems, and failed to find them. Another would be to estimate some kind of latent trait model, but I’m not sure how to glue that together with my main logistic model in Stan.
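For concreteness, a very crude version of the first idea might look like the sketch below (again Python, stdlib only). It takes per-hospital counts of a well-recorded outcome and makes an unweighted moment estimate of the between-hospital standard deviation on the log-odds scale. The counts are invented, and the step from this number to a prior for the poorly recorded outcomes rests entirely on the untestable correlation assumption above; whether that is defensible is part of what I am asking.

```python
import math
from statistics import mean, variance


def empirical_logit(events, n):
    """Log-odds with a 0.5 continuity correction, and its approximate sampling variance."""
    logit = math.log((events + 0.5) / (n - events + 0.5))
    var = 1.0 / (events + 0.5) + 1.0 / (n - events + 0.5)
    return logit, var


def between_hospital_sd(counts):
    """Crude unweighted moment estimate: observed variance of the hospital
    logits minus the average sampling variance, floored at zero."""
    logits, variances = zip(*(empirical_logit(e, n) for e, n in counts))
    tau2 = max(0.0, variance(logits) - mean(variances))
    return math.sqrt(tau2)


# Made-up (deaths, episodes) pairs, one per hospital.
hospital_counts = [(120, 6000), (95, 5500), (210, 7000), (60, 4800), (150, 5200)]
tau = between_hospital_sd(hospital_counts)
print(f"rough between-hospital SD on the log-odds scale: {tau:.2f}")
```

A number like `tau` could then be used to centre or scale the prior on the hospital-level standard deviation in the main model, rather than to fix individual hospital effects.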
Any suggestions, references or comments would be very welcome.