A bit of background: I am trying to model the effects of seven government interventions on the number of new COVID infections across multiple countries in a semi-mechanistic model. So far, I have been able to get a reasonable fit with just population-level effects for the interventions. Now I want to include country-level slopes for the interventions, but this is challenging because the interventions were implemented in close succession in most countries. As a result, the effects of the interventions are negatively correlated (“one effect stealing from the other”) at both the population level and the country level. Further, the posterior estimate of the between-country standard deviation differs greatly across interventions. However, my strong prior belief is that one measure should not vary by multiple orders of magnitude more than another. Nevertheless, one of the goals of the analysis is to find out whether some interventions exhibit more variation between countries than others.
So far, I have modeled the population-level and country-level slopes with a non-centered parametrization, using the Cholesky decomposition to sample the varying slopes from a multivariate normal. Although I put quite informative priors on the between-country standard deviations,
tau ~ student_t(4, 0, 0.04), my posterior estimates for two interventions are off the charts: tau = 0.5 [0.3, 0.7], and the two are also highly negatively correlated (rho = -0.7). For all other interventions, the posterior estimate is practically zero. It is very unlikely that just these two interventions should vary that much. Besides, the model doesn’t converge well (the intercept and varying intercept cannot be reliably estimated).
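To make the setup concrete, a minimal Stan sketch of such a non-centered parametrization might look like the following. It is an illustration, not the actual model: the Poisson log-linear likelihood stands in for the semi-mechanistic infection model, the varying intercept is left out, and the names (X, country, b, z, L_Omega) as well as the priors on alpha, beta, and L_Omega are assumptions. Only the student_t(4, 0, 0.04) prior on tau is taken from the description above.

```stan
data {
  int<lower=1> N;                          // observations
  int<lower=1> J;                          // countries
  int<lower=1> K;                          // interventions (seven in this thread)
  array[N] int<lower=1, upper=J> country;  // country index per observation
  matrix[N, K] X;                          // intervention indicators
  array[N] int<lower=0> y;                 // new infections (placeholder outcome)
}
parameters {
  real alpha;                              // population-level intercept
  vector[K] beta;                          // population-level slopes
  vector<lower=0>[K] tau;                  // between-country SDs of the slopes
  cholesky_factor_corr[K] L_Omega;         // Cholesky factor of the slope correlations
  matrix[K, J] z;                          // standard-normal innovations
}
transformed parameters {
  // Non-centered: country-level deviations are diag(tau) * L_Omega * z
  matrix[J, K] b = (diag_pre_multiply(tau, L_Omega) * z)';
}
model {
  alpha ~ normal(0, 5);                    // assumed prior
  beta ~ normal(0, 1);                     // assumed prior
  tau ~ student_t(4, 0, 0.04);             // prior quoted in the post
  L_Omega ~ lkj_corr_cholesky(2);          // assumed prior
  to_vector(z) ~ std_normal();
  {
    // Assumed stand-in likelihood: log-linear Poisson
    vector[N] eta = alpha + X * beta + rows_dot_product(X, b[country]);
    y ~ poisson_log(eta);
  }
}
```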
My question: What could be informative priors for tau that incorporate my prior belief that the slope of one intervention does not vary too much more than the slope of another? At the same time, the prior should not be too informative, so that I can still figure out if some interventions vary more than others.
I had two ideas so far. One was to set a prior on the total variation and then set informative priors on the proportion of the total variation that is explained by each intervention. Yet, I couldn’t figure out how to do this properly.
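One way this first idea could be written down in Stan is sketched below. It assumes a single total standard deviation sigma_total and a simplex phi holding each intervention's share of the total variance. The names, the scale 0.1, and the Dirichlet concentration a are illustrative assumptions, and this is not necessarily how any existing package sets it up.

```stan
data {
  int<lower=1> K;       // number of interventions
  real<lower=0> a;      // assumed Dirichlet concentration, e.g. 1 (flat) or larger
}
parameters {
  real<lower=0> sigma_total;  // total between-country SD across all K slopes
  simplex[K] phi;             // share of the total variance per intervention
}
transformed parameters {
  // Per-intervention SDs; their squares sum to sigma_total^2
  vector<lower=0>[K] tau = sigma_total * sqrt(phi);
}
model {
  sigma_total ~ student_t(4, 0, 0.1);  // assumed scale (half-t via the lower bound)
  phi ~ dirichlet(rep_vector(a, K));   // larger a pulls the shares toward 1/K
}
```

With a = 1 the shares are uniform over the simplex, so some interventions can still take most of the variation. A larger a keeps the tau values closer together, which is one way to encode the belief that no intervention varies by orders of magnitude more than another.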
Another idea was to do something like this
tau ~ student_t(4, 0, omega);
omega ~ normal(0, 0.1);  // half-normal, with omega declared as real<lower=0>
Does it make sense to pursue one of these ideas further, or am I overlooking a better approach? Maybe the data just don’t allow varying slopes to be estimated reliably in this case, which is a bit my feeling…
I would be very interested to hear your ideas!
I am on a phone, so I can’t easily check, but I think the makemyprior project/package does what you describe in your first idea and provides some explanations and maybe even Stan code. Feel free to ask for clarifications if the instructions there are not clear/applicable.
I am a bit less sure about the other approach, which seems to add one more layer of hierarchy. I see nothing wrong with it; I just don’t have the experience to judge it better.
In both cases it is possible that your data only inform the sum (or some other combination) of the two interventions. The likelihood would then be a ridge, and for a fixed prior, more data would mean a wider marginal posterior for the individual parameters (but less uncertainty about the sum), as the ridge of the likelihood (possibly stretching to infinity) gains more weight.
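To see where such a ridge comes from, consider the extreme case, assumed here purely for illustration, in which two interventions switch on at the same time everywhere, so that their indicators coincide. The symbols below (eta for the linear predictor, x for the indicators, beta for the effects) are introduced for this sketch only.

```latex
% Assumption: x_{1,t} = x_{2,t} = x_t for all t
\begin{aligned}
\eta_t &= \alpha + \beta_1 x_{1,t} + \beta_2 x_{2,t} \\
       &= \alpha + (\beta_1 + \beta_2)\, x_t
\end{aligned}
% The likelihood depends on (\beta_1, \beta_2) only through \beta_1 + \beta_2,
% so it is constant along every line \beta_1 + \beta_2 = c.
```

When the indicators are not identical but nearly so, the likelihood is no longer exactly flat along that line, only very weakly curved, and the prior does most of the work in separating the two effects.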
In this case reparametrizing in terms of the sum and proportion of the effects (or via QR decomposition as discussed in the user’s guide) could help with computation (but will not change the inferences you can make).
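As a rough sketch of the sum-and-proportion idea for two interventions, assuming both effects share the same sign and again using a Poisson log-linear likelihood as a stand-in, one could write something like the following. The names beta_sum and share and all priors are assumptions made for illustration.

```stan
data {
  int<lower=1> N;
  vector[N] x1;                  // indicator for intervention 1
  vector[N] x2;                  // indicator for intervention 2
  array[N] int<lower=0> y;       // placeholder outcome
}
parameters {
  real alpha;
  real beta_sum;                 // beta1 + beta2, the well-informed quantity
  real<lower=0, upper=1> share;  // fraction of the sum attributed to intervention 1
}
transformed parameters {
  real beta1 = share * beta_sum;
  real beta2 = (1 - share) * beta_sum;
}
model {
  alpha ~ normal(0, 5);          // assumed prior
  beta_sum ~ normal(0, 1);       // assumed prior
  share ~ beta(2, 2);            // assumed prior, mildly favoring an even split
  y ~ poisson_log(alpha + beta1 * x1 + beta2 * x2);
}
```

Note that placing priors directly on beta_sum and share, as done here, implies a different joint prior on beta1 and beta2 than independent priors on the two effects would, so this version is not a pure reparametrization of the original model.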
Does that make sense?
Best of luck with your model!
Thank you so much Martin, the package is an excellent suggestion and provides exactly what I was looking for. I had heard of the PC priors that are used there before, but didn’t follow up on them because at the time they weren’t really implemented in Stan. Now they seem very suitable for my task, and I am looking forward to using the makemyprior package to set up the priors.
With regard to your further explanations, it makes complete sense. I am pretty sure that the data can only inform the sum of interventions in my case. But let’s see what I can do with reasonable priors :)
Thanks again for your help!
It sounds to me like there might be structural reasons for your problem that aren’t directly attributable to your prior. Edit: and I think it would be wise to address these issues at the root rather than (only) via the prior.
One possibility is that the two parameters aren’t well identified by virtue of the interventions being too tightly correlated in the data.
A second possibility is that the effects of the interventions are not additive. So if intervention_1 is already in place, then intervention_2 has a small effect on the linear predictor, and vice versa. This might potentially be modeled directly by including a multiplicative interaction (expected to be negative) between the two interventions, so that intervention_2 has a smaller-than-otherwise effect in countries where intervention_1 got implemented first, and vice versa.
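A minimal sketch of such an interaction for a single pair of interventions could look like this. It assumes 0/1 indicators and a Poisson log-linear likelihood as a placeholder; the names x1, x2, gamma and all priors are illustrative. The sign of the interaction depends on how the effects are coded, so the sketch puts no sign constraint on gamma.

```stan
data {
  int<lower=1> N;
  vector<lower=0, upper=1>[N] x1;  // 1 once intervention_1 is in place
  vector<lower=0, upper=1>[N] x2;  // 1 once intervention_2 is in place
  array[N] int<lower=0> y;         // placeholder outcome
}
parameters {
  real alpha;
  real beta1;   // effect of intervention_1 on its own
  real beta2;   // effect of intervention_2 on its own
  real gamma;   // extra term that applies only when both are in place
}
model {
  alpha ~ normal(0, 5);    // assumed prior
  beta1 ~ normal(0, 1);    // assumed prior
  beta2 ~ normal(0, 1);    // assumed prior
  gamma ~ normal(0, 0.5);  // assumed prior, shrinking toward no interaction
  y ~ poisson_log(alpha + beta1 * x1 + beta2 * x2 + gamma * (x1 .* x2));
}
```

If the two indicators always switch on together, gamma is no better identified than beta1 and beta2, so this only helps where the order or timing of the interventions differs across countries.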
Thanks for your ideas!
The first possibility is very likely, and that is what I want to find out, i.e. whether it is possible to learn something about country-specific effects of the interventions without overfitting the model. For some countries it will be impossible, because the interventions were introduced on the very same day.
The effects of the interventions are already multiplicative as the effect is additive on the log scale (log link).
I have been thinking about interactions between interventions as well. It would be interesting to know if an intervention is more effective given another is already in place or not. But I find it difficult to model interactions in this case, as I have seven of them, and thus many possible and plausible combinations of interactions. Any thoughts on this?