# Informative prior for highly correlated varying slopes

Hi there,

a bit of background: I am trying to model the effects of seven government interventions on the number of new COVID infections across multiple countries in a semi-mechanistic model. So far, I was able to get a reasonable fit with just population-level effects for the interventions. Now I want to include country-level slopes for the interventions, but this is challenging because the interventions were implemented in close succession in most countries. As a result, the effects of the interventions are negatively correlated ("one effect stealing from the other") at both the population level and the country level. Further, the posterior estimate of the between-country standard deviation differs greatly across interventions. However, I have the strong prior belief that one measure should not vary by multiple orders of magnitude more than another. Nevertheless, one of the goals of the analysis is to find out whether some interventions exhibit more variation between countries than others.

So far, I have modeled the population-level and country-level slopes with a non-centered parametrization, using the Cholesky decomposition to sample the varying slopes from a multivariate normal. Although I put quite informative priors on the between-country standard deviations, `tau ~ student_t(4, 0, 0.04)`, my posterior estimate for two interventions is off the charts: tau = 0.5 [0.3, 0.7], and the two are also highly negatively correlated (rho = -0.7). For all other interventions, the posterior estimate is practically zero. It is very unlikely that just these two interventions should vary that much. Besides, the model doesn't converge well (the intercept and varying intercept cannot be reliably estimated).
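For readers following along, the setup described above can be sketched roughly as follows. This is a minimal sketch, not the poster's actual model: the normal likelihood stands in for the semi-mechanistic infection model, and the names `X`, `country`, `z`, `b`, `L_Omega` as well as every prior except the one on `tau` are assumptions made for illustration. It uses the `array[N] int` syntax of recent Stan versions.

```stan
data {
  int<lower=1> N;                          // observations
  int<lower=1> J;                          // countries
  int<lower=1> K;                          // interventions (seven in the post)
  array[N] int<lower=1, upper=J> country;  // country index per observation
  matrix[N, K] X;                          // intervention indicators
  vector[N] y;                             // placeholder outcome (assumption)
}
parameters {
  real alpha;                              // population-level intercept
  vector[K] beta;                          // population-level slopes
  vector<lower=0>[K] tau;                  // between-country SDs of the slopes
  cholesky_factor_corr[K] L_Omega;         // Cholesky factor of the slope correlations
  matrix[K, J] z;                          // standard-normal innovations
  real<lower=0> sigma;                     // placeholder residual SD
}
transformed parameters {
  // Non-centered: column j of b is multi_normal(0, diag(tau) * Omega * diag(tau))
  matrix[K, J] b = diag_pre_multiply(tau, L_Omega) * z;
}
model {
  tau ~ student_t(4, 0, 0.04);             // prior quoted in the post
  L_Omega ~ lkj_corr_cholesky(2);          // assumed, not from the post
  to_vector(z) ~ std_normal();
  alpha ~ normal(0, 1);                    // assumed
  beta ~ normal(0, 0.5);                   // assumed
  sigma ~ exponential(1);                  // assumed
  {
    // per-observation slope contribution: X[n] dotted with (beta + b[, country[n]])
    vector[N] mu = alpha + X * beta + rows_dot_product(X, (b[ : , country])');
    y ~ normal(mu, sigma);
  }
}
```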

My question: What could be informative priors for tau that incorporate my prior belief that the slope of one intervention does not vary too much more than the slope of another? At the same time, the prior should not be too informative, so that I can still figure out if some interventions vary more than others.

I had two ideas so far. One was to set a prior on the total variation and then set informative priors on the proportion of the total variation that is explained by each intervention. Yet, I couldn't figure out how to do this properly.
Another idea was to do something like this:

```stan
// omega needs <lower=0> in its declaration, which makes normal(0, 0.1) a half-normal
omega ~ normal(0, 0.1);
tau ~ student_t(4, 0, omega);
```

Does it make sense to pursue one of these ideas further, or am I overlooking a better approach? Maybe the data just don't allow varying slopes to be estimated reliably in this case, which is a bit my feeling…

I would be very interested to hear your ideas!


I am on a phone, so can't easily check, but I think the `makemyprior` project/package does that and provides some explanations and maybe even Stan code. Feel free to ask for clarifications if the instructions there are not clear/applicable.
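To make the "total variation plus proportions" idea concrete, here is a hand-written sketch. It is not output from `makemyprior` and it is not a PC prior: a plain half-normal on the total SD and a Dirichlet on the shares are swapped in for illustration, and the names `tau_total`, `phi`, `conc` and `total_sd_scale` are hypothetical.

```stan
data {
  int<lower=1> K;                  // number of interventions
  vector<lower=0>[K] conc;         // Dirichlet concentration (hypothetical input)
  real<lower=0> total_sd_scale;    // scale of the prior on the total SD (hypothetical input)
}
parameters {
  real<lower=0> tau_total;         // square root of the total between-country slope variance
  simplex[K] phi;                  // share of the total variance attributed to each intervention
}
transformed parameters {
  // per-intervention SDs; by construction sum(square(tau)) equals square(tau_total)
  vector<lower=0>[K] tau = tau_total * sqrt(phi);
}
model {
  tau_total ~ normal(0, total_sd_scale);  // half-normal because of the lower bound
  phi ~ dirichlet(conc);                  // equal entries above 1 favour similar shares
  // ... likelihood using tau as the varying-slope SDs goes here ...
}
```

Passing something like `conc = rep_vector(5, K)` would encode the belief that no single intervention dominates the variation while still letting the data move the shares; how strongly to set it is a modelling choice.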

I am a bit less sure about the other approach, which seems to be to add one more layer of hierarchy. I also see nothing wrong with it; I just don't have any experience that would let me judge it better.

In both cases it is possible that your data only inform the sum (or another combination) of the two interventions. In that case the likelihood would be a ridge, and for a fixed prior, more data would mean a wider marginal posterior for the individual parameters (but less uncertainty about the sum), as the ridge of the likelihood (possibly stretching to infinity) gains more weight.
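A small worked example of the ridge, under the simplifying assumption that the two intervention indicators coincide exactly, $x_{1i} = x_{2i} = x_i$ (the symbols $\alpha$, $\beta_1$, $\beta_2$, $s$ are introduced here for illustration only):

$$
\eta_i = \alpha + \beta_1 x_i + \beta_2 x_i = \alpha + (\beta_1 + \beta_2)\, x_i = \alpha + s\, x_i, \qquad s = \beta_1 + \beta_2 .
$$

The likelihood then depends on the slopes only through $s$:

$$
p(y \mid \alpha, \beta_1, \beta_2) = p(y \mid \alpha, s),
$$

so it is constant along every line $\beta_1 + \beta_2 = s$. Only the prior determines where along that line the posterior mass sits. With interventions that are close in time but not identical, the ridge is finite but can still be very long.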

In this case, reparametrizing in terms of the sum and proportion of the effects (or via QR decomposition, as discussed in the User's Guide) could help with computation (but will not change the inferences you can make).
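A minimal sketch of the QR route, following the thin-QR pattern for linear regression and assuming a plain normal likelihood as a stand-in for the real model. The names `Q_ast`, `R_ast` and `theta` and the priors are illustrative assumptions.

```stan
data {
  int<lower=1> N;            // observations
  int<lower=1> K;            // interventions
  matrix[N, K] x;            // intervention indicators
  vector[N] y;               // placeholder outcome (assumption)
}
transformed data {
  // thin QR decomposition of the design matrix, rescaled for numerical convenience
  matrix[N, K] Q_ast = qr_thin_Q(x) * sqrt(N - 1);
  matrix[K, K] R_ast = qr_thin_R(x) / sqrt(N - 1);
  matrix[K, K] R_ast_inverse = inverse(R_ast);
}
parameters {
  real alpha;
  vector[K] theta;           // coefficients on the orthogonalized predictors
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 1);      // assumed
  sigma ~ exponential(1);    // assumed
  y ~ normal(Q_ast * theta + alpha, sigma);
}
generated quantities {
  vector[K] beta = R_ast_inverse * theta;  // slopes on the original intervention scale
}
```

The sampler works on `theta`, whose components are much less correlated than those of `beta`. Note that a prior placed on `theta` implies a different prior on `beta`, so informative priors on the original slopes need extra care with this parametrization.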

Does that make sense?

Best of luck with your model!


Thank you so much Martin, the package is an excellent suggestion and provides exactly what I was looking for. I had heard of the PC priors that are used there before, but didn't follow up on them because at the time they weren't really implemented in Stan. Now they seem very suitable for my task, and I am looking forward to using the `makemyprior` package to set up the priors.

With regard to your further explanations, they make complete sense. I am pretty sure that the data can only inform the sum of the interventions in my case. But let's see what I can do with reasonable priors :)


It sounds to me like there might be structural reasons for your problem that aren't directly attributable to your prior. Edit: and I think it would be wise to address these issues at the root rather than (only) via the prior.

One possibility is that the two parameters aren't well identified by virtue of the interventions being too tightly correlated in the data.

A second possibility is that the effects of the interventions are not additive. So if `intervention_1` is already in place, then `intervention_2` has a small effect on the linear predictor, and vice versa. This could potentially be modeled directly by including a multiplicative interaction (expected to be negative) between `intervention_1` and `intervention_2`. Thus, `intervention_2` has a smaller-than-otherwise effect in countries where `intervention_1` got implemented first, and vice versa.
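A minimal sketch of such an interaction term, assuming 0/1 intervention indicators and a normal likelihood as a placeholder for the actual model. The names `x1`, `x2`, `x12`, `beta12` and all priors are hypothetical.

```stan
data {
  int<lower=1> N;
  vector<lower=0, upper=1>[N] x1;   // 1 while intervention_1 is in place
  vector<lower=0, upper=1>[N] x2;   // 1 while intervention_2 is in place
  vector[N] y;                      // placeholder outcome (assumption)
}
transformed data {
  vector[N] x12 = x1 .* x2;         // 1 only while both interventions are in place
}
parameters {
  real alpha;
  real beta1;                       // effect of intervention_1 alone
  real beta2;                       // effect of intervention_2 alone
  real beta12;                      // departure from additivity when both are active
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 1);             // assumed
  beta1 ~ normal(0, 0.5);           // assumed
  beta2 ~ normal(0, 0.5);           // assumed
  beta12 ~ normal(0, 0.5);          // assumed; shift the location to encode an expected sign
  sigma ~ exponential(1);           // assumed
  y ~ normal(alpha + beta1 * x1 + beta2 * x2 + beta12 * x12, sigma);
}
```

With both interventions active, the combined effect is `beta1 + beta2 + beta12`, so `beta12` directly measures how far the pair departs from simply adding up.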
