Setting priors on specific values of random intercept

jchoiniere · October 14, 2019, 11:04pm

Hello all,

I apologize if this is a very simple question that I’ve failed at finding an answer for, but I’m feeling lost enough that I’m asking for help. I’m also not terribly well-versed in formal statistics, so my terms/language may be a little off.

I’m considering using Stan, specifically through the rstan interface, to replace a mixed model I’d been producing via lme4. Speaking approximately, I’m modeling a binary outcome k with a fixed effect p and a random intercept c, where c is a factor variable with roughly 100 levels. In lme4 terms, it’s

k ~ p + (1|c)

I have multiple years of data, which for domain-specific reasons should each be modeled separately. There’s a lot of year-to-year overlap in the levels of c, but not perfect overlap, and I have reason to believe the distribution of the intercept for a given value of c will be pretty similar year-over-year.

I’m hoping, then, that I can use the modeling results from the first year of data to inform level-specific priors for subsequent years, and use either an uninformative or weakly informative prior when no such level-specific prior exists. I’m at a total loss as to how to actually code that into my .stan file, though. I see through the “similar topics” suggestions that brms can’t do this, or at least couldn’t as of February, but I was hoping that by using rstan directly it would be doable. So…

Is this even possible?
If so, how does one specify such a prior?

stijn · October 15, 2019, 1:57am

If you are OK with later years also informing the intercepts of earlier years, i.e. information also flowing in the reverse temporal order, then the multilevel formulation serves your purpose perfectly. That is, if you have a level c_i with only one observation, the higher-level prior will serve as the weakly informative prior. If you have a level c_j with multiple observations, the same weakly informative prior plus all observations from that level will inform the intercept.

If you specifically want to take into account the order of the years, then you will probably have to model the time effects directly with some kind of time series model.

jchoiniere · October 15, 2019, 2:14am

Unfortunately, the forward order is important – it’s reflective of an individual’s skill at something and an aging effect likely exists.

stijn · October 15, 2019, 2:39am

Sorry to bang on about this. Let’s say c_i is the skill of a given person. If you know that a person was more skillful in the later years, can you then infer that the person was probably also more skillful in the earlier years? If the answer is yes, then you don’t need to overcomplicate the model. You might want to model the aging effects directly, maybe with something like $b \cdot \text{age}_{ij} + c_i$, where you expect b to be negative. You can also model b as a random effect, or the aging effect as non-linear, whichever works best for your theoretical understanding.
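Written out in full for the binary outcome, this might look like the sketch below. The logit link, the global intercept $a$, the coefficient $\beta$ on p, and the normal distribution on $c_i$ are assumptions based on the lme4 formula in the original post; $i$ indexes persons and $j$ indexes years.

$$
\text{logit}\,\Pr(k_{ij} = 1) = a + \beta\, p_{ij} + b \cdot \text{age}_{ij} + c_i, \qquad c_i \sim \text{Normal}(0, \sigma_c)
$$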

jchoiniere · October 15, 2019, 2:59am

Not at all! I’m happy for any assistance anyone offers. I’ll have to consider whether that makes sense conceptually – at first blush I’d agree that it’s possible, but I’m not yet sure whether a later change in skill level should lead to questions about earlier skill or if it’s indicative instead of something that’s changed about said person from one year to the next.

In the event I land on the latter, is the model formulation I’m talking about something that’s possible in the code?

stijn · October 15, 2019, 10:16am

I think what you literally proposed in your original post is not possible. In a Bayesian analysis, if the estimate of one parameter can be influenced by another parameter, the influence goes the other way as well.

If I understand your problem correctly, you can achieve what you want with a time-series-like structure where you basically say that the observation at time t is a function of the observation at time t-1 and some other stuff. The ctsem package and/or the user guide chapter on time series might help. Feel free to make a new topic once you have made the problem more specific. Other people on Discourse will be much better able to help you with this particular modeling problem.

Charles_Driver · October 15, 2019, 12:19pm

In any snapshot of a developmental process, it’s likely that there are a) initial and/or stable differences between individuals, and b) individual differences in change / fluctuations at any particular time. Both are uncertain, and will be best estimated by fitting the entire data using an appropriate model, which is probably some kind of autoregressive model.
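As a rough illustration only, a first-order autoregressive structure on a latent skill could be written as below. The symbols are assumptions for this sketch: $c_{i,t}$ is the skill of person $i$ in year $t$, $\rho$ is the year-to-year persistence, $\sigma_0$ is the spread of initial differences between individuals, and $\sigma_\epsilon$ is the spread of year-to-year fluctuations.

$$
c_{i,1} \sim \text{Normal}(0, \sigma_0), \qquad c_{i,t} = \rho\, c_{i,t-1} + \epsilon_{i,t}, \qquad \epsilon_{i,t} \sim \text{Normal}(0, \sigma_\epsilon)
$$

Here $\sigma_0$ corresponds to point a) and $\sigma_\epsilon$ to point b).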

martinmodrak · October 15, 2019, 12:23pm

While I agree with @stijn that you are very likely to be best served by modelling all the years together as a time series (or otherwise), I just wanted to note that restricting the influence to be one-way is, at least in principle, possible. The procedure is to fit a parametric distribution to posterior samples and then use this parametric distribution as a prior (I’ve done this in my StanCon 2018 submission, section “Transferring the learned expression to other fits”). It is however tricky and I would discourage it unless you have some very good reasons to do it.
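A minimal sketch of that procedure, assuming a normal distribution is an adequate fit and each level is treated independently (which ignores any posterior correlation between levels). The draws, level names, fallback prior, and inflation factor are all made up for illustration; real draws would be extracted from the year-1 fit.

```python
from statistics import mean, stdev

# Hypothetical year-1 posterior draws of the random intercept, keyed by level of c.
year1_draws = {
    "level_a": [0.42, 0.55, 0.38, 0.61, 0.47],
    "level_b": [-1.10, -0.95, -1.30, -1.05, -1.20],
}

# Assumed weakly informative fallback for levels with no year-1 posterior.
DEFAULT_MEAN, DEFAULT_SD = 0.0, 1.5
# Assumed factor that widens the year-1 posterior to allow for year-to-year drift.
INFLATE = 2.0


def build_priors(levels, draws_by_level, inflate=INFLATE):
    """Moment-match a normal to each level's draws; fall back to the default prior."""
    prior_mean, prior_sd = [], []
    for level in levels:
        draws = draws_by_level.get(level)
        if draws is not None and len(draws) >= 2:
            prior_mean.append(mean(draws))
            prior_sd.append(stdev(draws) * inflate)
        else:
            prior_mean.append(DEFAULT_MEAN)
            prior_sd.append(DEFAULT_SD)
    return prior_mean, prior_sd


# Year-2 levels: two seen in year 1, one new.
year2_levels = ["level_a", "level_b", "level_new"]
prior_mean, prior_sd = build_priors(year2_levels, year1_draws)
```

The two resulting vectors could then be passed to the year-2 Stan program as data and used in a statement such as `c ~ normal(prior_mean, prior_sd);` in place of the shared hierarchical prior.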

Best of luck with your modelling!

jchoiniere · October 15, 2019, 11:38pm

This has all been extremely informative and helpful – thanks for helping this novice, everyone!
