I’ve been playing around with fitting a model using brms.
But, despite reading the vignette on “family” and “link” here over and over again,
I am still not perfectly comfortable with the concepts, and, more precisely, with the added value of choosing these well.
So, if I have to formulate my doubts into questions, these would be:
How does choosing the family influence the fitted model?
How strong is the influence of the family on the fitted model? Can a poorly chosen family still result in a model which predicts reasonably well? If so, why?
What is the added value of link? How does it contribute to the model being fitted?
I know my questions are probably very basic. I’ve been trying to get comfortable with the purpose of family and link, but I still can’t quite picture how exactly they work, which makes it difficult for me to use them confidently.
I had to read up on this the other day to make sure my wording was correct. I typically start with an lm or glm since it’s a bit easier for me. You have a link function, a distribution, and a linear predictor. Your link is what connects your linear predictor to the expected distribution of the response.
For an out-of-the-box glm, your distribution is Gaussian (normal), your link is the identity, and you have something like y ~ x for the response and predictor.
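A minimal sketch of that default case, in Python rather than the thread’s R (the math is the same): with a Gaussian family and an identity link, the mean of y *is* the linear predictor a + b*x, so maximum likelihood reduces to ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)  # true intercept 1, slope 2

# Gaussian family + identity link: E[y] = a + b*x directly,
# so fitting by maximum likelihood is just least squares.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # roughly [1.0, 2.0]
```

The data and coefficients here are made up for illustration; the point is only that the identity link lets the linear predictor model the response mean with no transformation in between.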
I hope that was what you were asking about.
Thank you for the reply!
This part I kind of understand:
But what still somehow puzzles me is why we need something to connect the predictor and the assumed response distribution.
I think that’s because the response in a glm can come from a non-normal or categorical distribution.
Someone here is a bit better with words than I am about why:
And Chapter 16 in Bayesian Data Analysis by Gelman et al. goes into some nice and readable details about why you use the link function.
I think it’s easiest to see this taking logistic regression as an example (the same applies for Gaussian models, though there it’s less obvious that a link is necessary, because for linear regression the link function is just the identity). For a logistic regression we want to estimate the association between predictors X and the probability of y=1 (“success”). But the “linear predictor” alpha + beta * X doesn’t always give us a valid probability (it could be any real number), so we use the inverse logit function to transform those values to be between 0 and 1.
The terminology is a bit confusing because the function for transforming alpha + X * beta is actually called the inverse link function. In general:
link function: constrained -> unconstrained (e.g. logit function for logistic regression)
inverse link function: unconstrained -> constrained (e.g. inverse logit function for logistic regression).
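The pair above can be sketched in a few lines of Python (illustrative only; in R these are `qlogis` and `plogis`): logit maps a probability onto the whole real line, inverse logit maps any real linear-predictor value back into (0, 1), and the two undo each other.

```python
import math

def logit(p):
    # link function: probability in (0, 1) -> unconstrained real number
    return math.log(p / (1 - p))

def inv_logit(eta):
    # inverse link: any real "linear predictor" value -> probability in (0, 1)
    return 1 / (1 + math.exp(-eta))

# however extreme the linear predictor, the result is a valid probability
for eta in [-30, -3, 0, 3, 30]:
    p = inv_logit(eta)
    assert 0 < p < 1

# the link and inverse link are inverses of each other
print(round(logit(inv_logit(1.7)), 6))  # 1.7
```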
Hope that helps!
It might not be the most serious tutorial out there, but I wrote a four-part introduction to GLMs and Stan on my blog. One key reason for writing it was to work through an example myself after a lot of self-study of Stan and GLMs, so it starts from a much lower level than many other tutorials and tries to bring you along with it.
The language is pastiche HP Lovecraft, so apologies if it makes it a little harder to read!
You can read the series here: