I am trying to interpret some coefficients on their natural scale from a
Stan model fit to standardized inputs. The model predicts when plants
are damaged by frost in the spring (0/1 outcome) based on a few climate
predictors, all standardized by subtracting the mean and dividing by 2
SD. I have a mean spring temperature (MST) input and would like to say
‘for every 2 degrees of MST warming, frost damage changes X%’ … my
current calculations for this are:
(invlogit(intercept + (coefMST/(sd(MST)*2))*mean(MST)) -
invlogit(intercept + (coefMST/(sd(MST)*2))*(mean(MST)-2)))*100
I am new to these models, especially using standardized inputs, so am
wondering if my conversion is correct or if I need to also back-convert
the intercept or something else … Thanks!
I just scanned the relevant pages from Data Analysis Using Regression and Multilevel/Hierarchical Models:
inv_logit.pdf (1.1 MB)
This type of interpretation will always be a bit handwavy because of the non-linearity, but I think the divide-by-4 rule is what you’re looking for.
With regard to the centering/scaling, I think the reasoning for doing those things stays the same. You shift everything by the mean so that your intercept corresponds to an average case and your coefficients correspond to deviations from that average (so the interpretation of the coefficients changes with this). You rescale by the standard deviation to put the different predictors on comparable units. The goal is to make it easier to see which predictors seem important in the regression.
Then on top of this you add the non-linearity of the inverse logit.
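To make the divide-by-4 rule concrete, here is a minimal sketch in base R. The intercepts and the coefficient are made-up values for illustration only, and `plogis` is base R's inverse logit (used here in place of `invlogit` so that no extra package is needed). It compares the exact change in probability for a one-unit change in a predictor against the `beta / 4` approximation, once at a baseline probability of 0.5 and once further out in the tail.

```r
# Hypothetical values for illustration only (not estimates from any model here)
beta       <- 0.8   # coefficient on the logit scale
alpha_mid  <- 0     # intercept giving a baseline probability of 0.5
alpha_tail <- -2    # intercept giving a baseline probability well below 0.5

# Exact change in probability for a one-unit increase, centered on x = 0
exact_change <- function(alpha, beta) {
  plogis(alpha + beta * 0.5) - plogis(alpha - beta * 0.5)
}

change_mid  <- exact_change(alpha_mid, beta)   # near the steepest part of the curve
change_tail <- exact_change(alpha_tail, beta)  # flatter part of the curve

# Divide-by-4 rule: an upper bound on the change per unit of the predictor
bound <- beta / 4

print(c(bound = bound, mid = change_mid, tail = change_tail))
```

The bound is nearly attained when the baseline probability is close to 0.5 and overstates the change elsewhere, which is why the rule is best read as a maximum rather than a typical effect.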
Yes, those are the pages I have been following! And I agree it’s all
handwavy, but I do want to both compare across predictors (hence the
standardizing) and give readers a better sense of what the numbers mean.
So, just to clarify: if I use the divide-by-four rule, and the model
with standardized predictors gives an effect of 0.8 for, say, MST,
would it be 0.2*(SD(MST)*2) to get it back to original units?
The shift by the mean won’t change the units of anything, though it does change the interpretation of zero. Ignoring the mean shift, suppose you have data X and scaled data \hat{X} = X / \sigma_X and a regression like:
y \sim N(\text{logit}^{-1}(\beta \hat{X}), \sigma)
The question is what the equivalent \beta would be if you had X instead of \hat{X}.
Substitute in \hat{X} = X / \sigma_X and group the terms
y \sim N(\text{logit}^{-1}(\beta X / \sigma_X), \sigma) \\
y \sim N(\text{logit}^{-1}((\beta / \sigma_X) X), \sigma)
In this formulation of the model (which has the same posterior on \beta as the previous one, because it is the same model), \beta / \sigma_X is the effective coefficient, with units of one over the units of X.
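So you can take this and divide it by four to get an upper bound on the change in probability per unit of input (\frac{\beta}{4 \sigma_X}). Here \sigma_X stands for whatever you divided the input by, so with your 2 SD scaling it is 2 times sd(MST), and you divide by it rather than multiply.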
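As a rough sketch of how this plays out with the 2 SD standardization from the original question, the base R snippet below back-converts a coefficient and an intercept to the original scale and checks that both parameterizations give the same probabilities. Every number in it is a hypothetical placeholder, the variable names are mine, and it assumes the other predictors are held at their means (zero on the standardized scale) and that single point estimates are used rather than posterior draws.

```r
# All values are hypothetical placeholders, not estimates from this thread
mean_mst <- 8.0    # mean of MST on the original scale (degrees)
sd_mst   <- 1.5    # SD of MST on the original scale (degrees)
alpha_z  <- -1.0   # intercept from the model fit to standardized inputs
beta_z   <- 0.8    # MST coefficient from the model fit to standardized inputs

# Inputs were centered and divided by 2 SD, so 2 * sd plays the role of sigma_X
scale_mst <- 2 * sd_mst

# Coefficient and intercept expressed on the original MST scale
beta_raw  <- beta_z / scale_mst
alpha_raw <- alpha_z - beta_z * mean_mst / scale_mst

# Both parameterizations should give identical probabilities at any MST value
mst   <- c(5, 8, 10)
p_z   <- plogis(alpha_z + beta_z * (mst - mean_mst) / scale_mst)
p_raw <- plogis(alpha_raw + beta_raw * mst)
stopifnot(isTRUE(all.equal(p_z, p_raw)))

# Divide-by-4 rule: upper bound on the change in probability per degree
max_change_per_degree <- beta_raw / 4

# Exact change for 2 degrees of warming starting from mean MST, where the
# standardized predictor is 0 and the linear predictor is just alpha_z
change_2deg <- plogis(alpha_z + beta_z * 2 / scale_mst) - plogis(alpha_z)

# Report both in percentage points
print(100 * c(bound_2deg = 2 * max_change_per_degree, exact_2deg = change_2deg))
```

With posterior draws instead of point estimates, the same arithmetic can be applied to each draw and then summarized, which carries the uncertainty through to the reported change.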
Great, that makes a lot of sense. Thanks so much for all of your help!