Stan_gamm4 output discrepancy


#1

I am having trouble finding the correspondance between gamm4 and stan_gamm4.

For example, if I fit a simple gam model with gamm4, I get the following output:

library(tidyverse)
library(rstanarm)

gamm4(Sepal.Width ~ Sepal.Length + s(Petal.Width), random=~(1|Species), data=iris) %>%
  .$gam %>%
  summary()
Family: gaussian 
Link function: identity 

Formula:
Sepal.Width ~ Sepal.Length + s(Petal.Width)

Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.46144    0.61708   2.368   0.0192 *  
Sepal.Length  0.27311    0.04672   5.846 3.13e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
               edf Ref.df     F  p-value    
s(Petal.Width)   1      1 19.09 2.29e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  -2.64   
lmer.REML = 57.675  Scale est. = 0.07373   n = 150

It shows a “significant” effect (of about .03) for the fixed effect and a “significant” smooth link with low complexity (as shown by the edf column).

When I run it with rstanarm:

fit <- stan_gamm4(Sepal.Width ~ Sepal.Length + s(Petal.Width), random=~(1|Species), data=iris, chains=1)
print(fit)
stan_gamm4
 family:       gaussian [identity]
 formula:      Sepal.Width ~ Sepal.Length + s(Petal.Width)
 observations: 150
------
                 Median MAD_SD
(Intercept)       1.4    0.5  
Sepal.Length      0.3    0.1  
s(Petal.Width).1  0.0    0.1  
s(Petal.Width).2  0.0    0.1  
s(Petal.Width).3  0.0    0.1  
s(Petal.Width).4  0.0    0.1  
s(Petal.Width).5  0.0    0.1  
s(Petal.Width).6  0.0    0.1  
s(Petal.Width).7 -0.1    0.1  
s(Petal.Width).8  0.0    0.1  
s(Petal.Width).9  0.6    0.6  
sigma             0.3    0.0  

Smoothing terms:
                           Median MAD_SD
smooth_sd[s(Petal.Width)1] 0.2    0.1   
smooth_sd[s(Petal.Width)2] 0.5    0.4   

Error terms:
 Groups   Name        Std.Dev.
 Species  (Intercept) 0.76    
 Residual             0.28    
Num. levels: Species 3 

Sample avg. posterior predictive distribution of y:
         Median MAD_SD
mean_PPD 3.1    0.0   

------
For info on the priors used see help('prior_summary.stanreg').

I retrieve in the output the coef fo sepal length (the “fixed” effect) but then, instead of having one smooth term, I have these 9 lines and I’m not sure what they refer to. I haven’t found the info in the docs.

Does anyone have an idea? Thanks a lot!


#2

The output of gamm4::gamm4 and rstanarm::stan_gamm4 no longer correspond that closely, although they are using the same likelihood. What you see from rstanarm::stan_gamm4 corresponds more closely to the gam element of the list produced by gamm4::gamm4. It is best to look at plot_nonlinear or structured manipulations of the data going to posterior_predict in order to analyze the effect of Petal.Width.


#3

I understand. There is no straightforward way to extract an “index of fit” of the smooth term other than a visual inspection of the curve (without playing with posterior_predict which is probably beyond my skills).
Nevertheless, I am curious to know what these 9 components of the smooth term refer to? Thanks for your answer!


#4

There isn’t a frequentist-style single index of fit thing for stan_gamm4. You could use the loo or bridgesampling packages to compare a model with additive terms to a model that is just generalized linear.

The K components (9 is merely the default value) refer to the coefficients in the basis expansion for the nonlinear function being modeled. The details of it very depending on which basis you are using, but the best reference for that is Simon Wood’s textbook.


#5

I am also struggling with the interpretation of stan_gamm4 output. I’m admittedly inexperienced and am used to being able to look at the posterior distributions and if they exclude zero as an indicator of the effect of a model term. In this case aside from just the plot how would you report the results?