Trying to specify intercept only model

Longshot408 · December 19, 2019, 8:37pm

Hello all,
Very simple question on creating an intercept-only model: How do I do it? I would have thought my code below would work, but it is returning an error.

    > InterceptModel <- stan_glm(
+   Accept_Reject ~ 1,
+   data = pubdata, 
+   family = binomial(link = "logit"), 
+   prior_intercept = NULL,
+   cores=3,
+   QR = FALSE,
+   chains = 3, iter = 50000,
+   diagnostic_file=file.path(tempdir(), "df.csv"))

bridge4<-bridge_sampler(InterceptModel)

Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input

describe_posterior(InterceptModel)

Error in dat[, -1, drop = FALSE] : incorrect number of dimensions

bgoodri · December 20, 2019, 2:11am

I think stan_glm is working but bridge_sampler is returning an error, but I’m not sure why. Does file.path(tempdir(), "df.csv") exist and have values?

Longshot408 · December 20, 2019, 2:20am

Not sure to be honest, as I don’t even understand that line of code. I got it from your suggestion actually, in this post I made a while ago. I’ve been sticking it in every model I’ve been running so that I can compare all the models with Bayes Factors later with performance::compare_performance() and bayestestR::bayesfactor_models()

So far I’ve copied and pasted that one same line into three other models in my RMarkdown file and all of them work fine; its just this 4th, intercept-only model that breaks.

bgoodri · December 20, 2019, 2:30am

I can’t replicate this error with

library(rstanarm)
post <- stan_glm(cbind(incidence, size - incidence) ~ 1, data = lme4::cbpp, 
                 family = binomial, refresh = 0, 
                 diagnostic_file=file.path(tempdir(), "df.csv"))
library(bridgesampler)
bridge_post <-bridge_sampler(post)

Can you?

Longshot408 · December 20, 2019, 2:42am

Just ran that and no, I couldn’t repro it either.
After running your “post” model and my intercept model though, there does appear to be some differences in the model summaries, before you even look at the bridge sampler…

my model:
summary(InterceptModel)

Model Info:
 function:     stan_glm
 family:       binomial [logit]
 formula:      Accept_Reject ~ 1
 algorithm:    sampling
 sample:       75000 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 610
 predictors:   1

Estimates:
          mean   sd   10%   50%   90%
(Intercept) 0.0    0.0  0.0   0.0   0.0  

Fit Diagnostics:
       mean   sd   10%   50%   90%
mean_PPD 0.0    0.0  0.0   0.0   0.0  

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

MCMC diagnostics
          mcse Rhat n_eff
(Intercept)   NaN  NaN  NaN  
mean_PPD      NaN  NaN  NaN  
log-posterior NaN  NaN  NaN

vs. your model…
summary(post)

Model Info:
 function:     stan_glm
 family:       binomial [logit]
 formula:      cbind(incidence, size - incidence) ~ 1
 algorithm:    sampling
 sample:       4000 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 56
 predictors:   1

Estimates:
              mean   sd   10%   50%   90%
(Intercept) -2.0    0.1 -2.2  -2.0  -1.9 

Fit Diagnostics:
           mean   sd   10%   50%   90%
mean_PPD 1.8    0.2  1.5   1.8   2.1  

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

MCMC diagnostics
              mcse Rhat n_eff
(Intercept)   0.0  1.0  1537 
mean_PPD      0.0  1.0  2285 
log-posterior 0.0  1.0   975

Not sure what all those NaN’s mean in mine. Might they have something to do with it? Did mine not converge or something?

bgoodri · December 20, 2019, 3:45am

Yours is definitely messed up. What version of rstanarm are you using? Does

post <- stan_glm(switch ~ 1, data = wells, family = binomial, refresh = 0, 
                 diagnostic_file=file.path(tempdir(), "df.csv"))

work?

Longshot408 · December 20, 2019, 9:14pm

Yep, that code works too:

> post <- stan_glm(switch ~ 1, data = wells, family = binomial, refresh = 0, 
+                  diagnostic_file=file.path(tempdir(), "df.csv"))
> summary(post)

Model Info:
 function:     stan_glm
 family:       binomial [logit]
 formula:      switch ~ 1
 algorithm:    sampling
 sample:       4000 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 3020
 predictors:   1

Estimates:
              mean   sd    10%   50%   90%
(Intercept)  -0.2    9.5 -12.5  -0.2  12.0

Fit Diagnostics:
           mean   sd   10%   50%   90%
mean_PPD 0.5    0.5  0.0   0.5   1.0  

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

MCMC diagnostics
              mcse Rhat n_eff
(Intercept)   0.2  1.0  1605 
mean_PPD      0.0  1.0  2285 
log-posterior 0.0  1.0  1285

And btw I’m running rstanarm ver 2.19.2

Longshot408 · December 20, 2019, 9:37pm

Running diagnostics on this intercept model produced some interesting results.

AutoCorr_interceptModel=plot(InterceptModel, "acf", pars = "(Intercept)")
> plot(AutoCorr_interceptModel)
Warning messages:
1: Removed 63 rows containing missing values (geom_segment). 
2: Removed 63 rows containing missing values (geom_path). 

> heidel.diag(InterceptModel)
                                          
            Stationarity start     p-value
            test         iteration        
(Intercept) failed       NA        NA     
                                    
            Halfwidth Mean Halfwidth
            test                    
(Intercept) <NA>      NA   NA 

> trace_int <- plot(InterceptModel, "trace", pars = "(Intercept)")
> plot(trace_int)

(upload://8jFutmXqcnnisS1QIqtC9QtZXdw.png)

Ara_Winter · December 20, 2019, 10:35pm

So running with fake data

InterceptModel <- stan_glm(y ~ 1, data = df, family = binomial(link = "logit"), 
                           prior_intercept = NULL,
                           cores=6,
                           QR = FALSE,
                           chains = 4, iter = 4000, 
                           diagnostic_file=file.path(tempdir(), "df.csv"))

> bridge4<-bridge_sampler(InterceptModel)
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4

> bridge4
Bridge sampling estimate of the log marginal likelihood: -67.96922
Estimate obtained in 4 iteration(s) via method "normal".

Not sure if that is going to help.

Longshot408 · December 21, 2019, 2:45pm

Nope; copy-paste of this code and replacing “y” with my DV and “df” with my data repro’d the same errors as above

Ara_Winter · December 21, 2019, 6:14pm

Hmm ok. Let me get back to fiddling with it then.

Ara_Winter · December 21, 2019, 6:24pm

Oh which bridge_sampler are you using? There is one in the brms package and another in the bridgesampling package.

Longshot408 · December 21, 2019, 6:38pm

I’m using the latter

Ara_Winter · December 21, 2019, 7:06pm

Ok! I am using the brms one. That might be part of the difference.

MegBallard · December 22, 2019, 8:10am

Before going deeper, did you run it with built in data, just as was posted.

I think the problem may be that you are running an intercept only model with no prior on the intercept (I made some random binomial data and did an intercept only model with prior=NULL and got the same strange output you have.

Topic		Replies	Views
Trying to get a Bayes Factor from an rstanarm glm Modeling specification , bayes-factor	7	2065	November 19, 2019
Bridgesampling with rstanarm? rstanarm marginal-likelihood , bayes-factor	12	3383	March 29, 2018
Intercept only model rstanarm rstan	7	1657	August 26, 2020
Issue with bridgesampling::bridge_sampler() Modeling rstan , loo	1	43	April 3, 2025
Compute Bayes factor from cmdstan output using bridgesampling package (error message) Modeling cmdstan , techniques , bayes-factor	4	1358	April 16, 2021

Trying to specify intercept only model

Related topics