Rstanarm please report bug

Please also provide the following information in addition to your question:

  • Operating System: Windows 2016 Server
  • rstanarm Version: 2.18.2

My mixed-effects linear regression model has an outcome variable that is strictly positive.

I am using the weakly informative t_prior.

I have added family = gaussian(link = "log") to the model arguments.

t_prior <- student_t(df = 7,
                     location = 0,
                     scale = 2.5)

StandardizedOME.1e4.stan_glmer <- stan_glmer(
  I(StandardizedOME + 1) ~ z.Age + z.AnesthesiaDuration + AnesthesiaTechniqueBlock +
    AnesthesiaTechniqueGeneral + AnesthesiaTechniqueNeuraxial + o.ASAClass +
    EmergencyStatusYN + Race + Sex + REMI + NonOpioidAnalgesicsCount +
    o.AIM1Year + CPTBucket + (1 | MPOGInstitutionID),
  data = AIM1Small.1e4.df,
  family = gaussian(link = "log"),
  prior = t_prior,
  prior_intercept = t_prior,
  chains = 4,
  cores = 4
)

Error messages are returned at the start of sampling:

SAMPLING FOR MODEL ‘continuous’ NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[1] is inf, but must be finite! (in ‘model_continuous’ at line 170)

The model estimation terminates after 100 attempts.

The following text is written to console:

some chains had errors; consider specifying chains = 1 to debug
here are whatever error messages were returned
[[1]]
Stan model ‘continuous’ does not contain samples.

[[2]]
Stan model ‘continuous’ does not contain samples.

[[3]]
Stan model ‘continuous’ does not contain samples.

[[4]]
Stan model ‘continuous’ does not contain samples.

Error in check_stanfit(stanfit) :
Invalid stanfit object produced please report bug
Error in dimnamesGets(x, value) :
invalid dimnames given for “dgCMatrix” object

Are there other arguments that must be set to use the log link in the gaussian family?

Nathan

No, but you may need to pass the init_r argument and set it to something less than its default value of 2 in order to get a narrower range of initial values. It is usually better / easier to model the logarithm of an outcome with the default identity link than to model the raw outcome with a log link.
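
To make the two options above concrete, here is a minimal sketch on simulated data. The data frame `d`, its columns `x`, `g`, `y`, and the value `init_r = 0.5` are assumptions chosen for illustration, not taken from the original model; `init_r` is passed through `...` to the underlying rstan sampler.

```r
library(rstanarm)

# Simulated stand-in data (hypothetical; not the poster's data set)
set.seed(1)
n    <- 200
g_id <- sample(1:8, n, replace = TRUE)   # group membership
u    <- rnorm(8, sd = 0.2)               # group-level intercepts
d    <- data.frame(x = rnorm(n), g = factor(g_id))
d$y  <- exp(0.5 + 0.3 * d$x + u[g_id] + rnorm(n, sd = 0.4))  # strictly positive

t_prior <- student_t(df = 7, location = 0, scale = 2.5)

# Option 1: keep the log link, but draw random initial values from
# (-0.5, 0.5) instead of the default (-2, 2) on the unconstrained scale
fit_loglink <- stan_glmer(
  y ~ x + (1 | g),
  data = d,
  family = gaussian(link = "log"),
  prior = t_prior,
  prior_intercept = t_prior,
  init_r = 0.5,   # assumed value; anything below 2 narrows the range
  chains = 2, iter = 500, seed = 1, refresh = 0
)

# Option 2: model log(y) with the default identity link
fit_logy <- stan_glmer(
  log(y) ~ x + (1 | g),
  data = d,
  family = gaussian(),
  prior = t_prior,
  prior_intercept = t_prior,
  chains = 2, iter = 500, seed = 1, refresh = 0
)
```

Note that the two fits are not the same model: the log link puts additive Gaussian noise on the raw outcome, while the second fit puts it on the log scale.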

Thanks for the quick reply.

I did as you advised (log(outcome) with the identity link). The model ran without difficulty.

I ran pp_check. The density of yrep is wider than that of y.

Another question if you have time.

In which direction should I change the t_prior df to make yrep tighter? Bigger or smaller?

Nathan

More degrees of freedom for the Student t distribution imply more concentration around the location of the distribution, but what that implies for the distribution of the predictions is complicated when the predictors are correlated.
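
As a quick illustration of the first half of that statement, the sketch below compares the central 95% interval of a Student t prior with scale 2.5 across a few degrees of freedom. The particular df values are arbitrary choices for illustration, and this only describes the prior on a coefficient; it says nothing by itself about the width of yrep.

```r
# Half-width of the central 95% interval of a Student t prior
# with location 0 and scale 2.5, for several degrees of freedom
prior_scale <- 2.5
dfs <- c(1, 3, 7, 30)   # arbitrary values for illustration

half_width <- prior_scale * qt(0.975, df = dfs)
names(half_width) <- paste0("df=", dfs)
print(round(half_width, 2))

# The normal distribution is the limit as df grows
normal_half_width <- prior_scale * qnorm(0.975)
print(round(normal_half_width, 2))
```

The half-widths shrink as df increases and approach the normal value from above, so a larger df gives lighter tails and more prior mass near the location.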

When using log(outcome) in a stan_lm model, I can’t find an rstanarm function that will accept transformation = exp to explore the model in the original metric.

What am I missing?

If you call posterior_predict() to get the posterior predictions for log(outcome), you can then apply exp() to the result to get posterior predictions of the outcome.
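
A minimal sketch of that back-transformation, on simulated data. The data frame `d` and its columns are hypothetical, and stan_glm is used here instead of stan_lm only to keep the example short (stan_lm needs a prior on R-squared); the posterior_predict step should be the same for either.

```r
library(rstanarm)

# Simulated stand-in data (hypothetical; not the poster's data set)
set.seed(2)
n   <- 150
d   <- data.frame(x = rnorm(n))
d$y <- exp(1 + 0.4 * d$x + rnorm(n, sd = 0.3))   # strictly positive outcome

# Model the log of the outcome with the default identity link
fit <- stan_glm(
  log(y) ~ x,
  data = d,
  family = gaussian(),
  chains = 2, iter = 500, seed = 2, refresh = 0
)

# Posterior predictive draws on the log scale:
# one row per posterior draw, one column per observation
yrep_log <- posterior_predict(fit)

# Back-transform every draw to the original metric
yrep <- exp(yrep_log)

# Example summaries in the original metric
pred_median   <- apply(yrep, 2, median)
pred_interval <- apply(yrep, 2, quantile, probs = c(0.05, 0.95))
```

Transforming the draws first and summarizing afterwards keeps the intervals valid in the original metric. As far as I know, posterior_predict also has a `fun` argument, so `posterior_predict(fit, fun = exp)` should do the same transformation in one call.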

I will follow up with your suggestion.