Rstanarm please report bug


#1

Please also provide the following information in addition to your question:

  • Operating System: Windows 2016 Server
  • rstanarm Version: 2.18.2

My mixed effect linear regression model has an outcome variable that is strictly positive.

I am using the weakly informative t_prior.

I have added family = gaussian(link = "log") to the model arguments.

t_prior <- student_t(df = 7,
                     location = 0,
                     scale = 2.5)

StandardizedOME.1e4.stan_glmer <- stan_glmer(
  I(StandardizedOME + 1) ~ z.Age + z.AnesthesiaDuration + AnesthesiaTechniqueBlock +
    AnesthesiaTechniqueGeneral + AnesthesiaTechniqueNeuraxial + o.ASAClass +
    EmergencyStatusYN + Race + Sex + REMI + NonOpioidAnalgesicsCount +
    o.AIM1Year + CPTBucket + (1 | MPOGInstitutionID),
  data = AIM1Small.1e4.df,
  family = gaussian(link = "log"),
  prior = t_prior,
  prior_intercept = t_prior,
  chains = 4,
  cores = 4
)

Error messages are returned at the start of sampling:

SAMPLING FOR MODEL ‘continuous’ NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[1] is inf, but must be finite! (in ‘model_continuous’ at line 170)

The model estimation terminates after 100 attempts.

The following text is written to console:

some chains had errors; consider specifying chains = 1 to debug
here are whatever error messages were returned
[[1]]
Stan model ‘continuous’ does not contain samples.

[[2]]
Stan model ‘continuous’ does not contain samples.

[[3]]
Stan model ‘continuous’ does not contain samples.

[[4]]
Stan model ‘continuous’ does not contain samples.

Error in check_stanfit(stanfit) :
Invalid stanfit object produced please report bug
Error in dimnamesGets(x, value) :
invalid dimnames given for “dgCMatrix” object

Are there other arguments that must be set to use the log link in the gaussian family?

Nathan


#2

No, but you may need to pass the init_r argument and set it to something less than its default value of 2 in order to get a narrower range of initial values. It is usually better / easier to model the logarithm of an outcome with the default identity link than to model the raw outcome with a log link.
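
A minimal sketch of both options, on simulated data rather than the data from the original post. The data frame `sim_df`, its columns, and the value `init_r = 0.5` are illustrative assumptions, not recommendations. With a log link the mean is exp(linear predictor), so wide random initial values can presumably overflow to inf, which matches the error message above.

```r
library(rstanarm)

# Simulated stand-in for the poster's data (hypothetical names and values).
set.seed(1)
n <- 200
sim_df <- data.frame(
  x     = rnorm(n),
  group = factor(sample(letters[1:8], n, replace = TRUE))
)
group_eff <- rnorm(8, sd = 0.3)
sim_df$y <- exp(1 + 0.5 * sim_df$x + group_eff[as.integer(sim_df$group)] +
                  rnorm(n, sd = 0.2))

t_prior <- student_t(df = 7, location = 0, scale = 2.5)

# Option 1: keep the log link, but draw initial values from (-0.5, 0.5)
# on the unconstrained scale instead of the default (-2, 2).
fit_loglink <- stan_glmer(
  y ~ x + (1 | group),
  data = sim_df,
  family = gaussian(link = "log"),
  prior = t_prior,
  prior_intercept = t_prior,
  init_r = 0.5,  # passed through to rstan; 0.5 is an arbitrary smaller value
  chains = 2, iter = 1000, seed = 1, refresh = 0
)

# Option 2: model log(y) with the default identity link.
fit_logy <- stan_glmer(
  log(y) ~ x + (1 | group),
  data = sim_df,
  family = gaussian(),
  prior = t_prior,
  prior_intercept = t_prior,
  chains = 2, iter = 1000, seed = 1, refresh = 0
)
```

The two models are not equivalent: option 1 assumes additive Gaussian noise around exp(linear predictor), while option 2 assumes Gaussian noise on the log scale.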


#3

Thanks for the quick reply.

I did as you advised (identity link with log(outcome)). The model ran without difficulty.

I ran pp_check. The density of yrep is wider than that of y.

Another question if you have time.

In which direction should I change the t_prior df to make yrep tighter? Bigger or smaller?

Nathan


#4

A larger degrees of freedom for the Student t distribution implies more concentration around the mean of the distribution, but what that implies for the distribution of the predictions is complicated when the predictors are correlated.
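
To make the first half of that concrete, here is a small base R sketch comparing how far the central 95% of a Student t prior with scale 2.5 extends for several degrees of freedom. The list of df values is an arbitrary choice for illustration; only df = 7 and scale = 2.5 come from the original post. It describes the prior on a single coefficient only and says nothing directly about the width of yrep.

```r
# Half-width of the central 95% interval of a Student t prior
# with location 0 and scale 2.5, for several degrees of freedom.
prior_scale <- 2.5
dfs <- c(1, 3, 7, 30, 100)
half_width <- prior_scale * qt(0.975, df = dfs)
print(data.frame(df = dfs, half_width = round(half_width, 2)))

# For comparison: a normal prior with the same scale (the limit as df grows).
normal_half_width <- prior_scale * qnorm(0.975)
print(round(normal_half_width, 2))
```

The half-width shrinks toward the normal value as df increases, so the tails get lighter, but the change between df = 7 and larger values is modest.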


#5

When using log(outcome) in a stan_lm model, I can’t find an rstanarm function that will accept transformation = exp to explore the model in the original metric.

What am I missing?


#6

If you do posterior_predict to get the posterior predictions for log(outcome), then you can do exp() of that to get posterior predictions of outcome.
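
A minimal sketch of that approach, assuming a model fit to the built-in mtcars data as a stand-in for the poster's data. It uses stan_glm rather than stan_lm to keep the example short; posterior_predict is called the same way on either fit. The model formula and the summary quantiles are illustrative choices.

```r
library(rstanarm)

# Illustrative model with a log-transformed outcome (mtcars is a stand-in).
fit <- stan_glm(
  log(mpg) ~ wt + hp,
  data = mtcars,
  family = gaussian(),
  chains = 2, iter = 1000, seed = 1, refresh = 0
)

# Matrix of posterior predictions on the log scale:
# one row per posterior draw, one column per observation.
yrep_log <- posterior_predict(fit)

# Back-transform every draw to the original metric.
yrep <- exp(yrep_log)

# Summaries in the original metric: 5%, 50%, and 95% predictive quantiles.
pred_summary <- t(apply(yrep, 2, quantile, probs = c(0.05, 0.5, 0.95)))
head(pred_summary)

# Posterior predictive check in the original metric, using 50 draws.
bayesplot::ppc_dens_overlay(y = mtcars$mpg, yrep = yrep[1:50, ])
```

Transforming draw by draw and summarizing afterwards keeps the intervals valid in the original metric; note that exp() of the mean on the log scale is not the mean of the outcome.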


#7

I will follow up on your suggestion.