Negative numbers cause Bayesian model to fail

Hi everyone!

I am running a Bayesian hierarchical model, and one of my explanatory variables contains several negative values. When I run my rstan model, I receive an error saying that NaNs were produced, and I know rstan does not accept NA values in the data.

Is there any way to still run my model with negative values, or is there a way to transform my variable without losing actual data?

Thank you to all who reply!

It would help to know what your model is. For example, if a model raises a number to a non-integral power, a negative value would cause a problem. There are of course other ways that a negative number could be a problem.
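For instance, in R (a minimal sketch; the value -2 is just an example):

(-2)^0.5   # NaN: negative base with a non-integral exponent
log(-2)    # NaN, with a "NaNs produced" warning
sqrt(-2)   # NaN, with a warning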

Hi jjramsey,

Thank you for the information on C++. My model uses a log transformation to make the distribution approximately normal. As in the examples at http://www.cplusplus.com/reference/cmath/log/, some of my x values are negative, which gives a similar error:

input.to.stan <- stan.input()
Warning message:
In log(in.data[, y.col]) : NaNs produced

Because of this, my fit1 and fit2 do not run due to this error:

fit1 <- stan(model_code=input.to.stan$model, data=input.to.stan$data,
    init=input.to.stan$inits, chain=0)
Error in FUN(X[[i]], …) : Stan does not support NA (in y) in data
failed to preprocess the data; sampling not done

Given this problem, is there another way to run my model while keeping the negative values?
I hope this helps!

In that case, you can’t use a log transform of your data (for a log transform to work, the data must be strictly positive).

There are other options, like log(const + data), where const is big enough to make the sum positive, but that choice of constant is hard to justify.
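For what it’s worth, a minimal sketch of that shifted-log idea in R (the variable x and the offset are hypothetical; the constant is chosen only so that every shifted value is positive):

x <- c(-3.2, -0.5, 1.7, 4.1)    # example data containing negatives
const <- abs(min(x)) + 1        # offset large enough that x + const > 0
y <- log(x + const)             # shifted-log transform; no NaNs produced
# Any results on the log scale now refer to (x + const) rather than x itself,
# which is part of why this transform is hard to justify.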

It might be easier to model the data on its natural scale and work out a sensible (non-normal) likelihood.
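As a rough illustration only (a deliberately simplified, non-hierarchical regression on the natural scale; the data names and the Student-t likelihood are assumptions, not your actual model):

library(rstan)

model_code <- "
data {
  int<lower=1> N;
  vector[N] x;        // explanatory variable, may be negative
  vector[N] y;        // response on its natural scale
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
  real<lower=1> nu;   // degrees of freedom for a heavier-tailed likelihood
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ normal(0, 5);
  nu ~ gamma(2, 0.1);
  y ~ student_t(nu, alpha + beta * x, sigma);  // no log transform needed
}
"

stan_data <- list(N = length(x), x = x, y = y)   # x and y assumed to exist
fit <- stan(model_code = model_code, data = stan_data, chains = 4)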

Hi Daniel_Simpson,

Thank you for your reply. I removed the log transformation, modeled my data on its natural scale, and it worked well.
As a follow-up, what are the advantages of modelling with a log transformation versus on the natural scale? Other than the log transformation making my data more normally distributed, and the natural scale allowing negative values, are there any other advantages?

Thank you again!

You really want to think about the generative process and the noise scale. If you have values that need to be positive for some reason, then you can log transform to an unconstrained scale, and you will often find that the log-transformed values are approximately normal even though the original values weren’t.

The main difference is that the lognormal has multiplicative error (error is proportional to the value), whereas the normal has additive error that’s independent of the current value.
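A small simulation sketch of that distinction in R (the parameter values are arbitrary and only for illustration):

set.seed(1)
mu <- 1:100                                        # underlying signal
y_additive <- mu + rnorm(100, 0, 5)                # normal: noise has the same scale everywhere
y_multiplicative <- mu * exp(rnorm(100, 0, 0.2))   # lognormal: noise scales with the value
# Equivalently, log(y_multiplicative) = log(mu) + normal noise,
# so the error is additive on the log scale but proportional on the natural scale.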