Negative numbers cause Bayesian model to fail

Shy · July 21, 2017, 4:21pm

Hi everyone!

I am running a Bayesian Hierarchical model and one of my explanatory variables has multiple negative numbers in the data. When I run my rstan model, I receive an error saying it produced NaNs, and I know rstan does not like NA values in data.

Is there any way to still run my model with negative values, or is there a way to transform my variable without losing actual data?

Thank you to all who reply!

jjramsey · July 21, 2017, 5:54pm

It would help to know what your model is. For example, if a model were to raise a number to a non-integral power, that would cause a problem. There are of course other ways that a negative number would be a problem.

Shy · July 21, 2017, 6:09pm

Hi jjramsey,

Thank you for the information on C++. My model uses a log transformation to make the distribution mostly normal. As shown in the examples at http://www.cplusplus.com/reference/cmath/log/, some of my x values are negative, which gives a similar error:

input.to.stan <- stan.input()
Warning message:
In log(in.data[, y.col]) : NaNs produced

Because of this, my fit1 and fit2 do not run due to this error:

fit1 <- stan(model_code=input.to.stan$model, data=input.to.stan$data,
init=input.to.stan$inits,chain=0)
Error in FUN(X[[i]], ...) : Stan does not support NA (in y) in data
failed to preprocess the data; sampling not done
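
The chain of failures above (the log of a negative value is NaN, and the NaN is then rejected as NA when the data is passed to Stan) can be caught before the model is run. Below is a minimal sketch, written in Python rather than R for illustration. The helper names `find_non_positive` and `log_or_nan` and the sample values are hypothetical and are not taken from the poster's data.

```python
import math

def find_non_positive(values):
    """Return (index, value) pairs for which log() is not finite (value <= 0)."""
    return [(i, v) for i, v in enumerate(values) if v <= 0]

def log_or_nan(v):
    """Mimic R's log(): NaN for a negative input, -inf for zero."""
    if v < 0:
        return math.nan
    if v == 0:
        return -math.inf
    return math.log(v)

y = [2.5, -1.0, 0.0, 4.0]            # hypothetical response values
bad = find_non_positive(y)           # entries that would break a log transform
logged = [log_or_nan(v) for v in y]  # what the transformed column would contain

print("offending entries:", bad)
print("transformed:", logged)
```

Listing the offending entries first shows whether the problem is a handful of data errors or a variable that genuinely takes negative values, which calls for a different model rather than a transform.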

Given this problem, is there any other way to run my model with the same negative values?
I hope this helps!

anon75146577 · July 21, 2017, 6:58pm

In that case, you can’t use a log transform of your data (a requirement for log-transforms to work is that the data is strictly positive).

There are some other options you could use, like log(const + data), where const is big enough to make the sum strictly positive, but that’s hard to justify.
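
A minimal sketch of the shifted-log idea, written in Python for illustration. The function name `shifted_log`, the data, and the value of `const` are all hypothetical. The offset is arbitrary, which is part of why the approach is hard to justify: a different `const` gives a different transformed scale.

```python
import math

def shifted_log(values, const):
    """Compute log(const + v) for each v, refusing offsets that are too small."""
    shifted = [const + v for v in values]
    if min(shifted) <= 0:
        raise ValueError("const is too small: const + data must be > 0")
    return [math.log(s) for s in shifted]

data = [-3.0, -0.5, 0.0, 2.0]  # hypothetical values, some negative
const = 4.0                    # hypothetical offset, chosen so const + min(data) > 0

z = shifted_log(data, const)

# The transform is invertible, so no data is lost: exp(z) - const recovers the input.
back = [math.exp(v) - const for v in z]
print(z)
print(back)
```

Any prediction made on the transformed scale has to be mapped back with `exp(z) - const`, and the fitted parameters depend on the offset that was chosen.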

It might be easier to model the data on its natural scale and work out a sensible (non-normal) likelihood.

Shy · July 21, 2017, 7:48pm

Hi Daniel_Simpson,

Thank you for your reply. I did remove the log transformation and modeled my data based on its natural scale and it worked well.
I would like to follow up, however, on the advantages of modelling with the log transformation versus the natural scale. Other than the log transformation making my data more normally distributed, and the natural scale being able to handle negative numbers, are there any other advantages to either?

Thank you again!

Bob_Carpenter · July 26, 2017, 3:30pm

You really want to think about the generative process and the noise scale. If you have values that need to be positive for some reason, then you can log transform to an unconstrained scale and often find that the log-transformed values are approximately normal whereas the original values weren’t.

The main difference is that the lognormal has multiplicative error (error is proportional to value), whereas the normal has additive error that’s independent of the current value.
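
The difference between the two error structures can be seen in a small simulation. This is a sketch in Python using only the standard library. The means, the value of `sigma`, and the sample size are hypothetical choices made for illustration.

```python
import math
import random
import statistics

random.seed(1)   # fixed seed so the run is repeatable
sigma = 0.1      # hypothetical noise scale
n = 20000        # hypothetical number of draws per setting

results = {}
for mean in (10.0, 1000.0):
    # Additive error: normal noise with the same sd whatever the mean is.
    normal_draws = [random.gauss(mean, sigma) for _ in range(n)]
    # Multiplicative error: normal noise on the log scale, exponentiated.
    lognormal_draws = [random.lognormvariate(math.log(mean), sigma) for _ in range(n)]
    results[mean] = (statistics.stdev(normal_draws), statistics.stdev(lognormal_draws))

for mean, (sd_normal, sd_lognormal) in results.items():
    print(f"mean={mean}: normal sd={sd_normal:.3f}, lognormal sd={sd_lognormal:.3f}")
```

With the same `sigma`, the spread of the normal draws stays about the same at both means, while the spread of the lognormal draws grows roughly in proportion to the mean.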
