Pystan converting reals to ints?

I have a variable that takes on values greater than 5B. I want to transform it inside my Stan model to estimate some parameters (rather than transforming it before including it in `model_data`). But when I fit the model, I get the error:

```
OverflowError: value too large to convert to int
```

I understand that the maximum value for a 32-bit integer is 2^31 − 1 ≈ 2.1B. But my variable is a real, so why is Stan converting it to an int?
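For reference, the 32-bit bound is easy to check in plain Python:

```python
# Largest value a signed 32-bit integer can hold
int32_max = 2**31 - 1
print(int32_max)                   # 2147483647 (~2.1B)
print(5_000_000_000 > int32_max)   # True: 5B does not fit in an int32
```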

Here’s some code to reproduce the problem:

Stan model:

```stan
data {
  int N;
  real y[N];
}

parameters {
  real theta;
  real<lower=0> sigma;
}

model {
  y ~ normal(theta, sigma);
}
```

Fit the model:

```python
import pystan

sm = pystan.StanModel(model_code=model_code, verbose=False)
model_data = {
    'N': N,
    'y': y,
}
sm_fit = sm.sampling(data=model_data, iter=1000, chains=4)
```

Here’s a histogram of my data: *(image not shown)*

If I generate data on a similar scale, I don’t get the same error:

```python
import numpy as np

N = 100
np.random.seed(123)
y = np.random.exponential(scale=1e11, size=N)
```

With this data, running `sm.sampling` works fine.

So it seems the problem is the variance in the data, as opposed to the scale.

To check this, I was able to recreate the error using:

```python
model_data = {
    'N': 10,
    'y': [500, 1000, 50000000000, 10000, 1, 2, 3, 4, 5, 6],
}
```
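As a quick sanity probe, the dtype NumPy infers differs between the two datasets:

```python
import numpy as np

# The hand-written list is all Python ints, so NumPy infers an
# integer dtype (typically int64 on 64-bit platforms)
y_ints = [500, 1000, 50000000000, 10000, 1, 2, 3, 4, 5, 6]
print(np.asarray(y_ints).dtype)

# np.random.exponential returns floats, even at a huge scale
y_floats = np.random.exponential(scale=1e11, size=10)
print(y_floats.dtype)   # float64
```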

So does Pystan convert reals to ints when the data has very large variance?

I’m using PyStan 2.19.1.1 in a Python notebook on Databricks.

In the toy example you provide, does the same error occur if you use `50000000000.0` instead of `50000000000`?
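The trailing `.0` matters because it changes the Python type of the literal:

```python
print(type(50000000000))     # <class 'int'>
print(type(50000000000.0))   # <class 'float'>
```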

Note also that PyStan 2 is no longer being maintained.


Ah, that does solve the error.

And returning to my original data, converting to float64 also solves the problem.
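For anyone hitting the same thing, this is the kind of cast that worked for me (a sketch using the toy data from above):

```python
import numpy as np

y = [500, 1000, 50000000000, 10000, 1, 2, 3, 4, 5, 6]
# Cast to float64 so every element reaches PyStan as a real, not a Python int
y = np.asarray(y, dtype=np.float64)
print(y.dtype)   # float64
```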

My guess is that PyStan 2 assumes a Python `int` is convertible to a fixed-width integer under the hood, but this is not always true: Python `int`s are arbitrary precision by default (what some other languages call "BigInt"s), so they can exceed any fixed-width range.
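A stdlib analogue of the failure, using the `array` module's signed-int typecode (the exact conversion PyStan 2 performs internally may differ):

```python
import array

big = 50000000000
# Python ints are arbitrary precision, so this value is fine as an int...
print(big.bit_length())   # 36 -- wider than a 32-bit C int allows

# ...but stuffing it into a fixed-width signed slot overflows
# ('i' is a C int, 32 bits on most platforms)
try:
    array.array('i', [big])
except OverflowError as e:
    print(e)
```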