Trouble restricting samples to positive values

ziqing-ang · June 27, 2020, 3:19am

Hi all. I am trying to fit some data generated using an exponential function with added noises using PyStan. Attached below is the Stan code as well as the dummy data in the Python code I’ve been using for fitting purposes.

exp_func_fit =  '''
data {

int<lower=0> N; 
real<lower=0> t[N];
real<lower=0> y[N];

}

transformed data{}

parameters{
real theta;
real sigma;

}

transformed parameters{
real<lower=0> e_func[N];
for (n in 1:N) {       
    e_func[n] = 1 - exp(-(theta * t[n]));

}

}

model{
theta ~ normal(1, 0.05);
sigma ~ normal(0.2, 0.02);
y ~ normal(e_func, sigma);
}

generated quantities {

// Simulate data from posterior predictive distribution
//real<lower=0> y_tilde[N];
real y_tilde[N];
y_tilde[1] = 0;
for (n in 2:N) {
    y_tilde[n] = normal_rng((1 - exp(-(theta * t[n]))), sigma);
}

}

'''

sm = stan.StanModel(model_code=exp_func_fit)
pickle.dump(sm, open('exp_func_fit.pkl','wb'))

sm = pickle.load(open('exp_func_fit.pkl', 'rb'))
N = 21
t = [(x * 0.5) for x in range(0, N*2)] 
A = np.array([0.        , 0.35644848, 0.58936862, 0.70920895, 0.54697252,
       0.8042943 , 0.80233711, 0.99545308, 1.54664198, 1.42830176,
       0.79185403, 0.85867071, 1.1133023 , 0.87957971, 1.17921721,
       0.86249417, 0.60621984, 0.83061225, 0.98675397, 0.82198135,
       1.18468728, 1.09527567, 0.85214074, 1.11789476, 0.8759278 ,
       1.00827872, 0.68980416, 0.85304309, 1.15551683, 1.33109103,
       1.06333705, 0.98227956, 0.73697556, 1.28526364, 1.21358908,
       1.14064481, 0.98691966, 0.92229333, 0.61856001, 1.25956421,
       0.43418382, 0.9630146 ])
sam_fit_func = dict(N=len(t),t=t,y=A)
fit = sm.sampling(data=sam_fit_func, seed=12345)

The result of the sampling is as shown below, with the scatter plot being the data, the navy plots representing part of the samples and the other lines in the middle as the posterior means of the 4 chains:

I have mainly two issues here:

Suppose I know that the data can’t be less than zero. If I set real<lower=0> y_tilde[N]; in the generated quantities block, I get thrown some error indicating
"RuntimeError: Exception: validate transformed params: y_tilde[i_0__] is -0.778627, but must be greater than or equal to 0 (in ‘unknown file name’ at line 36)
"
and it just get stuck. While it has not caused anything to break but even the restriction in the transformed parameters spewed some exception error in the terminal. Is this just simply a restriction issue or I have to sort of improve the model?
Since I know at point 0 I will get y=0 so I’ve hardcoded it to be zero in the samples as well. Is there a better way to do this since this has messed up the sampling and I get complaints like:
“WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat above 1.1 or below 0.9 indicates that the chains very likely have not mixed”
but I know it’s where t=0 that’s causing the problem and other parameters have converged well.

I thank you for your help in advance!

andre.pfeuffer · June 27, 2020, 11:20am

Sure you are fitting a normal distribution, thus values can get negative.

You may fit a

real exponential_cdf (reals y, reals beta)
The exponential cumulative distribution function of y given inverse scale beta

For sampling you may use the inverse CDF method.

ziqing-ang · June 27, 2020, 1:13pm

Hi, thanks for the reply!

Does this still apply if the data is a time series?

Topic		Replies	Views
Stan doesn't like (near-) zeros? Modeling pystan	10	4135	November 17, 2021
Plotting LLH, exponential modelling and best practices Modeling pystan , fitting-issues	3	1329	June 14, 2020
Same code, different machines –> very different results. Why? Modeling fitting-issues	7	894	August 8, 2019
Using a custom likelihood function in pystan Modeling fitting-issues	6	1546	January 24, 2020
Simple ODE model drawing crazy values Modeling fitting-issues , specification	6	754	May 1, 2019

Trouble restricting samples to positive values

Related topics