Fit of exponential decay in Stan and PyMC3 -- optimisation, performance, priors

I was playing around with fitting radioactive decay curves using PyMC3 and PySTAN/Stan. Below I generate some dummy data so we know the target. The problem is that Stan complains about the model parametrisation while PyMC3 does not, and I wonder if I can do better. I have two questions.

  1. I am a bit confused about `vector` and `real` in Stan. I use vectors here, but I have seen the same model written with reals, and it confuses me. What is the difference? And how would I adjust the code if I had a multilevel model (i.e. more measurements from different sources which should all follow the same equation)?
  2. I used both Normal and Poisson likelihoods. I believe that Poisson makes more sense given the counting nature of radioactive decay, but it gives worse results. I am still quite new to modelling and choosing priors; does anyone have thoughts on what to use and which is better?

Simple decay follows the equation y = a \cdot e^{-k\cdot t}, where k is the kinetic constant, t is time, y is the amount of the substance at time t, and a is the amount at t = 0.
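As a quick sanity check on the numbers used below, k maps directly to a half-life via t_{1/2} = \ln 2 / k, so with k = 1.3 the substance halves roughly every 0.53 time units:

```python
import numpy as np

k1 = 1.3                   # kinetic constant used for the dummy data
t_half = np.log(2) / k1    # half-life: time for y to drop to a/2
print(t_half)              # ~0.533
```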

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc3 as pm

a = 5
k1 = 1.3
x_test = np.linspace(1,10,20)
y_test = a*np.exp(-k1*x_test)

#plt.plot(x_test, y_test)
#plt.scatter(x_test, y_test)

For PyMC3 with a normal likelihood I used the following code:


# fit the model:
with pm.Model() as test_exp_normal:
    a = pm.Normal("a", 10, 3)
    k1 = pm.Lognormal("k1", 0.5, 0.2)
    lam = a * pm.math.exp(-k1 * x_test)

    obs = pm.Normal("y", lam, observed=y_test)
    trace_test_normal = pm.sample()
    test_prior_normal = pm.sample_prior_predictive()

az.summary(trace_test_normal, kind="stats", round_to=2)

giving me a = 9.38 \pm 2.74 and k1 = 1.77 \pm 0.30. The priors look reasonable, the chains are mixed, and the predictions look like this (97 % HDI):

Here is the Poisson model for comparison:

# fit the model:
with pm.Model() as test_exp_poisson:
    a = pm.Normal("a", 10, 3)
    k1 = pm.Lognormal("k1", 0.5, 0.2)
    lam = a * pm.math.exp(-k1 * x_test)

    obs = pm.Poisson("y", lam, observed=y_test)
    trace_test_poisson = pm.sample()
    test_prior_poisson = pm.sample_prior_predictive()
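One caveat with this model: the Poisson likelihood is defined on integer counts, while `y_test` above is continuous and mostly below 1, so the likelihood carries very little information. A sketch of what genuinely Poisson-distributed decay-count data could look like (the amplitude of 500 and the seed are my own choices, not from the original setup):

```python
import numpy as np

rng = np.random.default_rng(42)    # arbitrary seed
a_true, k_true = 500, 1.3          # amplitude chosen large enough for informative counts
x = np.linspace(1, 10, 20)
counts = rng.poisson(a_true * np.exp(-k_true * x))  # integer counts per time point
```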

And now comes Stan. My code is below and, to be honest, I am not sure it is right. As I said, I am still confused by Stan's vectors and reals, since a vector and an array of reals can hold the same values, but for some reason when I tried real it did not like it. Anyway, this is the Stan code (I will be happy for any suggestions and improvements):

model_simple_exponential_test = """
data {
    int<lower=1> N;                // number of data points
    vector<lower=0>[N] x;          //[N, T]; in case with T
    vector<lower=0>[N] y;          //[N, T]; in case with T
}
parameters {
    real c;
    real k1;
    real<lower=0> sigma;
    //vector[N] m;
}
transformed parameters {
    vector[N] m;
    m = c * exp(-k1 * x);
}
model {
    // priors
    c ~ normal(10, 3);
    k1 ~ lognormal(0.5, 0.2);
    sigma ~ cauchy(0, 5);

    // likelihood
    y ~ normal(m, sigma);
}
generated quantities {
    vector[N] ypred;
    vector[N] log_lik;
    for (n in 1:N) {
        ypred[n] = normal_rng(c * exp(-k1 * x[n]), sigma);
        log_lik[n] = normal_lpdf(y[n] | c * exp(-k1 * x[n]), sigma);
    }
}
"""

and this is how I run it:
import pystan

# Compile the model
stan_model_simple_exp_test = pystan.StanModel(model_code=model_simple_exponential_test)#, verbose=True)

# Data for Stan
data_stan_test = {'N' : x_test.shape[0], #number of datapoints
             'x' : x_test,       #the data
             'y': y_test}          

fit_test_exp = stan_model_simple_exp_test.sampling(data=data_stan_test, 
                    chains=4, iter=2000, warmup=1000, thin=1, init='random', verbose=True, 
                    control = {"adapt_delta":0.95, "stepsize":1, "max_treedepth":10}, n_jobs=-1)

When it finishes, it gives me a lot of warnings about insufficient sampling:

WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat above 1.1 or below 0.9 indicates that the chains very likely have not mixed
WARNING:pystan:1757 of 4000 iterations saturated the maximum tree depth of 10 (43.9 %)
WARNING:pystan:Run again with max_treedepth larger than 10 to avoid saturation
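For context on the Rhat warning: it compares between-chain and within-chain variance and should be close to 1 when the chains agree. A minimal sketch of the classic (non-split) statistic, with my own toy chains to show both regimes:

```python
import numpy as np

def rhat(chains):
    """Classic (non-split) potential scale reduction; chains has shape (n_chains, n_draws)."""
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))          # four chains targeting the same distribution
stuck = mixed + np.arange(4)[:, None]       # four chains stuck around different means
```

`mixed` gives an Rhat near 1, while `stuck` pushes it well above the 1.1 threshold quoted in the warning.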

which again puzzles me: it should be the same model and priors as in PyMC3, so why is this happening? I can play with the sampling parameters (max_treedepth, etc.), but it feels like the model needs different care. The fit itself is perfect, much tighter than the PyMC3 one: it gives a = 5 \pm 9.6e-08 and k1 = 1.3 \pm 1.49e-08.

Thank you for any ideas and suggestions.

You are not using the same model. Your PyMC3 model does not fit the standard deviation sigma of the observation noise whereas your Stan code does. Furthermore, your simulated data are deterministic, i.e. without any observation noise. Deterministic data can be hard to fit as the normal likelihood can diverge at the boundary sigma = 0 in this case.
You might either want to add observation noise to your simulated data or fix sigma = 1 in the Stan code as well and then compare the results again.
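To illustrate the first suggestion, a minimal sketch that adds Gaussian observation noise to the simulated data (the noise scale `sigma_true = 0.05` and the seed are my own choices); refitting both models on `y_noisy` makes sigma identifiable in the Stan model:

```python
import numpy as np

rng = np.random.default_rng(1)     # arbitrary seed
a, k1 = 5, 1.3
x_test = np.linspace(1, 10, 20)
y_clean = a * np.exp(-k1 * x_test)

sigma_true = 0.05                  # assumed observation-noise scale
y_noisy = y_clean + rng.normal(0.0, sigma_true, size=x_test.shape)
```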