# Fitting a non-linear growth model with stan

Hi all,

I’m trying to fit a non-linear growth model to some data with Stan (R code below). It seems pretty straightforward but I can’t get reasonable parameter estimates. Also I’m getting warning messages about divergent transitions and exceedance of the tree-depth.
I would greatly appreciate some advice.

Thanks,
Maarten

``````library(rstan)
dat <- list(
"N" = 14,
"x" = c(18, 19, 33, 39, 41, 41, 42, 44, 48, 57, 80, 81, 88, 93),
"Y" = c(0.80, 1.16, 1.55, 2.22, 2.76, 4.63, 5.41, 6.93, 4.55, 5.17, 6.42, 6.78, 8.92, 5.99)
)

nlm <- nls(Y ~ a*(1-exp(-b*x))^c, data = dat,
start = list(a = 1, b = 0.01, c = 1))
plot(dat\$x, dat\$Y)
lines(dat\$x, predict(nlm))
print(nlm)

stanmodel <- "
data {
int<lower=0> N;
real<lower=0> x[N];
real<lower=0> Y[N];
}

parameters {
real alpha;
real beta;
real gamma;
real<lower=0> tau;
}

transformed parameters {
real<lower=0> sigma;
real<lower=0> m[N];
real<lower=0> a;
real<lower=0> b;
real<lower=0> c;
a = exp(alpha);
b = exp(beta);
c = exp(gamma);
for (i in 1:N)
m[i] = a*(1-exp(-b*x[i]))^c;
sigma = 1 / sqrt(tau);
}

model {
// priors
alpha ~ normal(0.0, 1000);
beta ~ normal(0.0, 1000);
gamma ~ normal(0.0, 1000);
tau ~ gamma(.0001, .0001);
Y ~ normal(m, sigma); // likelihood
}

generated quantities{
real Y_mean[N];
real Y_pred[N];
for(i in 1:N){
Y_mean[i] = a*(1-exp(-b*x[i]))^c;  // Posterior parameter distribution of the mean
Y_pred[i] = normal_rng(Y_mean[i], sigma);  // Posterior predictive distribution
}
}
"

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
fit <- stan(model_code = stanmodel,
model_name = "GrowthCurve",
data = dat)

print(fit)

``````
1 Like

I am not expert, but I try to give my 2 cents.

seems to not be justified by your data, and in Stan is usually discouraged as too wide prior.

Are you sure you need the c paramter? There are a lot of degrees of freedom, here the parameters a and c play overlapping roles (maybe simplifying the link function could help). Is your model related to this?

If so can you confirm that you intend to use 1 - and non 1+ in your equation?

Why are you transforming these parameters with exp?

Furthermore you can read some non-identifiability problems from another post GLM with growth function - problem of indetermiinability when slope = 0

After some playing around I managed to get better constraint on the parameters with the following modifications:

1. I removed the log-transformation. I previously included this because all parameters are >0 but for some reason this cause problems with the sampling.
2. I narrowed the priors
3. I accounted for heteroscedasticity by assuming that sigma scales linearly with predicted mu. The scaling factor is estimated in the fitting, rather than sigma itself. The previous version assumed a constant distribution of residuals. My guess is that sigma was constrained at large values due to the large spread in the later part of the data (high x). As a result parameters b and c, which are mostly important for the early part of the curve became poorly constrained.

thanks again!

No worries,

I am thinking that using normal + heteroskedastic noise might not be the best choice ( or maybe it is ;) )

There are processes such as based on Gamma or Count based distributions that can explain naturally this effect.

Can I ask you how your Y looks like? Is it counts, continuous positive, or something else?

In my experience with non-linear models you also have to be thoughtful about specifying starting values in order to avoid divergent transitions.

~Colin

Hi,

I’m trying to plot growth model similar in Stan, with 3 chain and 2000 iterations
Y_mean Posterior parameter distribution of the mean
Y_pred Posterior predictive distribution

Y_mean <- extract(fitstan, “Y_mean”)
Y_mean_cred <- apply(Y_mean\$Y_mean, 2, quantile, c(0.05, 0.95))
Y_mean_mean <- apply(Y_mean\$Y_mean, 2, mean)

Y_pred <- extract(fitstan, “Y_pred”)
Y_pred_cred <- apply(Y_pred\$Y_pred, 2, quantile, c(0.05, 0.95))
Y_pred_mean <- apply(Y_pred\$Y_pred, 2, mean)
In plot…
plot(L ~ x, xlab=“x”, ylab=“Y”, main=“Non-linear Growth Curve”)
lines(x, Y_mean_mean)
points(x, Y_pred_mean, pch=19)
lines(x, Y_mean_cred[1,], col=4)
lines(x, Y_mean_cred[2,], col=4)

Gave me multiple curves of Y_mean_mean, Y_pred_mean,
I want to know how to plot only the mean of all
Thanks!