G’day Stanimals,
I am wondering if someone can help me with the following problem.
I have the following simple Stan model:
data {
  int<lower=1> Npoints;                       // number of observations
  int<lower=1> m;                             // number of I-spline basis functions
  matrix[Npoints, m] i_spline_basis_evals;    // I-spline basis evaluated at the observation times
  vector[Npoints] y;                          // observed responses
}
parameters {
  vector<lower=0>[m] gammas;                  // non-negative spline coefficients
  real<lower=0> sigma;                        // observation noise scale
}
model {
  gammas ~ normal(0, 2);                      // half-normal, since gammas are constrained to be >= 0
  sigma ~ cauchy(0, 1);                       // half-Cauchy on the noise scale
  y ~ normal(i_spline_basis_evals * gammas, sigma);
}
When I run this on the following synthetic dataset:
library(splines2)

set.seed(42)
tmin <- 0
tmax <- 10
Npoints <- 100
ninterior_knots <- 5
sigma_noise <- 0.1

times <- seq(tmin, tmax, length.out = Npoints)
# note: ":" binds tighter than "+", so this picks elements 3:6 of the sequence,
# i.e. four of the five equally spaced interior positions
knots <- seq(tmin, tmax, length.out = ninterior_knots + 2)[2:ninterior_knots + 1]
nknots <- length(knots)

mspline_degree <- 3
i_spline_basis_evals <- iSpline(times, knots = knots, degree = mspline_degree + 1,
                                intercept = TRUE)
m_spline_basis_evals <- deriv(i_spline_basis_evals)   # M-spline basis (derivative of the I-splines)
nbasis <- dim(m_spline_basis_evals)[2]

gammas <- abs(rnorm(nbasis, 0, 1))                    # non-negative "true" coefficients
y_mean <- as.vector(i_spline_basis_evals %*% as.matrix(gammas))
y <- rnorm(Npoints, y_mean, sigma_noise)
the data looks like this:
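Roughly, the data can be visualised like this (an illustrative base-R plot of y against times, with the noiseless mean y_mean overlaid):

plot(times, y, pch = 16, cex = 0.6, xlab = "time", ylab = "y")
lines(times, y_mean, col = "red", lwd = 2)   # underlying non-decreasing mean curve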
Note that here I use the so-called I-spline basis functions, which are non-decreasing; therefore any linear combination with non-negative coefficients is non-decreasing as well (which is essential for another application I am working on, however here I use a simplified model to highlight the actual problem).
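As a quick sanity check of that property on the simulated example (reusing the objects from the R snippet above):

# the simulated coefficients are non-negative, so the mean curve must be non-decreasing
stopifnot(all(gammas >= 0))
stopifnot(all(diff(y_mean) >= -1e-12))   # small tolerance for floating-point error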
Now when I run
library(rstan)
# sm is the compiled model; assuming the Stan program above is saved as toy_model_ispline.stan:
sm <- stan_model("toy_model_ispline.stan")

stan_data <- list(
  Npoints = Npoints,
  m = nbasis,
  i_spline_basis_evals = i_spline_basis_evals,
  y = y
)
fit <- sampling(sm, data = stan_data, seed = 42, chains = 4, cores = 2, iter = 4000)
and inspect the results, I observe the following for the coefficients gammas (full posterior summaries are in the addendum below):
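(By "inspect" I mean the usual rstan tools, roughly along these lines; the exact calls are only illustrative:)

print(fit, pars = "gammas")
check_hmc_diagnostics(fit)   # reports divergent transitions and saturated treedepths
pairs(fit, pars = c("gammas[1]", "gammas[2]", "sigma"))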
This looks suspicious to me! So I did a web search and found that people usually suggest a sum-to-zero constraint on the coefficients in such settings. However, in my particular case I don’t see how that would do the job, since the gammas
need to be non-negative (to guarantee that the linear combination of the spline basis functions is non-decreasing).
I know that I can adjust parameters like adapt_delta and max_treedepth (along the lines of the call sketched below) to get rid of the divergences, but I still think the model is sub-optimal and there should be another way to get around this problem / parametrise the model better… Any ideas?
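For completeness, this is the kind of tuning I mean (the specific values are only illustrative):

fit2 <- sampling(sm, data = stan_data, seed = 42, chains = 4, cores = 2, iter = 4000,
                 control = list(adapt_delta = 0.99, max_treedepth = 12))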
Cheers,
Eren
Addendum:
> print(fit, pars=sprintf("gammas[%d]",1:nbasis), digits=5)
Inference for Stan model: toy_model_ispline.
4 chains, each with iter=4000; warmup=2000; thin=1;
post-warmup draws per chain=2000, total post-warmup draws=8000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
gammas[1] 1.32603 0.00201 0.09157 1.15718 1.26174 1.32353 1.38789 1.50739 2076 1.00232
gammas[2] 0.59840 0.00505 0.18640 0.20282 0.47159 0.61565 0.73499 0.91839 1363 1.00404
gammas[3] 0.30071 0.00676 0.20332 0.01605 0.13805 0.26928 0.42530 0.76952 905 1.00553
gammas[4] 0.79245 0.00683 0.19371 0.33493 0.67583 0.81855 0.93443 1.09339 803 1.00512
gammas[5] 0.26692 0.00477 0.16055 0.01720 0.13876 0.25191 0.37577 0.61341 1133 1.00282
gammas[6] 0.27278 0.00321 0.16136 0.01507 0.14216 0.26341 0.38838 0.59912 2524 1.00033
gammas[7] 1.27305 0.00380 0.20059 0.83702 1.14307 1.29261 1.42352 1.60679 2787 0.99993
gammas[8] 0.33318 0.00380 0.19969 0.01880 0.18066 0.31568 0.45781 0.77493 2760 0.99994
gammas[9] 1.86549 0.00279 0.16263 1.53046 1.75967 1.87166 1.97916 2.16111 3406 0.99973
Samples were drawn using NUTS(diag_e) at Sat Oct 20 15:10:08 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
whereas the “true” values of the gammas are
[1] 1.37095845 0.56469817 0.36312841 0.63286260 0.40426832 0.10612452 1.51152200 0.09465904 2.01842371
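For reference, that comparison can be reproduced with something like the following (posterior means of the gammas next to the simulated values; object names as in the R code above):

post_means <- colMeans(as.matrix(fit, pars = "gammas"))
round(cbind(true = gammas, posterior_mean = post_means), 3)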