Both f1 and f2 are vectors with length n. I want to model the auto-regression between them: f2 = some function of f1+ error.
It is straightforward to run a linear or polynomial regression: f_2= \mathrm{normal} ( \beta f_1, \tau).

Now I want to make it more flexible so I am using spline: f2 = B_spine (f1) + error.

The spine in stan is very slow for this use. I think the reason is that the spine basis function is reconstructed each time for a new value of f1. In contrast, in a spine regression, the basis function is fixed as long as the covariate X is fixed.

Is spine bad for modeling latent parameters? What is the good alternative (to a slow spine or a linear regression) for this purpose?

Slow for a single log density evaluation or slow because there is a difficult dependency in the posterior and thus there are many log density evaluations per iteration?

This is my (anecdotal) experience with latent splines. I attempted to employ them in this model instead of the time-series, but the number of log-density evaluations was prohibitive.

The difficult posterior in latent variable models is the reason why, for example, INLA and hundreds papers on variational/EP/Laplace inference for GPs exist.

The model I described is slow but eventually fine.

This is the example in which a good adaptation is important. The spline models implies a zero function outside its knotsâ€” but we do not really know the support of the latent function f (the mean function of gp). In other words, there is only a very thin region of the joint space, defined by the span of knots, on which the gradient of the spline coefficients are non-zero. If the adaptation fails to find that region, the HMC update will be very slow.

As a results, it is often the case that multiple chains have very different warm-up time (and very slow) but similar sampling time.

It helps A LOT to initialize the whole model by MAP. I think due to some conceptual debate â€śtypical setâ€ť, we are reluctant to do so in general. Of course, for a linear regression, the hessian of beta is constant and a bad initialization is less harmful. For spline, the sampler has to enter that thin slice of joint space to have non-zero gradient update. I suspect this MAP initialization can be useful for a general type of latent variable models.

Maybe another way to avoid the vague specification of support of f is to use gp to replace B-splines: f_2-f_1= h(f1), h\sim \mathcal{GP}. (f_1f_2 are separate gp latent functions from the likelihood). But then I have to compute a new cov matrix in each iteration.

I am trying to do something similar. Putting aside the fact that you need to deal with boundaries and pre-defined knots of latent variables, I find that even if Iâ€™m doing the simple spline regression between two observed values x and y, simply putting spline evaluation in the model block, not the transformed data (to simulate what would happen for latent variable case) increases runtime by a factor of ~800 in my case, which would mean that evaluating spline itself takes a lot of time.

One primitive solution I thought of is to code the basis functions like here and maybe that will be faster than recursive evaluation.

Did you figure out any further tricks for speeding up?