Hi there,

I’m playing with the idea of logspline density estimation in Stan. See these notes (http://www.econ.uiuc.edu/~roger/courses/574/lectures/L18.pdf) for a reasonably non-technical explanation. The idea is that given a sample X = \left(X_1, \ldots, X_n\right), we estimate their density f by exponentiating a linear combination of B-spline basis functions:

f(x) = \frac{\exp{\sum_{j=1}^P \theta_j B_j(t)}}{c(\theta)},

with normalizing constant c(\theta) = \int \exp{\sum_{j=1}^P \theta_j B_j(t)}\mathrm{d}t. Presumably the Stan code would be something like this:

```
data{
int n; //Size of X-sample
int P; //Dimensionality of spline basis
matrix[n, P] B; //Basis function evaluations at sample X-points
parameters{
vector[P] theta; //Spline coefficients
... //Other smoothing stuff
}
model {
target += B*theta - log(c);
... //Suitable priors for theta and other smoothing stuff
}
```

The challenge is with \log(c(\theta)). Although it is a “normalizing constant” for the density, it is *not* constant with respect to the parameters \theta and so I think it must be included in the log posterior. However, it doesn’t have a closed form - presumably it should therefore be estimated numerically, but I’m not sure if there’s a way to do this efficiently. In particular I imagine the autodiff of the numerical approximation would be a mess.

Does anyone have any experience with this kind of model in Stan - or any kind of density estimation involving some normalizing function? Any insight is greatly appreciated.