Incorporating a Custom Solver for Gaussian Data?

Hello all!

Brand new to stan and HMC and having fun.

I have a spatial model of the form Y|X \sim N(|X|, \Sigma_2) and X \sim N(0, \Sigma_1), where X and Y are of length n and \Sigma_1, \Sigma_2 are fixed n \times n spatial covariance matrices. Upon a single observations of Y, the goal is to sample X|Y.

Here’s my code implementing the sampler:

data {
	int n;
	vector[n] Y;
	matrix[n,n] Sigma1;
	matrix[n,n] Sigma2;
	vector[n] zeros;	
}
parameters {
	vector[n] X;
}
transformed parameters{
	vector[n] absX = fabs(X);
}
model {
	target += multi_normal_lpdf(X | zeros, Sigma1);
	target += multi_normal_lpdf(Y | absX, Sigma2);
}  	

Evaluating the target distribution, a product of two multivariate-normals, requires solving linear systems with covariace matrices \Sigma_1 and \Sigma_2, which gets prohibitive for large n. In spatial statistics, we like to assume some computationally amenable form for the covariance matrices that allow for a fancy algorithm to efficiently solve the linear systems.

Is there a good way to incorporate a fancy solver into my stan implementation? Basically, I’d like the program to evaluate ‘multi_normal_pdf()’ in a special, efficient way. For example, could I write a seperate C++ function that stan would call to evaluate ‘my_special_multi_normal_pdf()’?

Thanks,
Paul

2 Likes

Note that for the model you describe, large speedups can likely be obtained with multi_normal_cholesky_lpdf and the cholesky_factor_cov data type.

Also, the absolute value in vector[n] absX = fabs(X); will almost certainly break the sampler. Instead, you should use constraints, i.e.:

parameters {
	vector<lower=0>[n] absX;
}

Yes, you can, see the C++ vignette Interfacing with External C++ Code • rstan for some very rudimentary guidance (similar stuff is also possible with other interfaces, but it can be a bit more complicated). Remember however, than Stan needs both the value AND the gradient of the function, so most external libraries can’t be directly plugged in. (you either need to compute over Stan’s var type or - better - compute the gradient analytically and use class precomputed_gradients_vari or adj_jac_apply.

Best of luck, good use of C++ can bring you speed ups, but it is definitely not straightforward :-(

Thank you for your help!

I tried using multi_normal_cholesky_lpdf, both by passing the Cholesky decomposition into the Stan program from ‘R’ and by calculating the decomposition within the transformed data section of the Stan program, using cholesky_decompose. It was tremendously slower. But I guess I need to use the cholesky_factor_cov data type?

I need to have absX maintain it’s relationship with X, which is the random variable I am trying to sample, so I can’t define a new free variable absX. Both X and |X| appear in my posterior.

Paul

I don’t think this should be important unless the covariance matrix is unknown, but I am not 100% sure, this is at the very edge of my understanding of multivariete normals in Stan. Are you speaking about the exact model you posted or about some larger model?

This is very likely a problem (might be even source of the performance issues). To work well Stan requires the posterior to be smooth. The absolute values introduces a discontinuity into the gradient which in most cases causes trouble for the sampler. Maybe you could post the full model to let us asses if that might be the problem and how could it be modified to work better?