Gaussian processes: posterior inference on integrated-out latent(s)?

So when we have a model like:

data{

	int n_x ;
	array[n_x] real x ; 
	int n_rep ;
	array[n_rep] row_vector[n_x] y ; // observations

}
transformed data{
	matrix[n_rep,n_x] y_mat = to_matrix(y) ; // for use in the version that achieves explicit inference on the latent function
}
parameters{

	real<lower=0> noise ;
	real<lower=0> f00_lengthscale ;
	real<lower=0> f00_amplitude ;
	real y_intercept ;
	row_vector[n_x] f00_stdnorm ; // removed in the "integrated-out" version
}

model{

	//priors
	noise ~ weibull(2,1) ;
	f00_lengthscale ~ weibull(2,2) ;
	f00_amplitude ~ weibull(2,1) ;
	y_intercept ~ std_normal() ;
	f00_stdnorm ~ std_normal() ;

	// covariance
	matrix[n_x,n_x] k00 = add_diag( gp_exp_quad_cov( x , f00_amplitude , f00_lengthscale ) , 1e-5 ) ;
	// latent vector
	row_vector[n_x] f00 = y_intercept + f00_stdnorm * cholesky_decompose( k00 )  ;
	// likelihood
	to_vector(y_mat) ~ normal( to_vector( rep_matrix( f00 , n_rep ) ) , noise ) ;
}

we can reduce compute during inference by removing the f00_stdnorm parameter and replacing the last few lines (starting at the comment // latent vector) with:

...
	// likelihood
	y ~ multi_normal_cholesky(
		rep_row_vector( y_intercept , n_x )
		, cholesky_decompose( add_diag( k00 , square(noise) ) )
	) ;
}

What would I do in the generated quantities block to obtain samples of the latent function? Or is that even possible?

And for bonus points, if one has a branching-additive GP structure where the model with explicit inference on the latents is:

data{

	int n_x ;
	array[n_x] real x ; 
	int n_rep ;
	array[n_rep] row_vector[n_x] a ; // observations for a
	array[n_rep] row_vector[n_x] b ; // observations for b

}

transformed data{

	// variables used in the version that achieves explicit inference on the latents
	matrix[n_rep,n_x] a_mat = to_matrix(a) ;
	matrix[n_rep,n_x] b_mat = to_matrix(b) ;

}

parameters{

	// noise: measurement noise
	real<lower=0> noise ;
	// lengthscales
	real<lower=0> f00_lengthscale ;
	real<lower=0> f1a_lengthscale ;
	real<lower=0> f1b_lengthscale ;
	// amplitudes
	real<lower=0> f00_amplitude ;
	real<lower=0> f1a_amplitude ;
	real<lower=0> f1b_amplitude ;
	// intercepts
	real a_intercept ;
	real b_intercept ;
	// stdnormals
	row_vector[ n_x ] f00_stdnorm ; // removed in the "integrated-out" version
	row_vector[ n_x ] f1a_stdnorm ; // removed in the "integrated-out" version
	row_vector[ n_x ] f1b_stdnorm ; // removed in the "integrated-out" version

}

model{

	//priors
	noise ~ weibull(2,1) ;
	f00_lengthscale ~ weibull(2,2) ;
	f1a_lengthscale ~ weibull(2,2) ;
	f1b_lengthscale ~ weibull(2,2) ;
	f00_amplitude ~ weibull(2,1) ;
	f1a_amplitude ~ weibull(2,1) ;
	f1b_amplitude ~ weibull(2,1) ;
	a_intercept ~ std_normal() ;
	b_intercept ~ std_normal() ;
	f00_stdnorm ~ std_normal() ;
	f1a_stdnorm ~ std_normal() ;
	f1b_stdnorm ~ std_normal() ;

	// covariances
	matrix[n_x,n_x] k00 = add_diag( gp_exp_quad_cov( x , f00_amplitude , f00_lengthscale ) , 1e-5 ) ;
	matrix[n_x,n_x] k1a = add_diag( gp_exp_quad_cov( x , f1a_amplitude , f1a_lengthscale ) , 1e-5 ) ;
	matrix[n_x,n_x] k1b = add_diag( gp_exp_quad_cov( x , f1b_amplitude , f1b_lengthscale ) , 1e-5 ) ;
	// latent vectors
	row_vector[n_x] f00 = f00_stdnorm * cholesky_decompose( k00 )  ;
	row_vector[n_x] f1a = f1a_stdnorm * cholesky_decompose( k1a ) ;
	row_vector[n_x] f1b = f1b_stdnorm * cholesky_decompose( k1b ) ;
	// g1_ from f00 & f1_
	row_vector[n_x] g1a = a_intercept + f00 + f1a ;
	row_vector[n_x] g1b = b_intercept + f00 + f1b ;
	// likelihoods
	to_vector(a_mat) ~ normal( to_vector( rep_matrix( g1a , n_rep ) ) , noise ) ;
	to_vector(b_mat) ~ normal( to_vector( rep_matrix( g1b , n_rep ) ) , noise ) ;
}

and the version with the latents integrated out is:

...
	// likelihoods
	a ~ multi_normal_cholesky(
		rep_row_vector( a_intercept , n_x )
		, cholesky_decompose( add_diag( k00 + k1a , square(noise) ) )
	) ;
	b ~ multi_normal_cholesky(
		rep_row_vector( b_intercept , n_x )
		, cholesky_decompose( add_diag( k00 + k1b , square(noise) ) )
	) ;
}

Is it still possible to then get samples from each of the three latent functions in the GQ block? If so, how?


Ah, belatedly remembered seeing the answer to the simpler single-GP case in the Stan User Guide here.
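For anyone finding this later, a sketch of what that looks like for the single-GP model above (adapting the SUG approach to this model's names; the replicate means carry noise variance square(noise)/n_rep, and the jitter/symmetrization are just numerical guards):

```stan
generated quantities {
	vector[n_x] f00 ;
	{ // local block so the intermediates below aren't saved
		matrix[n_x,n_x] k00 = add_diag( gp_exp_quad_cov( x , f00_amplitude , f00_lengthscale ) , 1e-5 ) ;
		// averaging the n_rep replicates divides the noise variance by n_rep
		matrix[n_x,n_x] k_y = add_diag( k00 , square(noise)/n_rep ) ;
		vector[n_x] y_bar ;
		for( i in 1:n_x ) y_bar[i] = mean( y_mat[,i] ) ;
		matrix[n_x,n_x] gain = mdivide_right_spd( k00 , k_y ) ; // k00 * inverse(k_y)
		vector[n_x] f_mu = y_intercept + gain * ( y_bar - y_intercept ) ;
		matrix[n_x,n_x] f_cov = add_diag( k00 - gain * k00 , 1e-8 ) ;
		f00 = multi_normal_rng( f_mu , 0.5*( f_cov + f_cov' ) ) ;
	}
}
```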

So I guess that just leaves the bonus question from above: is it possible to achieve the same analytic inference in the case of additive GPs?

In case of additive GPs, just add the covariance matrices together and act as with one covariance matrix.

Ah, I could have been clearer in my self-reply; I was looking for guidance on doing inference on f00 & f1a & f1b in the GQ block of a model where those latents have been integrated out. The SUG chapter I linked shows how to infer a single integrated-out latent, but doesn’t cover the additive case.

Oh, I bet the process is such that you infer the g1_ latents from their respective covariances and then work the model algebra backwards for the f__ latents.
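To make that concrete: all three latents are jointly Gaussian with the column means of a and b (the shared f00 induces cross-covariance between the two outputs), so one can build the joint covariance and condition directly. A sketch, assuming the branching model's names and sampling all three latents jointly (untested, so treat as pseudocode until checked):

```stan
generated quantities {
	vector[n_x] f00 ;
	vector[n_x] f1a ;
	vector[n_x] f1b ;
	{
		matrix[n_x,n_x] k00 = add_diag( gp_exp_quad_cov( x , f00_amplitude , f00_lengthscale ) , 1e-5 ) ;
		matrix[n_x,n_x] k1a = add_diag( gp_exp_quad_cov( x , f1a_amplitude , f1a_lengthscale ) , 1e-5 ) ;
		matrix[n_x,n_x] k1b = add_diag( gp_exp_quad_cov( x , f1b_amplitude , f1b_lengthscale ) , 1e-5 ) ;
		matrix[n_x,n_x] zero = rep_matrix( 0.0 , n_x , n_x ) ;
		// covariance of the stacked, centered column means z = [a_bar ; b_bar] ;
		// the shared f00 gives the off-diagonal k00 blocks
		matrix[2*n_x,2*n_x] k_obs = append_row(
			append_col( add_diag( k00 + k1a , square(noise)/n_rep ) , k00 )
			, append_col( k00 , add_diag( k00 + k1b , square(noise)/n_rep ) )
		) ;
		// cross-covariance of the stacked latents [f00 ; f1a ; f1b] with z
		matrix[3*n_x,2*n_x] k_cross = append_row(
			append_col( k00 , k00 )
			, append_row( append_col( k1a , zero ) , append_col( zero , k1b ) )
		) ;
		// prior covariance of the stacked latents (independent a priori)
		matrix[3*n_x,3*n_x] k_lat = append_row(
			append_col( k00 , append_col( zero , zero ) )
			, append_row(
				append_col( zero , append_col( k1a , zero ) )
				, append_col( zero , append_col( zero , k1b ) )
			)
		) ;
		vector[2*n_x] z ;
		for( i in 1:n_x ) {
			z[i] = mean( a_mat[,i] ) - a_intercept ;
			z[n_x+i] = mean( b_mat[,i] ) - b_intercept ;
		}
		matrix[3*n_x,2*n_x] gain = mdivide_right_spd( k_cross , k_obs ) ;
		vector[3*n_x] f_mu = gain * z ;
		matrix[3*n_x,3*n_x] f_cov = k_lat - gain * k_cross' ;
		vector[3*n_x] f_all = multi_normal_rng( f_mu , add_diag( 0.5*( f_cov + f_cov' ) , 1e-8 ) ) ;
		f00 = f_all[ 1:n_x ] ;
		f1a = f_all[ (n_x+1):(2*n_x) ] ;
		f1b = f_all[ (2*n_x+1):(3*n_x) ] ;
	}
}
```

Conditioning all three latents at once (rather than one at a time) keeps their posterior correlations intact, which matters since f00 and f1a are only identified jointly.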

Hmm… I feel like I’m missing some useful information, as I don’t understand the question. Can you define what are f00, f1a, f1b, g1_, and f__?