Hi,
I’m trying to fit a simple Bayesian linear regression model with a normal-inverse-gamma (NIG) prior and compute the marginal likelihood using bridge sampling. My issue is that the marginal likelihood computed via bridge sampling is nowhere near the analytical solution. Any idea what is going wrong?
data {
  int<lower=1> n;        // number of observations
  int<lower=1> p;        // number of predictors
  matrix[n, p] X;        // design matrix
  vector[n] y;           // outcome vector
  vector[p] mu0;         // prior mean for beta
  matrix[p, p] Sigma0;   // prior covariance for beta
  real<lower=0> a0;      // prior shape for sigma^2
  real<lower=0> b0;      // prior scale for sigma^2
}
parameters {
  vector[p] beta;        // regression coefficients
  real<lower=0> sigma2;  // error variance
}
model {
  // Priors
  beta ~ multi_normal(mu0, sigma2 * Sigma0);  // NIG prior on beta
  sigma2 ~ inv_gamma(a0, b0);                 // Inverse-Gamma prior on sigma^2
  // Likelihood
  y ~ normal(X * beta, sqrt(sigma2));
}
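The blrnig object in the sampling() call further down is this model compiled with rstan; roughly, assuming the Stan code above is saved as blrnig.stan:

library(rstan)
library(bridgesampling)

# compile the model above (the file name is just what I use locally)
blrnig <- stan_model("blrnig.stan")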
Here is my R code for computing the marginal likelihood analytically, using the solution from Bayesian linear regression - Wikipedia:
marginal_likelihood_nig <- function(y, X, mu0, Sigma0, a0, b0) {
  n <- nrow(X)
  p <- ncol(X)
  # Calculate terms
  betahat <- solve(t(X) %*% X) %*% t(X) %*% y
  Sigma1 <- t(X) %*% X + Sigma0
  mu1 <- solve(Sigma1) %*% (t(X) %*% X %*% betahat + Sigma0 %*% mu0)
  a1 <- a0 + n / 2
  b1 <- b0 + 0.5 * (t(y) %*% y + t(mu0) %*% Sigma0 %*% mu0 - t(mu1) %*% Sigma1 %*% mu1)
  # Marginal likelihood
  ml <- 1 / (2 * pi)^(n / 2) * sqrt(det(Sigma0) / det(Sigma1)) * b0^a0 / b1^a1 * gamma(a1) / gamma(a0)
  return(ml)
}
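For larger n the b1^a1 and gamma(a1) terms can overflow, so here is the same formula rewritten on the log scale (lgamma() and log-determinants only); it should agree with log(marginal_likelihood_nig(...)) up to floating-point error:

log_marginal_likelihood_nig <- function(y, X, mu0, Sigma0, a0, b0) {
  n <- nrow(X)
  betahat <- solve(t(X) %*% X, t(X) %*% y)
  Sigma1 <- t(X) %*% X + Sigma0
  mu1 <- solve(Sigma1, t(X) %*% X %*% betahat + Sigma0 %*% mu0)
  a1 <- a0 + n / 2
  b1 <- drop(b0 + 0.5 * (t(y) %*% y + t(mu0) %*% Sigma0 %*% mu0 - t(mu1) %*% Sigma1 %*% mu1))
  logdet0 <- as.numeric(determinant(Sigma0, logarithm = TRUE)$modulus)
  logdet1 <- as.numeric(determinant(Sigma1, logarithm = TRUE)$modulus)
  # log of the NIG marginal likelihood, term by term
  -n / 2 * log(2 * pi) + 0.5 * (logdet0 - logdet1) +
    a0 * log(b0) - a1 * log(b1) + lgamma(a1) - lgamma(a0)
}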
I run the following code to compare the outputs:
mu0 <- c(0, 0)
Sigma0 <- diag(2)
a0 <- 1
b0 <- 1
stanfit <- sampling(blrnig, list(n = 81, p = 2, X = x, y = y, mu0 = mu0, Sigma0 = Sigma0, a0 = a0, b0 = b0))
bridge_sampler(stanfit)
log(marginal_likelihood_nig(y, x, mu0, Sigma0, a0, b0))
The output looks like this:
> bridge_sampler(stanfit)
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Bridge sampling estimate of the log marginal likelihood: -4.91575
Estimate obtained in 4 iteration(s) via method "normal".
> log(marginal_likelihood_nig(y, x, mu0, Sigma0, a0, b0))
[,1]
[1,] -160.4551
I’ve attached a CSV with y and x to reproduce this result.
test_data.txt (1.7 KB)
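To rule out a poor bridge sampling approximation on the Stan side, the estimate’s approximate error and the warp-III variant can also be checked directly with the bridgesampling package (a quick sketch, reusing stanfit from above):

bridge_fit <- bridge_sampler(stanfit)
error_measures(bridge_fit)                 # approximate relative error of the log-ML estimate
bridge_sampler(stanfit, method = "warp3")  # alternative bridge as a cross-check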
EDIT: I’ve also tried numerically integrating out beta and sigma2.
library(cubature)

nig_density <- function(y, X, beta, sigma2, mu0, Sigma0, a0, b0) {
  n <- nrow(X)
  p <- ncol(X)
  # Likelihood
  likelihood <- (2 * pi * sigma2)^(-n / 2) *
    exp(-1 / (2 * sigma2) * t(y - X %*% beta) %*% (y - X %*% beta))
  # Prior on beta (multivariate normal) - corrected
  prior_beta <- (2 * pi)^(-p / 2) * det(sigma2 * Sigma0)^(-1 / 2) *
    exp(-1 / 2 * t(beta - mu0) %*% solve(sigma2 * Sigma0) %*% (beta - mu0))
  # Prior on sigma^2 (inverse-gamma)
  prior_sigma2 <- b0^a0 / gamma(a0) * sigma2^(-a0 - 1) * exp(-b0 / sigma2)
  # Joint density
  density <- likelihood * prior_beta * prior_sigma2
  return(density)
}
result <- adaptIntegrate(
  function(b) nig_density(y, x, b[1:2], b[3], mu0, Sigma0, a0, b0),
  lowerLimit = c(-Inf, -Inf, 0),
  upperLimit = c(Inf, Inf, Inf)
)
log(result$integral)
> log(result$integral)
[1] -141.0701
This result is much closer to the analytical solution; the remaining difference is presumably due to numerical precision and the error in approximating the integral.
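As one more brute-force cross-check, the same integral can be estimated by plain Monte Carlo over the prior: draw (beta, sigma2) from the NIG prior and average the likelihood. This is very noisy for n = 81, so it is only a rough sanity check; mvrnorm() from MASS is just my choice of multivariate normal sampler:

library(MASS)  # for mvrnorm()

set.seed(1)
S <- 1e5
sigma2_draws <- 1 / rgamma(S, shape = a0, rate = b0)  # sigma2 ~ Inv-Gamma(a0, b0)
loglik <- sapply(seq_len(S), function(s) {
  beta_s <- mvrnorm(1, mu = mu0, Sigma = sigma2_draws[s] * Sigma0)  # beta | sigma2 ~ MVN
  sum(dnorm(y, mean = x %*% beta_s, sd = sqrt(sigma2_draws[s]), log = TRUE))
})
# log of the average likelihood over prior draws (log-sum-exp for stability)
max(loglik) + log(mean(exp(loglik - max(loglik))))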
Thanks!