PC prior for P-Splines (in Python)

I’m trying to implement P-splines with a PC prior in Stan, and the results are less than stellar.
In particular:

The E-BFMI, 0.00, is below the nominal threshold of 0.30 which suggests that HMC may have trouble exploring the target distribution.

Also, a lot of parameters have Rhat > 1.05 and fewer than 0.001 effective draws per transition.
So I’m obviously doing something wrong.

data {
  int<lower=0> N;   // number of data items
  int<lower=0> K;   // number of splines
  int<lower=0> L;   // number of simulated samples
  matrix[N, K] B;   // predictor matrix
  matrix[K, K] R;   // structure matrix

  vector[N] y;      // outcome vector
  real U;           // upper bound on marginal std
  real alpha;       // probability of marginal std exceeding U
}
transformed data {
  real theta = -log(alpha) / U;
  vector[L] ones_L = rep_vector(1, L);
  vector[K] zeros_K = zeros_vector(K);
  matrix[L, K] sub_B = block(B, 1, 1, L, K);
}
parameters {
  vector[K] beta;          // coefficients
  real<lower=0> sigma;     // std
  real<lower=0> tau_beta;  // beta precision
}
model {
  y ~ normal(B * beta, sigma);  // likelihood
  beta ~ multi_normal_prec(zeros_K, tau_beta * R);
  tau_beta ~ gumbel(0.5, theta);  // PC prior on precision of beta
  sigma ~ exponential(0.1);
}
generated quantities {
  array[L] real y_pred;  // predictions
  y_pred = normal_rng(sub_B * beta, ones_L * sigma);
}
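As a sanity check on the transformed-data line `theta = -log(alpha)/U`: that rate is chosen so that an exponential prior on the marginal standard deviation puts exactly probability alpha above U. A minimal Python check, with placeholder values for U and alpha (the real ones come from the data block):

```python
import math

# Placeholder values for illustration only; the real U and alpha are data.
U, alpha = 3.0, 0.05

# PC-prior rate, matching `theta = -log(alpha)/U` in transformed data.
theta = -math.log(alpha) / U

# Under an exponential(theta) prior on the marginal std sigma_b,
# P(sigma_b > U) = exp(-theta * U), which should recover alpha.
tail_prob = math.exp(-theta * U)
print(tail_prob)  # ≈ 0.05
```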


I specified parameters as:

nm1 = BN.shape[1] - 1
R = D.T @ D

data_current = dict(N=BN.shape[0],

BN is a stacked matrix of predictions for multiple signals (10).
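The post doesn’t show how `D` is built, so as an assumption: if it’s the usual second-order difference penalty for P-splines, a NumPy sketch would look like this. Note that this `R` is rank-deficient (rank K − 2), which makes `tau_beta * R` an improper (intrinsic) precision unless the null space is handled separately — that alone can cause sampling trouble:

```python
import numpy as np

K = 10  # number of splines (placeholder value)

# Assumed second-order difference matrix; the post doesn't show D's
# construction, so this is a guess at the intent.
D = np.diff(np.eye(K), n=2, axis=0)  # shape (K-2, K)
R = D.T @ D                          # structure (penalty) matrix, K x K

# R is symmetric but singular: its null space contains constant and
# linear sequences, so its rank is K - 2.
print(np.linalg.matrix_rank(R))
```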

I’m using PC-priors description from: [1511.05748] Penalized complexity priors for degrees of freedom in Bayesian P-splines

I’d be grateful for suggestions on what I am doing wrong.

Can you get this to fit with a simpler prior on beta, say just a standard normal? Then you can try moving to a fixed multi-normal without tau_beta varying. You can set that to a known value derived during simulation, just to help debug the model.

The reason I’m focusing on the multi_normal_prec is that it’s the hierarchical prior, which is where a lot of computational problems arise. Also, Stan isn’t particularly well set up for multi-normals involving precision because we don’t have a direct Cholesky-factor implementation. So the solve there could be problematic numerically depending on what tau_beta and R are.
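Since R is data here, one way around the missing Cholesky-precision implementation is to factor R once and build beta from iid standard normals (a non-centered parameterization). A NumPy sketch of the algebra (not Stan code; the positive-definite stand-in for R is an assumption, since a difference-penalty R is singular and would need a jitter or null-space fix first):

```python
import numpy as np

rng = np.random.default_rng(0)
K, tau = 5, 2.5

# Stand-in positive-definite structure matrix for illustration.
A = rng.normal(size=(K, K))
R = A @ A.T + K * np.eye(K)

# Precision tau*R corresponds to covariance inv(tau*R). With
# U = chol(R).T (upper triangular), beta = solve(U, z) / sqrt(tau)
# for z ~ N(0, I) has exactly that covariance:
U = np.linalg.cholesky(R).T
M = np.linalg.solve(U, np.eye(K)) / np.sqrt(tau)  # the map applied to z
cov_implied = M @ M.T
cov_target = np.linalg.inv(tau * R)
print(np.allclose(cov_implied, cov_target))  # True
```

In Stan terms, the triangular solve against a precomputed factor of R avoids the dense solve inside `multi_normal_prec` and usually samples much better than the centered hierarchical form.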

Although not related to fitting problems, I would suggest a <lower=0> constraint on U and a <lower=0, upper=1> on alpha if that really is meant to be a probability. Also, I don’t think U is truly an upper bound—if you want it to be, then you need to declare sigma with constraint <lower=0, upper=U>.

For efficiency you should be able to drop the definition of ones_L and the multiply in the normal_rng. Stan should broadcast given that sub_B has L rows. Also, you can define the generated quantities as a one-liner as you’ve done with transformed data.
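The broadcasting point can be illustrated in NumPy terms (Stan’s vectorized `normal_rng` behaves analogously): a scalar scale broadcasts against a length-L location vector, so no `ones_L * sigma` expansion is needed.

```python
import numpy as np

rng = np.random.default_rng(1)
L, K, sigma = 4, 3, 0.5  # placeholder sizes

sub_B = rng.normal(size=(L, K))
beta = rng.normal(size=K)

# Scalar scale broadcasts against the length-L location vector.
y_pred = rng.normal(loc=sub_B @ beta, scale=sigma)
print(y_pred.shape)  # (4,)
```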


With a simple normal prior on beta it works perfectly, but I wanted to be fancy :) Thanks, I’ll go over all your suggestions.