Non-centered parameterization for basic spline

Hi there,

I am quite new to stan and trying to fit a time course with 60 samples by the basic spline. Since I have the measurement error of each data point, I was planning to embed measurement error in the model so that the posterior of the weight of each knot will consider the uncertainty due to measurement error. The following model #1 worked but had some divergent transitions. Then I tried to make a noncentered one as model #2 in which I just reparametrized normal distribution by mu + z*sigma. This model went through like 50 times fasters than the previous one without any divergent transition, but it seems did not fit the data so that the weights were just zeros with large std around 10 (e.g. similar to the prior of weights). Could you please help to debug? Thanks a lot.

model #1

data{
  int num_knots;
  vector[60] timecourse_se;
  vector[60] timecourse_obs;
  matrix[60,8] B;
}
parameters{
  vector[60] timecourse_true;
  vector[num_knots] w;
  real<lower=0> sigma;
}
model{
  vector[60] timecourse_bar;
  sigma ~ exponential( 1 );
  w ~ normal( 0 , 10 );
  for ( i in 1:60 ) {
    timecourse_bar[i] = B[i] * w;
  }
  timecourse_true ~ normal( timecourse_bar , sigma );
  timecourse_obs ~ normal( timecourse_true , timecourse_se );
}

Model #2

data{
  int num_knots;
  vector[60] timecourse_se;
  vector[60] timecourse_obs;
  matrix[60,8] B;
}
parameters{
  vector[num_knots] w;
  real zbar;
  real z;
  real<lower=0> sigma;
}
model{
  vector[60] timecourse_obs;
  vector[60] timecourse_true;
  vector[60] timecourse_bar;
  sigma ~ exponential( 1 );
  z ~ normal( 0 , 1 );
  zbar ~ normal( 0 , 1 );
  w ~ normal( 0 , 10 );
  for ( i in 1:60 ) {
    timecourse_bar[i] = B[i] * w;
  }
  for ( i in 1:60 ) {
    timecourse_true[i] = timecourse_bar[i] + zbar * sigma;
  }
  for ( i in 1:60 ) {
    timecourse_obs[i] = timecourse_true[i] + z * timecourse_se[i];
  }
}

I fear you’ve misunderstood what non-centered parametrization is. Non-centered parametrization for a vector of variables x really is just moving from:

parameters {
  vector[N] x;
  real<lower=0> sigma;
}

model {
  x ~ normal(0, sigma);
}

to

parameters {
  vector[N]  x_raw;
  real<lower=0> sigma;
}

transformed parameters {
  vector[N] x = x_raw * sigma;
}

model {
  x_raw ~ normal(0,1);
}

Your model #2 however has a different number of parameters than model #1 and I don’t really understand what is going on in there. The direct reason of your problem is that the model #2 has no likelihood, i.e. the observed data in timecourse_obs is never tied to the parameters.

You redefine timecourse_obs from the data block in the model block which is weird (and the compiler not throwing an error on this is probably a bug).

The problem with model #1 is IMHO that in:

  timecourse_true ~ normal( timecourse_bar , sigma );
  timecourse_obs ~ normal( timecourse_true , timecourse_se );

The timecourse_true parameters are not well behaved. Those can be eliminated - for all inferences, the following should be exactly equivalent.

  real total_sigma = sqrt(sigma ^ 2 + timecourse_se ^ 2);
  timecourse_obs ~ normal( timecourse_bar , total_sigma );

Overall, the model you describe should be within the capabilities of brms, you may find brms easier to
use than Stan. Although if your goal is learning Stan, the writing your own code is definitely good :-)

Hope that helps!

Thanks a lot for the helpful explanation. Indeed I redefine data in model which is weird :) The total sigma trick works, really nice to know it. Yes I’d like to force myself to write the stan code so that I could learn little by little. Using brms might be more efficient, I guess writing in stan let me think and understand fundamentals that I though I knew but not really. Thanks again.

General comment: if you treat knots as fixed, the model uncertainty is still very great when N is small. In other words, the posterior distribution of predicted values will be wide. I’m not sure if it’s worth the trouble of incorporating further sources of uncertainty, since parameter estimation error will likely dominate.

Thanks for the comment. I was trying penalized spline these days in which may deal with this arbitrary knots issue.

What’s more important than the location of knots is their number, when the spline is smooth.