Why are these two postierors so different, ?floating point precision, ?scaling issue


I’m trying to understand why with the following model cumulative_R has a much narrower posterior than cumulative_ncp, given that I would expect them to be identical.

In this model there are two rates, things that vary at rate N and things that vary at rate day;
cumulative_ncp is the sum of the variable delta varying at N.
while cumulative_R is the sum of delta (as in this cut-down version of the model delta = R) varying at day. both are then scaled to the same units.

In my test case where the sum of delta x 86400 is known to be (-)700.
The final value of cumulative_R is 697.7598-701.9972 (5-95% quantile range) while
cumulative_ncp is -689.3504-704.6502.

data {
  int<lower=0> N;
  int<lower=0> N_day;
  int<lower=0> day[N];
  real C[N];
  real kw[N];
  real B[N];
  real hm[N];
  real dhdt[N];
  real Csat[N];
  real dt;

parameters {
  real<lower=0> sigma;
  real<lower=0> Cb[N_day];
  real<lower=0> R[N_day];

transformed parameters {
  real C_hat[N-1];
  real delta[N-1];

  for(n in 1:N-1){
    delta[n] = 0 - R[day[n]];
    C_hat[n] = (exp(-(dt*(kw[n] + dhdt[n]))/hm[n]) * (dhdt[n] * (Cb[day[n]] * (exp((dt*(kw[n] + dhdt[n]))/hm[n]) - 1) + C[n]) + (delta[n]) * (exp((dt * (kw[n] + dhdt[n]))/hm[n]) - 1) + kw[n] * ( (Csat[n] * ((B[n]) + 1)) * (exp((dt * (kw[n] + dhdt[n]))/hm[n]) - 1) + C[n])))/(kw[n] + dhdt[n]);

model {
 sigma ~ normal(0, 1);
 R ~ normal(0, 1);
 Cb ~ normal(200, 10);

 for(n in 2:N){
   C[n] ~ normal(C_hat[n-1], sigma);

generated quantities{
  vector[N-1] cumulative_ncp;
  vector[N_day] cumulative_R;

  for(n in 1:N-1){
    cumulative_ncp[n] = (delta[day[n]]); // mmol m-2 s-1
  cumulative_ncp = cumulative_sum(ncp * dt); // mmol m-2 day-1

  for(d in 1:N_day){
    cumulative_R[d] = (R[d]) * 86400; // mmol m-2 day-1
  cumulative_R = cumulative_sum(cumulative_R);

I’m not sure I understand which quantities in the posterior you expect to be the same or why. You have two vector quantities of different lengths, cumulative_ncp and cumulative_R, but you seem to be reporting a single interval. Are you comparing the final entries cumulative_R[N_day] and cumulative_ncp[N - 1]?

Yes, sorry, just the final values, I should have used sum rather than cumulative sum in this example.

That’s much more error than you’d expect from just floating point precision. Are you usre those two quantities you’re computing will be exactly the same?

It should be if I’d coded it correctly. But I haven’t and writing out this explanation has revealed the mistake I’d made.

N has length 336, N_day is 14 and dt is 3600.
There are 24 N for each day


for( n in 1:N-1){
    delta[n] = 0 - R[day[n]];

should be assigning the same value to delta for each 24 N. i.e. R is constant over a day.

So in the generated quantities block is where I’ve made the mistake:

for(n in 1:N-1{
    ncp[n] = delta[day[n]];
cumulative_ncp = cumulative_sum(ncp * dt);

This should assign delta for each n to a new variable, which then gets scaled.

Simply put, it should be:

ncp[n] = delta[n];

Thanks for your time Bob, marking this as solved.