Why are these two postierors so different, ?floating point precision, ?scaling issue

specification

#1

Hi

I’m trying to understand why with the following model cumulative_R has a much narrower posterior than cumulative_ncp, given that I would expect them to be identical.

In this model there are two rates, things that vary at rate N and things that vary at rate day;
cumulative_ncp is the sum of the variable delta varying at N.
while cumulative_R is the sum of delta (as in this cut-down version of the model delta = R) varying at day. both are then scaled to the same units.

In my test case where the sum of delta x 86400 is known to be (-)700.
The final value of cumulative_R is 697.7598-701.9972 (5-95% quantile range) while
cumulative_ncp is -689.3504-704.6502.

data {
  int<lower=0> N;
  int<lower=0> N_day;
  int<lower=0> day[N];
  real C[N];
  real kw[N];
  real B[N];
  real hm[N];
  real dhdt[N];
  real Csat[N];
  real dt;
}

parameters {
  real<lower=0> sigma;
  real<lower=0> Cb[N_day];
  real<lower=0> R[N_day];
}

transformed parameters {
  real C_hat[N-1];
  real delta[N-1];

  for(n in 1:N-1){
    delta[n] = 0 - R[day[n]];
    C_hat[n] = (exp(-(dt*(kw[n] + dhdt[n]))/hm[n]) * (dhdt[n] * (Cb[day[n]] * (exp((dt*(kw[n] + dhdt[n]))/hm[n]) - 1) + C[n]) + (delta[n]) * (exp((dt * (kw[n] + dhdt[n]))/hm[n]) - 1) + kw[n] * ( (Csat[n] * ((B[n]) + 1)) * (exp((dt * (kw[n] + dhdt[n]))/hm[n]) - 1) + C[n])))/(kw[n] + dhdt[n]);
  }
}

model {
 sigma ~ normal(0, 1);
 R ~ normal(0, 1);
 Cb ~ normal(200, 10);

 for(n in 2:N){
   C[n] ~ normal(C_hat[n-1], sigma);
 }
}

generated quantities{
  vector[N-1] cumulative_ncp;
  vector[N_day] cumulative_R;

  for(n in 1:N-1){
    cumulative_ncp[n] = (delta[day[n]]); // mmol m-2 s-1
  }
  cumulative_ncp = cumulative_sum(ncp * dt); // mmol m-2 day-1

  for(d in 1:N_day){
    cumulative_R[d] = (R[d]) * 86400; // mmol m-2 day-1
  }
  cumulative_R = cumulative_sum(cumulative_R);
}

#2

I’m not sure I understand which quantities in the posterior you expect to be the same or why. You have two vector quantities of different lengths, cumulative_ncp and cumulative_R, but you seem to be reporting a single interval. Are you comparing the final entries cumulative_R[N_day] and cumulative_ncp[N - 1]?


#3

Yes, sorry, just the final values, I should have used sum rather than cumulative sum in this example.


#4

That’s much more error than you’d expect from just floating point precision. Are you usre those two quantities you’re computing will be exactly the same?


#5

It should be if I’d coded it correctly. But I haven’t and writing out this explanation has revealed the mistake I’d made.

N has length 336, N_day is 14 and dt is 3600.
There are 24 N for each day

so

for( n in 1:N-1){
    delta[n] = 0 - R[day[n]];
}

should be assigning the same value to delta for each 24 N. i.e. R is constant over a day.

So in the generated quantities block is where I’ve made the mistake:

for(n in 1:N-1{
    ncp[n] = delta[day[n]];
}
cumulative_ncp = cumulative_sum(ncp * dt);

This should assign delta for each n to a new variable, which then gets scaled.

Simply put, it should be:

ncp[n] = delta[n];

Thanks for your time Bob, marking this as solved.