Mystery: chains mixing well when using average, but not when using single observations?

Hi there,

If you could please help me to solve a little mistery. We used to have a code to fit a ODE model on average observed values. This worked well. Then we tried to use the actual time series. Here it doesnt work. I was wondering if something was wrong in the individual data, but in reality even if I duplicate a number of time the average data, it still doesnt work…

To extend the model to multiple observations, I had to modify as follows (I am only adding key parts):

For average based model:

data{
int<lower = 0> Ntimes; // num timepoints
int<lower = 0> Nd; // num doses

real<lower = 0> y[Ntimes,Nd];
real<lower = 0> sigma;
…}

parameters {
real<lower = 0> theta[2];
}
transformed parameters{ …
real O[Ntimes,Nd];

O = function(theta);
}
model {
theta[1]~lognormal(-11.5,2);
theta[2]~lognormal(-9.2,2);

         for (k in 1:Nd) {
             target +=    normal_lpdf(y[ , k] | O[,k], sigma);
              }

The above works well. Typically I get pairs like this:

Now, if try yo modfy for instance as follows (I’ve tried many formats for the new y data):

data{
int<lower = 0> Ntimes; // num timepoints unchanged
int<lower = 0> Nd; // num doses unchanged

vector<lower = 0>[Nobs] y[Ntimes,Nd]; //each time and dose combination now contains a vector
// with all observations
real<lower = 0> sigma; // unchanged
…}

parameters {
real<lower = 0> theta[2]; // unchanged
}
transformed parameters{ …
real O[Ntimes,Nd]; // unchanged

O = function(theta); // unchanged
}
model {
theta[1]~lognormal(-11.5,2); // unchanged
theta[2]~lognormal(-9.2,2); // unchanged

       for (k in 1:Nd) {
          for (i in 1:Ntimes) {
           target +=    normal_lpdf(y[i,k] | O[i,k], sigma);
                                     }
                               }

}

For which I get something like that…

What’ going on?? Many thanks in advance ;-)

Gianni

Just wondering if anyone can enlighten us - I havent found a solution yet…

So, based on the scatterplot, there are some multimodality issues, right?

How is your data? How far from the mean are the individual observations?

Edit.
I think you should do posterior predictive checks.

Hi ahartikainen,

Thans for your reply. The coefficient of variation from the data is less than 0.1 - so its pretty close to the average.

The mean parameters are order of magnitudes far from what I got when running with the average (parameters are quite tight there). And the corresponding solution from these parameters is not even close to the data. I dont think youll get more insight from a full posterior check given the level of discrepancy.

My feeling is that for some reason when jumbing into 3 dimensional arrays there is an issue that arise and the posterior does not reflect anymore the solution.

Is it possible anything is going on at the sampling level with the 3d arrays?

If this is in the first model

for (k in 1:Nd) {
    target +=    normal_lpdf(y[ , k] | O[,k], sigma);
}

then shouldn’t the loops in the second case be

for (k in 1:Nd) {
    for (i in 1:Nobs) {
        target +=    normal_lpdf(y[,k,i] | O[,k], sigma);
    }
}

I would still check prior_predictive and posterior_predictive results.