Do you have a smallish model that shows this behavior? I can look on Monday but should easier to debug with a test that shows the bad behavior

Using @mhollanders’s (big) model is the first one with which I see the problem. I don’t rememeber seeing non-finite gradients except far in tails of posterior, but in this case the non-finite gradients are happening towards the higher density. @mhollanders do you have a smaller model + data that breaks Pathfinder?

Hey Aki, I just had a go at simulation just a basic regression with several nested random effects but it works well, so I’m afraid I don’t have an example. All I know is that in my model, it happens with all configurations (SEM vs. MVN) once the site-level random effects are added.

It also seems that Pathfinder failing depends on the data, so that sometimes when I restart R, and rerun data generation, I do not get non-finite gradients.

This example did anyway help to fix two things in the initialization function to make it more robust (one of these is fic for NaN’s and Infs in `lp__`

). I would guess the fix would be included in the `cmdstanr`

point release on next Monday

This doesn’t work for me and instead returns an error:

```
Error in sample.int(length(x), size, replace, prob) :
NA in probability vector
```

@mhollanders and @JLC , can you try with the github version

```
remotes::install_github("stan-dev/cmdstanr")
```

If you do this in a running session, you need to unload and load `cmdstanr`

and call `cmdstan_model()`

again (or just restart R and run your script again). This fix should remove couple errors, but if all your draws have non-finite gradient then the initialization still doesn’t work,

@JLC can you tell more about your model and data, and even better if you can share them so that I can test myself.

Hey Aki, just confirming that I’m still getting `lp__ = Inf`

, but I can use the Pathfinder object as initial values for HMC.

Thanks @avehtari

The github version gives me a different problem, which might be due to me being new to Pathfinder. When running Pathfinder I do see a lot of:

```
Error evaluating model log probability: Non-finite gradient.
```

When trying to use the pathfinder result for init, I’m get:

```
Chain 3 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=peak_z; dims declared=(1,175); dims found=(175)
```

My data list declares an array [175] for peak_z.

If I don’t use the pathfinder object for the init, everything samples fine. I’m not sure where the 1,175 declaration is originating.

My model is a piecewise regression of viral trajectory:

```
functions {
// mufun returns mean Ct:
real mufun(real t, real peak, real peaktime, real pslope, real cslope, real lod){
// Proliferation
if(t <= peaktime)
return(peak + pslope * t);
// Clearance
else
return(peak + (pslope * peaktime + cslope * (t - peaktime)));
}
// durfun returns stage (prolieration and clearance) durations:
real durfun(real t, real peak, real peaktime, real pslope, real cslope, real lod){
// Proliferation
if(t <= peaktime)
return(abs((lod - peak)/pslope));
// Clearance
else
return(abs((lod - peak - pslope * peaktime + cslope * peaktime)/cslope));
}
// stagefun returns stage (prolieration and clearance) indicators:
real stagefun(real t, real peaktime){
if(t <= peaktime)
return(1); // Proliferation
else
return(2); // Clearance
}
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // observed Ct
real<lower=0> lod; // limit of detection
array[N] real lb; // upper truncation bounds
array[N] real ub; // upper truncation bounds
int<lower=1> K_pslope; // number of proliferation slope predictors
matrix[N, K_pslope] X_pslope; // proliferation slope design matrix
int<lower=1> K_cslope; // number of clearance slope predictors
matrix[N, K_cslope] X_cslope; // clearance slope design matrix
int<lower=1> K_peak; // number of peak predictors
matrix[N, K_peak] X_peak; // peak design matrix
int<lower=1> K_peaktime; // number of peak time predictors
matrix[N, K_peaktime] X_peaktime; // peak time design matrix
array[N] int t; // time
int<lower=1> nSubj; // number of subjects
array[N] int<lower=1> Subj; // subject indicator
}
parameters {
vector[K_pslope] b_pslope; // proliferation slope
vector[K_cslope] b_cslope; // clearance slope
vector[K_peak] b_peak; // peak Ct
vector[K_peaktime] b_peaktime; // time of peak Ct
real<lower=0> sigma; // variance
vector<lower=0>[1] peak_sd; // subject-level peak Ct varfiance
array[1] vector[nSubj] peak_z; // individual peak Ct effects
vector<lower=0>[1] peaktime_sd; // subject-level peak Ct time variance
array[1] vector[nSubj] peaktime_z; // individual peak Ct time effects
}
transformed parameters {
vector[nSubj] subj_peak; // subject-specific peak Ct offsets
vector[nSubj] subj_peaktime; // subject-specific peak Ct time offsets
subj_peak = peak_sd[1] * peak_z[1];
subj_peaktime = peaktime_sd[1] * peaktime_z[1];
vector[N] pslope = X_pslope * b_pslope;
vector[N] cslope = X_cslope * b_cslope;
vector[N] peak = X_peak * b_peak + subj_peak[Subj];
vector[N] peaktime = X_peaktime * b_peaktime + subj_peaktime[Subj];
vector[N] mu;
// Calculate location (mu)
for (i in 1 : N) {
mu[i] = mufun(t[i], peak[i], peaktime[i], pslope[i], cslope[i], lod);
}
}
model {
real lprior = 0;
// likelihood
target += normal_lupdf(Y | mu, sigma)
- normal_lcdf(ub | mu, sigma);
// priors
lprior += normal_lupdf(b_pslope | 0, 1);
lprior += normal_lupdf(b_cslope | 0, 1);
lprior += normal_lupdf(b_peak[1] | 18, 10);
lprior += normal_lupdf(b_peak[2] | 18, 10);
lprior += normal_lupdf(b_peak[3] | 18, 10);
lprior += normal_lupdf(b_peak[4:] | 0, 5);
lprior += normal_lupdf(b_peaktime | 0, 0.25);
lprior += normal_lupdf(sigma | 0, 4);
lprior += normal_lupdf(peak_sd | 0, 10);
lprior += normal_lupdf(peaktime_sd | 0, 3);
target += lprior;
target += std_normal_lupdf(peak_z[1]);
target += std_normal_lupdf(peaktime_z[1]);
}
generated quantities {
vector[N] log_lik;
vector[N] ct_pred;
vector[N] Duration;
vector[N] Stage;
for (i in 1 : N) {
//Log likelihood
log_lik[i] = normal_lpdf(Y[i] | mu[i], sigma)
- normal_lcdf(ub[i] | mu[i], sigma);
// Predicted Ct's
// ct_pred[i] = normal_lub_rng(mu[i], sigma, 0.0, 37.0);
// Calculate durations by row
Duration[i] = durfun(t[i], peak[i], peaktime[i], pslope[i], cslope[i], lod);
// Create stage indicator per row
Stage[i] = stagefun(t[i],peaktime[i]);
}
}
```

Test with the newest development version. I had the same error for a model generated with brms, but it works now. This is the fix: Fixes 975 by only removing leftmost array dimension if equal to 1 by SteveBronder · Pull Request #993 · stan-dev/cmdstanr · GitHub