Dear all,

I generated data X from a hierarchical model using a known true parameter \theta^*, then fitted a hierarchical model to the generated data X with PyStan.

I initialized the starting parameters (\theta_{stan}) at the true parameter \theta^*, but the PyStan model has a hard time converging. After I increased max_treedepth (to 20) and adapt_delta (to 0.99), the parameters converged to a wrong value \theta_{stan}.

When I calculate the log probability with the fit.log_prob() function, I find that the true parameter \theta^* does indeed have a higher log prob, yet the model still converges to \theta_{stan}, whose log prob is far lower.

Isn’t it weird that Stan is trying so hard (with adapt_delta at 0.99 and a maximum tree depth of 20) to reach a set of parameters \theta_{stan} with a much lower log_prob? I understand from the previous post that the two sets of parameters \theta_{stan} and \theta^* should be close but not identical. However, I now have two sets of parameters whose log_prob values differ significantly.

It’s probably first worth figuring out why this model isn’t sampling so well.

Regarding the lp thing, can you run the optimizer? The MLE of theta is probably different from the theta used to generate the data too. If this is a hierarchical model, it’s likely the MLE doesn’t exist (the density is unbounded, so the optimization will blow up). In that case I wouldn’t worry about there being points of higher density, because then there’d be points of anywhere up to infinite density.
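
To illustrate the unbounded-density point, here is a minimal sketch in plain Python. It does not use your model: the two-level normal model, the fixed sigma, the made-up data y, and the helper names normal_lpdf and joint_log_density are all assumptions for the sake of the example. It collapses every group-level effect onto the population mean and then shrinks the group scale tau toward zero.

```python
import math

def normal_lpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    return (-math.log(sigma)
            - 0.5 * math.log(2 * math.pi)
            - 0.5 * ((x - mu) / sigma) ** 2)

def joint_log_density(y, theta, mu, tau, sigma=1.0):
    """Log p(y, theta | mu, tau) for an assumed toy model:
    theta_j ~ Normal(mu, tau), y_j ~ Normal(theta_j, sigma)."""
    lp = sum(normal_lpdf(t, mu, tau) for t in theta)
    lp += sum(normal_lpdf(yj, tj, sigma) for yj, tj in zip(y, theta))
    return lp

# Made-up observations, one per group (illustrative only).
y = [2.1, -0.3, 1.4, 0.8]
mu = sum(y) / len(y)

# Collapse every group effect onto the population mean.
theta = [mu] * len(y)

# Shrink the group-level scale toward zero and watch the density grow.
taus = [1.0, 1e-2, 1e-4, 1e-8]
log_densities = [joint_log_density(y, theta, mu, tau) for tau in taus]

for tau, lp in zip(taus, log_densities):
    print(f"tau = {tau:g}  log density = {lp:.2f}")
```

With theta fixed at mu, the only term that changes is -log(tau) per group, so the log density keeps climbing as tau goes to zero. That is why a single point with higher log_prob than your draws does not by itself mean the sampler is wrong.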