Not able to derive simulation data parameters

bhomass · August 1, 2017, 6:29pm

I generated simulation data using preset parameters which I attempt to recover using the solver, but they numbers are way off. The summary shows

             Mean     MCSE   StdDev        5%    50%    95%  N_Eff  N_Eff/s    R_hat

lp__ -1619 2.8e-01 5.5e+00 -1.6e+03 -1619 -1610 395 8.2e-02 1.0e+00
accept_stat__ 0.85 4.5e-03 1.4e-01 5.4e-01 0.89 1.0 1000 2.1e-01 1.0e+00
stepsize__ 0.44 3.1e-15 2.2e-15 4.4e-01 0.44 0.44 0.50 1.0e-04 1.0e+00
treedepth__ 3.0 6.7e-03 1.9e-01 3.0e+00 3.0 3.0 824 1.7e-01 1.0e+00
n_leapfrog__ 8.1 1.5e-01 3.9e+00 7.0e+00 7.0 15 707 1.5e-01 1.0e+00
divergent__ 0.00 0.0e+00 0.0e+00 0.0e+00 0.00 0.00 1000 2.1e-01 -nan
energy__ 1642 4.0e-01 7.3e+00 1.6e+03 1641 1654 326 6.8e-02 1.0e+00

Is there anything abnormal about this dump? like that lp___ being a large negative number?

Bob_Carpenter · August 6, 2017, 9:16pm

No, the lp__ is the log density of the posterior up to an additive constant.

bhomass · August 7, 2017, 6:43am

Thanks. On this matter, I actually got a good model running. It’s just taking a hell of a long time to run thru (it will be over two days the way its going). As I understand, there is no way around that by parallelization, at least not easily. If anyone has any less than obvious solution to efficiently train a complex model, please suggest.

Bob_Carpenter · August 7, 2017, 10:56pm

What n_eff are you targeting that’s taking that long?

The only thing to do is either make the program more efficient or reparameterize so it mixes better.

bhomass · August 10, 2017, 6:53am

I am trying to get the basic idea of reparametrization. It appears that what it does is picking some prior distribution to bound a parameter which is otherwise unbounded. However, if I already declare lower and upper bounds on my parameters, then there is really not much I can improve by means of reparameterization.

Bob_Carpenter · August 11, 2017, 11:05pm

There’s a chapter in the manual that explains what’s going on. It doesn’t have to do with bounding—everything happens on the unconstrained scale with Stan’s samplers.

bhomass · August 13, 2017, 7:43pm

I read that chapter. Take the example of Beta Distribution, what it did was convert
theta[n] ~ beta(alpha, beta);
to
theta ~ beta(lambda * phi, lambda * (1 - phi));

where as alpha and beta only has lower bounds, phi has an added upper bound. That is the only place I can see the reparameterization improving over the original. I am not sure how much difference that really makes, but if it did help, it seems to be due to the added upper bound. However, you are saying it does not have to do with bounding, so I don’t know why should the reparameterized version performs better, or how to apply the principle to other examples.

The model also declared that phi and lambda came from beta and pareto distributions, whereas alpha and beta came from some undeclared distributions. If it is the distributions that matters, than the example is not making that point by hiding alpha and beta’s distribution, and I have no intuition why the prior distribution can make a huge difference in computational speed.

Bob_Carpenter · August 14, 2017, 12:29pm

That is a reparameterization of the beta distribution in terms of mean and total count (a kind of precision term). It needs new priors, though, so it’s not a pure reparameterization in the sense that your results will be different.

The main things that reparameterization helps with is turning funnel-like posteriors into more isotropic ones.

Topic		Replies	Views
Reparameterization ("Matt trick") failing to produce a speed improvement in hierarchical model Modeling techniques , cognitive-science	2	1346	June 12, 2018
Initialization fails after reparameterization Modeling	2	465	November 14, 2019
Assessing speed of multilevel model Modeling	1	427	March 25, 2020
Computational Efficiency: advanced techniques? Modeling	1	416	October 4, 2022
Sampling issues in my model: Reparamterization does not seem to work Modeling	3	728	March 23, 2021

Not able to derive simulation data parameters

Related topics