Large model with skewed distributions

The following is a simple multilevel model estimating the effects of location and product on transaction distributions. The results are good (I use the fitted parameters to simulate transactions and test them against actual transactions), but it takes a long time to run. I'm hoping for suggestions on how to improve the approach and performance. Thanks:

data {
  int N;                           // number of observations
  vector[N] y;                     // observations

  int J;                           // number of groups - Product
  int PGrp[N];                     // Product group indexes

  int K;                           // number of groups - Location
  int LGrp[N];                     // Location group indexes

  real p_Mu;
  real p_sigma;
  real p_alpha;

  vector[J] p_PGrpMu;
  vector<lower=0>[J] p_sigma_PGrp;
  vector[J] p_alpha_PGrp;

  vector[K] p_LGrpMu;
  vector<lower=0>[K] p_sigma_LGrp;
  vector[K] p_alpha_LGrp;

  real muSigma;
  real sigmaSigma;
  real alphaSigma;

  real pGrpMuSigma;
  real pGrpSigmaSigma;
  real pGrpAlphaSigma;

  real lGrpMuSigma;
  real lGrpSigmaSigma;
  real lGrpAlphaSigma;
}

parameters {
  real mu;
  real<lower=0> sigma;
  real alpha;

  vector[J] PGrpMu;
  vector<lower=0>[J] sigma_PGrp;
  vector[J] alpha_PGrp;

  vector[K] LGrpMu;
  vector<lower=0>[K] sigma_LGrp;
  vector[K] alpha_LGrp;
}

model {
  target += cauchy_lpdf(mu | p_Mu, muSigma);
  target += cauchy_lpdf(PGrpMu | p_PGrpMu, pGrpMuSigma);
  target += cauchy_lpdf(LGrpMu | p_LGrpMu, lGrpMuSigma);

  target += cauchy_lpdf(sigma | p_sigma, sigmaSigma);
  target += cauchy_lpdf(sigma_PGrp | p_sigma_PGrp, pGrpSigmaSigma);
  target += cauchy_lpdf(sigma_LGrp | p_sigma_LGrp, lGrpSigmaSigma);

  target += cauchy_lpdf(alpha | p_alpha, alphaSigma);
  target += cauchy_lpdf(alpha_PGrp | p_alpha_PGrp, pGrpAlphaSigma);
  target += cauchy_lpdf(alpha_LGrp | p_alpha_LGrp, lGrpAlphaSigma);

  target += skew_normal_lpdf(y | mu + PGrpMu[PGrp] + LGrpMu[LGrp],
                             sigma + sigma_PGrp[PGrp] + sigma_LGrp[LGrp],
                             alpha + alpha_PGrp[PGrp] + alpha_LGrp[LGrp]);
}


You are using target += with the _lpdf functions, which includes the proportionality constants in the log density. For example:

target += cauchy_lpdf(mu | p_Mu, muSigma);

The equivalent sampling statement drops the proportionality constant, which makes each evaluation slightly cheaper:

mu ~ cauchy(p_Mu, muSigma);
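Applied to the whole model above, the sampling-statement form would look like this (same model and same posterior, just without the constant terms):

```stan
model {
  // sampling statements drop constant terms in the log density
  mu    ~ cauchy(p_Mu, muSigma);
  sigma ~ cauchy(p_sigma, sigmaSigma);
  alpha ~ cauchy(p_alpha, alphaSigma);

  PGrpMu     ~ cauchy(p_PGrpMu, pGrpMuSigma);
  sigma_PGrp ~ cauchy(p_sigma_PGrp, pGrpSigmaSigma);
  alpha_PGrp ~ cauchy(p_alpha_PGrp, pGrpAlphaSigma);

  LGrpMu     ~ cauchy(p_LGrpMu, lGrpMuSigma);
  sigma_LGrp ~ cauchy(p_sigma_LGrp, lGrpSigmaSigma);
  alpha_LGrp ~ cauchy(p_alpha_LGrp, lGrpAlphaSigma);

  y ~ skew_normal(mu + PGrpMu[PGrp] + LGrpMu[LGrp],
                  sigma + sigma_PGrp[PGrp] + sigma_LGrp[LGrp],
                  alpha + alpha_PGrp[PGrp] + alpha_LGrp[LGrp]);
}
```

Note that the dropped constants only affect the reported lp__, not the posterior draws.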

If your model is well behaved (no divergences or other warnings), you could try decreasing adapt_delta, which will increase the step size and thus require fewer gradient evaluations. It is somewhat risky, as you could start seeing divergences that you didn't have before.

Perhaps someone else can say whether replacing those Cauchy priors with something else could lead to a speedup; I don't have direct experience with that.

For reasonable models, decreasing adapt_delta below 0.8 should generally lower performance. The step size will be larger and fewer gradient evaluations will be needed per iteration, but the trajectories will be less accurate and will explore the relevant parts of the space more poorly.

Cauchys are almost always a bad idea, especially as priors. There may also be poor identifiability between the additive components, which more informative priors could help resolve.
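As an illustration of that suggestion, here is a sketch of what swapping the heavy-tailed Cauchy priors for lighter-tailed ones might look like. The normal/student_t choices and the reuse of the existing scale parameters are assumptions for illustration, not recommendations from the thread:

```stan
model {
  // illustrative replacements for the cauchy priors; distributions are placeholders
  mu    ~ normal(p_Mu, muSigma);              // much lighter tails than cauchy
  sigma ~ student_t(3, p_sigma, sigmaSigma);  // heavy-ish tails, finite variance
  alpha ~ normal(p_alpha, alphaSigma);

  PGrpMu     ~ normal(p_PGrpMu, pGrpMuSigma);
  sigma_PGrp ~ student_t(3, p_sigma_PGrp, pGrpSigmaSigma);
  alpha_PGrp ~ normal(p_alpha_PGrp, pGrpAlphaSigma);
  // ... likewise for the Location-level terms
}
```

Tighter priors like these also constrain the additive mu + PGrpMu + LGrpMu decomposition, which is where the identifiability problem lives.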

Hey @EllenIAH, the obvious suggestion here is to parallelise your model using the map_rect function. There is a minimalist tutorial here.

It'll essentially parallelise the log-probability evaluation and so drastically decrease the wall-clock time your model takes to run.
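To make the shape of a map_rect model concrete, here is a heavily simplified sketch for the likelihood above. It shards only the observations and, for brevity, ignores the group-level terms (in the real model the per-shard group indexes would go in x_i and the group effects would be folded into the shard function); the shard count and packing scheme are assumptions:

```stan
functions {
  // log-likelihood contribution of one shard of observations;
  // phi = shared parameters (mu, sigma, alpha), x_r = this shard's y values
  vector shard_lp(vector phi, vector theta, real[] x_r, int[] x_i) {
    return [skew_normal_lpdf(to_vector(x_r) | phi[1], phi[2], phi[3])]';
  }
}
transformed data {
  int S = 4;              // number of shards (assumes N is divisible by S)
  int M = N / S;
  real x_r[S, M];         // observations split into equal shards
  int x_i[S, 0];          // no integer data needed in this sketch
  for (s in 1:S)
    x_r[s] = to_array_1d(segment(y, (s - 1) * M + 1, M));
}
model {
  vector[0] theta[S];     // no shard-specific parameters in this sketch
  // evaluates the S shard log-likelihoods in parallel
  target += sum(map_rect(shard_lp, [mu, sigma, alpha]', theta, x_r, x_i));
}
```

You also need to compile with threading (or MPI) enabled and set the number of threads, otherwise map_rect runs serially.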
