BRMS running time

onoufrios · June 8, 2018, 12:14pm

Dear all,
I am trying to predict the sales of different SKUs using a log-log model. However, I also need to capture the heterogeneity among the different brands (I would like a separate price elasticity coefficient for each brand). When I run the model without the random effect term, it takes less than 10 minutes. After including the random effect term, it takes more than 8-9 hours to finish. The problem is that I have to include more random effect terms, so I cannot imagine how much worse it will get. Am I doing something wrong? Here is the formula I used:

brm(formula = log(npack) ~ (log1p(sales_price_pack) | brand) + log1p(wd) + log1p(wd_disp), cores = parallel::detectCores(), chains = 2, warmup = 500, data = temp)

Should I try to define a more restrictive prior for the price since we know that the coefficient has to be negative?

bgoodri · June 8, 2018, 1:26pm

I doubt the model is putting positive mass on price elasticities that are positive. My guess is that the model just does not fit that well and has to use a very small stepsize to get around the posterior distribution you have defined.

onoufrios · June 8, 2018, 4:10pm

I don’t think the model is inappropriate. It’s a very basic log-log model for predicting sales. Any other ideas? @paul.buerkner

paul.buerkner · June 8, 2018, 4:17pm

your model is missing a population level (fixed) effect of log1p(sales_price_pack). as a rule of thumb, every group level effect should have a corresponding population level effect.
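
For anyone reading along, here is a rough sketch of what the formula from the first post might look like with the population-level price term added. The variable names come from that post; `elasticity_formula` and `fit_elasticity_model` are made-up names for illustration only, and the sampler settings simply mirror the original call, so treat this as one possible way to write it rather than a tested fix.

```r
# Population-level price slope plus brand-specific deviations around it.
# (1 + x | brand) is the explicit form of (x | brand): the intercept is implied.
elasticity_formula <- log(npack) ~ log1p(sales_price_pack) +
  log1p(wd) + log1p(wd_disp) +
  (1 + log1p(sales_price_pack) | brand)

# Hypothetical helper: `data` is assumed to be a data frame like `temp`
# from the original post (columns npack, sales_price_pack, wd, wd_disp, brand).
fit_elasticity_model <- function(data) {
  brms::brm(
    formula = elasticity_formula,
    data = data,
    chains = 2,
    warmup = 500,
    cores = 2  # one core per chain; more cores than chains are not used
  )
}
```

With this structure the brand-level slopes are estimated as deviations from a common elasticity instead of being centred on zero, which is usually the easier geometry for the sampler.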

also please use the brms tag for brms related questions. otherwise I may overlook them sometimes.

onoufrios · June 9, 2018, 10:13am

Thanks a lot Paul. It seems that it runs much faster now. Is it normal for it to take approximately two and a half hours for just one random effect term?

paul.buerkner · June 9, 2018, 10:30am

There is no general rule. It depends on the size of your data, the number of levels of your grouping factor and the overall model complexity (and the fit of model to data).

onoufrios · June 9, 2018, 11:09am

I see. Stupid question: is it necessary to transform the grouping variable into a factor?

paul.buerkner · June 9, 2018, 11:53am

you don’t need to transform it.

onoufrios · June 12, 2018, 7:57am

Sorry for asking so many questions but I am really new to Bayesian Modelling. Should I change the family argument for running a log-log model?

paul.buerkner · June 12, 2018, 8:42am

Do you have any reasons to use the clog-log link?

onoufrios · June 12, 2018, 9:00am

The model I am trying to implement is the SCAN*PRO model, which is not a linear model but can easily be transformed into a log-log linear model. Thus, I was wondering whether it is necessary to change the family or whether I should just keep it as it is now (gaussian).

paul.buerkner · June 12, 2018, 9:32am

Without more information on how the log model fits (in terms of pp_check() for instance), I cannot tell what you should do exactly with your data. In any case, since we don’t know your data, it is very hard to give you any advice on specific modeling options (in particular as to what would be “best”).
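
As a rough illustration of the kind of check meant here, something along these lines could be run on the fitted model. `check_log_model` is a made-up helper name and `fit` is assumed to be the brmsfit object returned by brm(); note that older brms versions call the `ndraws` argument `nsamples`.

```r
# Hypothetical helper: two posterior predictive checks for a fitted model.
# `fit` is assumed to be a brmsfit object returned by brm().
check_log_model <- function(fit, ndraws = 100) {
  list(
    # Density of the observed response overlaid with simulated responses
    density = brms::pp_check(fit, type = "dens_overlay", ndraws = ndraws),
    # Observed mean compared with the distribution of simulated means
    mean = brms::pp_check(fit, type = "stat", stat = "mean", ndraws = ndraws)
  )
}
```

If the simulated densities sit far from the observed one on the log scale, that would be a hint that the gaussian family on log(npack) is not describing the data well.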

onoufrios · June 12, 2018, 9:48am

I will send you a subset of the data later on. I would appreciate it if you could have a look at it.

onoufrios · June 13, 2018, 11:00am

Here is a small subset of the dataset (I have excluded some stores and some brands):
https://expirebox.com/download/f183be51711efb5e2519a0ae1ed00175.html

Souheyla_GHEBGHOUB · December 25, 2018, 11:12am

I have 1 DV, 9 IVs, and 2 random effects. How long am I expected to wait for it to finish, please? It’s taking a long time :(

paul.buerkner · December 25, 2018, 5:46pm

It is very hard to predict the running time of a Stan model so you won’t get any predictions from me even if you provided all the information that would determine the run time. In any case, the information you provided doesn’t tell much about the complexity of your model or data. For instance, the brms code as well as the number of observations are critical in determining the run time.

What is “a long time” in your understanding?

Souheyla_GHEBGHOUB · December 25, 2018, 5:55pm

Thank you Paul for your instant reply.
I have 2900 observations of 11 variables.
True, I admit you have a point in what you explained, and there is nothing I can do except keep waiting.

My definition for “long” is more than 12 hours.

paul.buerkner · December 25, 2018, 6:14pm

Hmm, that’s long for 2900 observations. What does your model (i.e. the brms code) look like?

Souheyla_GHEBGHOUB · December 25, 2018, 6:25pm

My model predicts change from pretest to posttest (levels = gain, no.gain, decline), using Pretest and Group as IVs, plus other covariates that affect learning.
There are 145 participants, each with 20 outcomes on tested words (= 2900 observations).
modA <- brm(
Change ~ Pretest + Group + PoS + Verbal_freq + Band_freq + Imageability + Cognateness + Characters + Syllables + (1|Subject) + (1| Word),
data = wf.ChangeDV.PreCov,
family = "categorical")
summary(modA)

paul.buerkner · December 25, 2018, 6:27pm

Family categorical often has sampling problems when using the default improper priors on the regression coefficients. Try out adding something like

prior = prior(normal(0, 5), class = "b")
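
Put together with the model from the previous post, this might look roughly like the sketch below. The formula is copied from that post; `change_formula`, `list_prior_slots` and `fit_change_model` are made-up names for illustration, and the chain and core counts are assumptions. One caveat stated as an assumption: depending on the brms version, priors for a categorical model may have to be set per outcome category through the `dpar` argument, so it is worth looking at the get_prior() output first to see which classes and `dpar` names are actually available.

```r
# Formula copied from the post above.
change_formula <- Change ~ Pretest + Group + PoS + Verbal_freq + Band_freq +
  Imageability + Cognateness + Characters + Syllables +
  (1 | Subject) + (1 | Word)

# Hypothetical helper: list the parameter classes (and per-category `dpar`
# names) that accept a prior for this model and data.
list_prior_slots <- function(data) {
  brms::get_prior(change_formula, data = data, family = brms::categorical())
}

# Hypothetical helper: fit with a weakly informative prior on the
# regression coefficients instead of the default flat prior.
# `data` is assumed to be a data frame like wf.ChangeDV.PreCov.
fit_change_model <- function(data,
                             coef_prior = brms::set_prior("normal(0, 5)", class = "b")) {
  brms::brm(
    change_formula,
    data = data,
    family = brms::categorical(),
    prior = coef_prior,
    chains = 4,  # assumed; not stated in the thread
    cores = 4    # assumed; one core per chain
  )
}
```

If brm() complains that the prior does not correspond to any model parameter, passing a `coef_prior` built from the `dpar` names shown by list_prior_slots() would be the next thing to try.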
