BRMS running time

fitting-issues
performance

#1

Dear all,
I am trying to predict the sales of different SKUs using a log-log model. I also need to capture the heterogeneity among brands (I would like a separate price elasticity coefficient for each brand). When I run the model without the random effect term, it takes less than 10 minutes. After including the random effect term, it takes more than 8-9 hours to finish. The problem is that I have to include more random effect terms, so I cannot imagine how much worse it will get. Am I doing something wrong? Here is the call I used:

brm(formula = log(npack) ~ (log1p(sales_price_pack) | brand) + log1p(wd) + log1p(wd_disp), cores = parallel::detectCores(), chains = 2, warmup = 500, data = temp)

Should I try to define a more restrictive prior for the price coefficient, since we know that it has to be negative?


#2

I doubt the model is putting positive mass on price elasticities that are positive. My guess is that the model just does not fit that well and has to use a very small step size to move around the posterior distribution you have defined.


#3

I don't think the model is inappropriate. It's a very basic log-log model for predicting sales. Any other ideas? @paul.buerkner


#4

Your model is missing a population-level (fixed) effect of log1p(sales_price_pack). As a rule of thumb, every group-level effect should have a corresponding population-level effect.
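A minimal sketch of what that could look like with the variables from the original post. The names `model_formula` and `fit_model` are illustrative and not from the thread, and `cores = 2` is only chosen here to match the two chains.

```r
# Population-level (fixed) price slope plus brand-specific deviations around it.
# Variable names (npack, sales_price_pack, wd, wd_disp, brand) come from the
# original post; model_formula and fit_model are hypothetical helper names.
model_formula <- log(npack) ~ log1p(sales_price_pack) + log1p(wd) + log1p(wd_disp) +
  (1 + log1p(sales_price_pack) | brand)

# Wrapper around brms::brm(); requires the brms package and a data frame
# containing the columns named in the formula.
fit_model <- function(data) {
  brms::brm(
    formula = model_formula,
    data = data,
    chains = 2,
    warmup = 500,
    cores = 2  # one core per chain
  )
}
```

Calling `fit_model(temp)` would then fit the model. The `(1 + ... | brand)` term is the same as `(... | brand)` in the original call, since the group-level intercept is implicit.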

Also, please use the brms tag for brms-related questions. Otherwise I may overlook them sometimes.


#5

Thanks a lot, Paul. It seems that it runs much faster now. Is it normal for it to take approximately two and a half hours for just one random effect term?


#6

There is no general rule. It depends on the size of your data, the number of levels of your grouping factor and the overall model complexity (and the fit of model to data).


#7

I see. Stupid question: is it necessary to convert the grouping variable into a factor?


#8

You don't need to transform it.


#9

Sorry for asking so many questions but I am really new to Bayesian Modelling. Should I change the family argument for running a log-log model?


#10

Do you have any reasons to use the clog-log link?


#11

The model I am trying to implement is the SCAN*PRO model, which is not a linear model but can easily be transformed into a log-log linear model. So I was wondering whether it is necessary to change the family or whether I should just keep it as it is now (gaussian).


#12

Without more information on how the log model fits (in terms of pp_check(), for instance), I cannot tell what exactly you should do with your data. In any case, since we don't know your data, it is very hard to give you any advice on specific modeling options (in particular as to what would be "best").
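As a rough sketch of such a check, assuming `fit` is the brmsfit object returned by brm(): the helper name `check_fit` is made up for illustration, and `ndraws` is the argument name in recent brms versions (older versions used `nsamples`).

```r
# check_fit is a hypothetical helper; `fit` is assumed to be a brmsfit object
# returned by brms::brm().
check_fit <- function(fit, ndraws = 100) {
  # Convergence diagnostics (Rhat, effective sample sizes) and estimates.
  print(summary(fit))
  # Overlay densities of replicated outcomes on the observed outcome.
  brms::pp_check(fit, ndraws = ndraws)
}
```

If the replicated densities track the observed log(npack) reasonably well, the gaussian family on the log scale is probably adequate; systematic mismatches would be a reason to look at other families.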


#13

I will send you a subset of the data later on. I would appreciate it if you could have a look at it.


#14

Here is a small subset of the dataset (I have excluded some stores and some brands):
https://expirebox.com/download/f183be51711efb5e2519a0ae1ed00175.html