I am attempting to apply a Bayesian mixed model to estimate the impact of temperature on a home’s energy consumption.
priors <- c(set_prior("lognormal(0.01494647, 0.00937701)", class = "b", coef = "Temp"))
ins_month <- brm(use ~ Temp + (1 + Temp | Month.f),
                 data = dpa.ins.home, family = gaussian(link = "log"),
                 warmup = 100, iter = 200, chains = 2, inits = "random",
                 control = list(adapt_delta = .95, max_treedepth = 12),
                 prior = priors, cores = 2, sample_prior = TRUE)
While I am comfortable with the formulation of mixed-effects regression, I am struggling with the details of the brms syntax, specifically the priors and families. Given the information below, am I using priors and families correctly?
energy_use has a beta distribution
min: 0.1473333 max: 10.4126
estimated sd: 1.475063
estimated skewness: 1.857041
estimated kurtosis: 6.252937
The prior for Temp is closest to a lognormal distribution
min: 0 max: 0.1714414
estimated sd: 0.009377701
estimated skewness: 2.071458
estimated kurtosis: 12.71367
- Operating System: Windows 10
- brms Version:
The formula syntax isn’t needed for priors or families. How you defined the prior there looks correct. You can check which priors the model actually used by calling prior_summary() on the fitted model.
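One thing worth double-checking on the prior itself: a lognormal(mu, sigma) prior takes the mean and standard deviation of the log of the coefficient, not of the coefficient. As written, lognormal(0.01494647, 0.00937701) concentrates the prior near exp(0.0149) ≈ 1.015. If those two numbers are really the mean and sd of the Temp effect on its natural scale (an assumption here, the post does not say), they would need converting first. A minimal sketch of that conversion in base R, where `lognormal_params` is a hypothetical helper name:

```r
# Convert a mean and sd on the natural scale into the meanlog/sdlog
# parameters that a lognormal distribution expects (moment matching).
lognormal_params <- function(mean, sd) {
  sdlog <- sqrt(log(1 + (sd / mean)^2))
  meanlog <- log(mean) - sdlog^2 / 2
  c(meanlog = meanlog, sdlog = sdlog)
}

# Assumption: these are the natural-scale mean and sd of the Temp effect.
p <- lognormal_params(mean = 0.01494647, sd = 0.009377701)

# Round-trip check: recover the natural-scale moments from the parameters.
implied_mean <- exp(p[["meanlog"]] + p[["sdlog"]]^2 / 2)
implied_sd <- sqrt((exp(p[["sdlog"]]^2) - 1) *
                     exp(2 * p[["meanlog"]] + p[["sdlog"]]^2))

# Prior string that could be passed to set_prior().
prior_string <- sprintf("lognormal(%.4f, %.4f)", p[["meanlog"]], p[["sdlog"]])
```

The resulting `prior_string` could then replace the string passed to set_prior(), and prior_summary() would show whether the model picked it up.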
You say that use has a beta distribution, but the model code family = gaussian(link="log") uses a gaussian family with a log link. To use the beta family, you should use family = Beta(). See ?brms::brmsfamily for a description of the supported families.
When using family = Beta() I get an error:
Error: Family ‘beta’ requires response smaller than 1.
So the response needs to be under 1. Is there a way to mitigate this? I guess I could express use in MWh instead of kWh, but then wouldn’t I have issues with the coefficients being really small?
That family is still for values between 0 and 1. Zero_inflated_beta is just for when there are also zeroes and ones in the data. I have values from .001 to 9.6 to consider.
Then you don’t have a Beta distribution. Is it possible that you meant Gamma?
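For a strictly positive, right-skewed response, a Gamma family with a log link is a common choice. A minimal sketch of how the model from the question might look with that family; `fit_gamma_energy` is a hypothetical wrapper, the data frame is assumed to have the columns use, Temp and Month.f, and the sampler settings are placeholders, not recommendations:

```r
# Hypothetical wrapper around brms::brm(); nothing is fitted until it is called.
# `data` is assumed to contain the columns use, Temp and Month.f.
fit_gamma_energy <- function(data) {
  brms::brm(
    use ~ Temp + (1 + Temp | Month.f),
    data = data,
    family = Gamma(link = "log"),  # positive, right-skewed response
    chains = 2, cores = 2
  )
}

# The family object itself comes from base R (stats::Gamma).
gamma_family <- Gamma(link = "log")
```

Calling fit_gamma_energy(dpa.ins.home) would fit the model. Swapping the family for brms::lognormal() gives the lognormal alternative with the same formula.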
Meant Beta because of the following. Am I missing something from this result?
This plot only suggests that the skewness and kurtosis of your data are compatible with a Beta distribution. But a Beta distribution is clearly not appropriate for your data, because the support of the Beta distribution is (0,1). Instead of (or in addition to) using such heuristic checks, you may also try to think about the generative process that you think gave rise to the observed data. As a first step, it is always helpful to look at a histogram of your responses.
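A quick range check makes the support argument concrete. This is a rough sketch in base R; `check_support` is a hypothetical helper, and the two values passed to it are the minimum and maximum of use reported in the question:

```r
# Report which families are compatible with the range of a response.
check_support <- function(y) {
  c(
    beta = all(y > 0 & y < 1),       # Beta needs values strictly inside (0, 1)
    gamma_or_lognormal = all(y > 0)  # Gamma and lognormal need positive values
  )
}

# The minimum and maximum of `use` reported in the question.
support <- check_support(c(0.1473333, 10.4126))
```

On the real data, check_support(dpa.ins.home$use) and hist(dpa.ins.home$use) would give the same check together with the histogram.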
Oh wow, thank you for that clarification regarding the plot! The observed data is the energy use of a home. When you say the generative process, do you mean what creates the energy use of a home? If so, how does that translate to understanding the distribution of the data?
For example, you may think that energy use depends on many factors with unknown mean and variance, and that all factors together multiplicatively produce the outcome. In such a case, you should perhaps be inclined towards choosing a lognormal distribution.
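The multiplicative argument can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the number of factors, their uniform distribution and its range are all hypothetical and have nothing to do with the real energy data.

```r
set.seed(1)

n_homes <- 5000   # hypothetical number of simulated observations
n_factors <- 30   # hypothetical number of independent positive factors

# Each row holds the factors for one observation; the outcome is their product.
factors <- matrix(runif(n_homes * n_factors, min = 0.8, max = 1.25),
                  nrow = n_homes)
outcome <- apply(factors, 1, prod)

# Sample skewness, used here only to compare the two scales.
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew_raw <- skewness(outcome)       # right-skewed on the original scale
skew_log <- skewness(log(outcome))  # roughly symmetric on the log scale
```

The log of a product is a sum of logs, so log(outcome) is a sum of many independent terms and is approximately normal, which is what makes the outcome itself approximately lognormal.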