# Challenge with Formula Syntax

I am attempting to apply a Bayesian mixed model to estimate the impact of temperature on a home’s energy consumption.

``````
priors <- c(set_prior("lognormal(0.01494647, 0.00937701)", class = "b", coef = "Temp"))

ins_month <- brm(use ~ Temp + (1 + Temp | Month.f),
  data = dpa.ins.home, family = gaussian(link = "log"),
  warmup = 100, iter = 200, chains = 2, inits = "random",
  control = list(adapt_delta = .95, max_treedepth = 12),
  prior = priors, cores = 2, sample_prior = TRUE)
``````

While I am comfortable with formulating mixed-effects regressions, I am struggling with the details of the brms syntax, specifically priors and families. Given the information below, am I using priors and families correctly?

energy_use has a beta distribution

``````
> descdist(ACH_test$use)
summary statistics
----------
min:  0.1473333   max:  10.4126
median:  0.6733333
mean:  1.349991
estimated sd:  1.475063
estimated skewness:  1.857041
estimated kurtosis:  6.252937
``````

The prior for Temp is closest to a lognormal distribution

``````
> descdist(temp$T.Impact)
summary statistics
------
min:  0   max:  0.1714414
median:  0.01265037
mean:  0.01494647
estimated sd:  0.009377701
estimated skewness:  2.071458
estimated kurtosis:  12.71367
``````
• Operating System: Windows 10
• brms Version:

The formula syntax isn’t needed for priors or families. How you defined the prior there looks correct. You can check which priors the model actually used by calling `prior_summary()` on the fitted model.
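One thing worth double-checking on the prior: in Stan (and therefore brms), `lognormal(mu, sigma)` takes the mean and sd of the log of the variable, not of the variable itself. The prior in the question plugs in the mean and sd that `descdist()` reported on the original scale. Below is a minimal sketch of a moment-matching conversion, assuming those two summary values are what the prior should reproduce (the variable names are mine, not from the original post):

```r
# Mean and sd of T.Impact on the original scale, as reported by descdist()
m <- 0.01494647
s <- 0.009377701

# Moment matching: find the log-scale parameters whose lognormal
# distribution has mean m and standard deviation s
sdlog   <- sqrt(log(1 + (s / m)^2))
meanlog <- log(m) - sdlog^2 / 2

# Round trip: recover the original-scale mean and sd from the log-scale values
mean_back <- exp(meanlog + sdlog^2 / 2)
sd_back   <- sqrt((exp(sdlog^2) - 1) * exp(2 * meanlog + sdlog^2))

# String that could be passed to set_prior(); shown here only as text
prior_string <- sprintf("lognormal(%.4f, %.4f)", meanlog, sdlog)
print(prior_string)
```

Whether moment matching is the right way to turn those summaries into a prior is a judgment call; the point is only that the two arguments live on the log scale.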

You say that `use` has a beta distribution, but the model code `family = gaussian(link="log")` uses a gaussian family with a log link. To use the beta family, you should use `Beta()`. See `?brms::brmsfamily` for a description of the supported families.

When using `family = Beta()` I get an error:
Error: Family ‘beta’ requires response smaller than 1.
Is there a way to mitigate this? I guess I could express use in MWh instead of kWh, but then wouldn’t I have issues with the coefficients being really small?

There’s a zero_one_inflated_beta family: https://cran.r-project.org/web/packages/brms/vignettes/brms_families.html#zero-inflated-and-hurdle-models

This is still for values between 0 and 1. `zero_one_inflated_beta` only covers the case where the data also contain exact zeroes and ones. I have values from .001 to 9.6 to consider.

Then you don’t have a Beta distribution. Is it possible that you meant Gamma?

Meant Beta because of the following. Am I missing something from this result?

This plot only suggests that the skewness and kurtosis of your data are compatible with a Beta distribution. But a Beta distribution is clearly not appropriate for your data, because the support of the Beta is (0, 1). Instead of (or in addition to) using such heuristic checks, you may also try to think about the generative process that you believe gave rise to the observed data. As a first step, it is always helpful to look at a histogram of your responses.
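To make the support argument concrete, here is a small sketch of a check that could be run before picking a family. The helper `support_check()` is hypothetical (not part of brms or fitdistrplus), and it is applied only to the min and max that `descdist()` reported for `use` earlier in the thread:

```r
# Hypothetical helper: which supports is a response vector compatible with?
support_check <- function(x) {
  c(
    unit_interval = all(x > 0 & x < 1),  # required by Beta
    positive      = all(x > 0)           # required by Gamma and lognormal
  )
}

# Min and max of `use` as reported by descdist() in the question
use_range <- c(0.1473333, 10.4126)
res <- support_check(use_range)
print(res)

# With the full data one would also look at the histogram, e.g.:
# hist(dpa.ins.home$use, breaks = 30)
```

Since the maximum exceeds 1, the Beta family is ruled out on support alone, while families for positive responses remain candidates.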

Oh wow, thank you for that clarification regarding the plot! The observed data is the energy use of a home. When you say the generative process, do you mean whatever creates the energy use of a home? If so, how does that translate to understanding the distribution of the data?

For example, you may think that energy use depends on many factors with unknown mean and variance, and that all factors together multiplicatively produce the outcome. In such a case, you should perhaps be inclined towards choosing a lognormal distribution.
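A quick simulation can illustrate this reasoning. The sketch below is entirely made up (the number of factors and their uniform range are arbitrary assumptions, not estimates from the energy data): it multiplies many independent positive factors and compares the skewness of the result before and after taking logs.

```r
set.seed(42)

n_homes   <- 5000  # hypothetical number of simulated homes
n_factors <- 30    # hypothetical number of multiplicative factors

# Each factor scales the outcome by a random amount between 0.8 and 1.25
factors <- matrix(runif(n_homes * n_factors, min = 0.8, max = 1.25),
                  nrow = n_homes)

# Simulated outcome: the product of all factors for each home
use_sim <- apply(factors, 1, prod)

# Simple moment-based sample skewness
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew_raw <- skew(use_sim)       # right-skewed on the original scale
skew_log <- skew(log(use_sim))  # much closer to symmetric on the log scale

print(c(raw = skew_raw, log = skew_log))
```

The log of a product is a sum, so the central limit theorem pushes `log(use_sim)` towards a normal distribution, which is the same as saying `use_sim` is approximately lognormal.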