Hi all,

I use Stan with brms in R.

My data is attached here:

sample_03_21_230828.csv (109.9 KB)

In the dataset, the distribution of the dependent variable has many zeroes and some negative values.

Its distribution is as follows.

```
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's
-3.212e+09  0.000e+00  7.990e+06  1.694e+08  7.921e+07  1.698e+10        787
```

In the literature, most studies log-transform this kind of variable, but the logarithm is undefined for zeroes and negative values, so that is not directly possible here.

Some studies work around this by taking the logarithm of the absolute value and multiplying it by -1 for negative observations. Something along the lines of the following code:

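```
library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe

df %<>% mutate(
  log_abs_value = case_when(
    value > 0  ~ log(value / 1e4),        # positive: log of the scaled value
    value == 0 ~ log(1),                  # zero maps to 0
    value < 0  ~ -log(abs(value) / 1e4)   # negative: negated log of the scaled magnitude
  )
)
```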
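
To make the behaviour of this transformation concrete, here is a minimal base-R sketch applied to a few toy values. The helper name `signed_log`, its `scale` argument, and the toy vector are made up for illustration and are not part of my actual script; the logic is meant to mirror the `case_when()` call.

```r
# Hypothetical helper mirroring the case_when() transformation.
# `scale` is the 1e4 divisor used in the post.
signed_log <- function(value, scale = 1e4) {
  out <- rep(NA_real_, length(value))        # NA inputs stay NA
  pos <- !is.na(value) & value > 0
  zer <- !is.na(value) & value == 0
  neg <- !is.na(value) & value < 0
  out[pos] <- log(value[pos] / scale)        # positive branch
  out[zer] <- 0                              # log(1)
  out[neg] <- -log(abs(value[neg]) / scale)  # negative branch
  out
}

# Toy values (illustrative only): the observed minimum, two small
# magnitudes on either side of zero, zero, the observed median, and NA.
toy <- c(-3.212e9, -5e3, 0, 5e3, 7.99e6, NA)
result <- signed_log(toy)
print(data.frame(value = toy, log_abs_value = result))
```

One thing this makes visible: observations whose absolute value is below 1e4 change sign, because `log(x / 1e4)` is negative when `x < 1e4`. The transformed variable is therefore not monotonic in the original one near zero.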

This produces the following histogram of the dependent variable’s distribution.

I am fitting a hierarchical model with a multivariate regression in brms.

In both cases (raw and transformed), the distribution of the dependent variable does not seem to warrant a regression with a Gaussian family. What would you suggest? I would particularly appreciate it if you could recommend example code and articles, books, or blogs with more explanation.

Thank you in advance.