Great. So here’s an example: let’s say we measure three song lengths for each of three mice and we have the model `Resp ~ (1 || ID)`. This model has three “main” parameters: `Intercept` (which we will not care about here), the “global” or “residual” variability `sigma`, and the “between-mice” variability `sd_Intercept_ID`. Let’s say our data looks like this:

```
Example 1:
MouseA: 13 12 14
MouseB: 25 22 21
MouseC: 31 28 30
```

Without even running the model we can say that the between-mice variability `sd_Intercept_ID` is much larger than the residual variability `sigma`.

But the data could have looked different:

```
Example 2:
MouseA: 13 28 21
MouseB: 25 12 30
MouseC: 31 22 14
```

Here, the mice actually don’t differ much from each other, so `sd_Intercept_ID` will be smaller than `sigma`, which becomes large because the measurements change notably even within the same mouse. Makes sense so far?
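
A quick way to check this intuition without fitting anything is to compare the spread of the per-mouse means with the pooled spread inside each mouse. The sketch below (Python, standard library only) does that for both examples. The helper name `rough_sds` is made up for illustration, and these are crude descriptive summaries, not the estimates of `sd_Intercept_ID` and `sigma` that the model would give.

```python
from math import sqrt
from statistics import mean, stdev, variance

example1 = {"MouseA": [13, 12, 14], "MouseB": [25, 22, 21], "MouseC": [31, 28, 30]}
example2 = {"MouseA": [13, 28, 21], "MouseB": [25, 12, 30], "MouseC": [31, 22, 14]}


def rough_sds(data):
    """Return (sd of per-mouse means, pooled within-mouse sd).

    Hypothetical helper: a crude stand-in for the between-mice and
    residual variability, not what the model actually estimates.
    """
    mouse_means = [mean(values) for values in data.values()]
    between = stdev(mouse_means)
    # All mice have the same number of songs, so the pooled
    # within-mouse variance is the plain average of the variances.
    within = sqrt(mean([variance(values) for values in data.values()]))
    return between, within


between1, within1 = rough_sds(example1)
between2, within2 = rough_sds(example2)
print(f"Example 1: between = {between1:.2f}, within = {within1:.2f}")
print(f"Example 2: between = {between2:.2f}, within = {within2:.2f}")
```

For Example 1 the spread of the mouse means (about 8.4) dwarfs the within-mouse spread (about 1.6); for Example 2 it is the other way around (about 1.0 versus 8.5).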

The point is that in both examples, the first observations are exactly the same, so if we observed just the first column, we would be unable to distinguish between “large `sd_Intercept_ID`, small `sigma`” and “small `sd_Intercept_ID`, large `sigma`” (and everything in between).

Technical aside for completeness: if you observe only one value per mouse, the model only informs the total sd, which is `sqrt(sd_Intercept_ID^2 + sigma^2)`, and that’s why the `pairs` plot for the two looks roughly like a circular arc. Also, for some non-gaussian families you could, at least in principle, identify `sd_Intercept_ID` even when you have only one observation per individual, because the gaussian varying intercept can create a somewhat different variability pattern than the variability introduced by the family. But for the gaussian family (and a few others) the case is hopeless even in theory, because (with a bit of sloppy notation) `normal(normal(mu, sd1), sd2)` is *exactly* the same as `normal(mu, sqrt(sd1 ^ 2 + sd2 ^ 2))`.
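
If you want to convince yourself of that identity numerically, here is a minimal simulation sketch in plain Python. The function name, the population mean of 20 and the sd values 3 and 4 are all arbitrary choices of mine for illustration. It draws one observation per mouse under two very different splits of the variability that share the same total sd.

```python
import random
from math import sqrt
from statistics import stdev


def simulate_one_per_mouse(sd_between, sd_resid, mu=20.0, n_mice=100_000, seed=1):
    """Draw a single observation per mouse from the two-level gaussian model."""
    rng = random.Random(seed)
    observations = []
    for _ in range(n_mice):
        mouse_mean = rng.gauss(mu, sd_between)  # varying intercept
        observations.append(rng.gauss(mouse_mean, sd_resid))  # residual noise
    return observations


# Two different splits with the same total sd: sqrt(3^2 + 4^2) = 5.
total_sd = sqrt(3.0**2 + 4.0**2)
sd_large_between = stdev(simulate_one_per_mouse(sd_between=4.0, sd_resid=3.0))
sd_large_resid = stdev(simulate_one_per_mouse(sd_between=3.0, sd_resid=4.0))
print(sd_large_between, sd_large_resid, total_sd)
```

Both simulated datasets end up with an sd close to 5, so with one observation per mouse the data alone cannot tell the two splits apart.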

Now for your data, it might actually make sense to take a different approach for each of the responses. The song length is not fixed per individual: there is (I assume) substantial within-individual variability, and you have multiple measurements of the song length, so you can identify both the “within-individual” and “between-individual” variability. For body size, I would expect much smaller within-individual variability (although there probably is some due to measurement imprecision, and I’ve heard people’s height changes slightly over the course of the day, so mice’s probably does as well). While you can’t directly quantify the within-individual variability as you have only a single measurement, I guess you can easily put some quite strict bounds on it using your knowledge of the domain. In a single-response model this could be achieved by putting a narrow prior on `sigma`.

There would be some technical challenges for putting both in a single `brms` model. A slightly sub-optimal but probably easiest option would be to use the `se()` addition term. You would have `Resp1 | se(error) ~ ...`. This effectively fixes `sigma` at a specific value separately for each row in the dataset, so it will no longer be estimated. The `error` represents a column in the data containing the standard error of the mean (`sd(x) / sqrt(n)`), so for song length, you would put the average song length as the response and the observed standard error of the mean as `error`. For body size you would put the single measurement as the response and put the theoretically derived measurement error as `error`. (This paragraph is pure speculation on my side; I’ve never built such a model, but I *think* it *should* work.)
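
To make the data layout concrete, here is a rough sketch of how the per-mouse rows could be computed, written in plain Python only to show the arithmetic (in R it is the same `sd(x) / sqrt(n)`). The song lengths are the Example 1 values from above; the body sizes, the assumed measurement error of 0.5 and the column name `Resp2` are placeholders I made up, so substitute your own numbers and names.

```python
from math import sqrt
from statistics import mean, stdev

# Song lengths from Example 1 (several measurements per mouse).
songs = {"MouseA": [13, 12, 14], "MouseB": [25, 22, 21], "MouseC": [31, 28, 30]}

# Placeholder body sizes (one measurement per mouse, arbitrary units)
# and a placeholder measurement error; both are invented for illustration.
body_sizes = {"MouseA": 70.0, "MouseB": 74.0, "MouseC": 68.0}
ASSUMED_SIZE_ERROR = 0.5


def summarise_songs(songs):
    """One row per mouse: mean song length plus its standard error."""
    rows = []
    for mouse, lengths in songs.items():
        sem = stdev(lengths) / sqrt(len(lengths))  # sd(x) / sqrt(n)
        rows.append({"ID": mouse, "Resp1": mean(lengths), "error": sem})
    return rows


song_rows = summarise_songs(songs)
size_rows = [
    {"ID": mouse, "Resp2": size, "error": ASSUMED_SIZE_ERROR}
    for mouse, size in body_sizes.items()
]

for row in song_rows + size_rows:
    print(row)
```

Note that with only three songs per mouse the observed standard error is itself quite noisy, which is one reason this approach is only an approximation.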

If your responses are all positive, then the natural transformation would IMHO be taking the log (and potentially scaling afterwards, but that might not be necessary). The log is also likely to reduce the skew, so you might be able to get away with the `gaussian` family and use `rescor` (which I now believe to be quite beneficial).

As I said, this will change the interpretation of the coefficients, but I think that this is actually more natural. Say you get an estimate of `sd_Intercept_ID` of roughly 5 for a model on the original scale: this means that between-mice variability is something like +/- 2*5. If the mean population song length is, say, 30, this is unproblematic, but what if the mean song length is 8? This would imply that some mice have an average song length of -2 …

If you work on the log scale and you get an estimate of `sd_Intercept_ID` of roughly `0.55 ~= log(3)/2`, this means that the between-mice variability is roughly between “the song is shorter by a factor of 3” and “the song is thrice as long”, which makes sense regardless of the population mean…
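
The arithmetic behind the two interpretations can be spelled out in a few lines of Python. The numbers 8, 5 and 0.55 are the ones used above; treating “+/- 2 sd” as the plausible range is the same rough rule of thumb, and on the log scale I am assuming 8 plays the role of the typical (median) song length.

```python
from math import exp, log

mean_song_length = 8.0  # the "problematic" population mean from the text

# Original scale: an sd of 5 gives roughly mean +/- 2 * sd.
sd_original = 5.0
original_range = (
    mean_song_length - 2 * sd_original,
    mean_song_length + 2 * sd_original,
)

# Log scale: an sd of 0.55 gives roughly exp(log(mean) +/- 2 * sd),
# i.e. the typical value divided or multiplied by a constant factor.
sd_log = 0.55
factor = exp(2 * sd_log)
log_scale_range = (mean_song_length / factor, mean_song_length * factor)

print("log(3) / 2 =", log(3) / 2)
print("original scale range:", original_range)
print("multiplicative factor:", factor)
print("log scale range:", log_scale_range)
```

The original-scale range dips below zero, while the log-scale range stays positive (roughly 8/3 to 24) whatever the population mean is.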

Does that make sense?