A design question, alternative to a nested design


This is more of a design question than a brms-specific one. I have minor versions of a piece of software nested within major versions. My data has versions like 5.0b1, 5.0b2, 6.0b1, 6.0b2, 6.0b3 and so on, where the majors are 5 and 6 and the minors are b1, b2 and b3. Note that different majors may have different numbers of minors (as in the example). We have a few data points per major:minor combination.

My initial approach was to treat minor as nested in major, e.g. (1|major/minor), which is equivalent to (1|major) + (1|major:minor). If my understanding is correct, this implies that the major effects follow (let’s assume) normal(0, major_sigma) and the major:minor effects follow normal(0, major:minor_sigma), with a single sigma shared by the minors of all majors.

What I think actually happens in my data is that the minors within each major have their own sigma, that is, minors_within_major_1 ~ normal(0, major_1_sigma), minors_within_major_2 ~ normal(0, major_2_sigma), and so on.

Is such a thing possible using brms, or would I need to write the model in Stan directly?
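
To make the contrast concrete, here is a sketch of the two formulations in symbols. The notation is my own assumption and not brms parameter naming: j indexes majors, k indexes minors within a major, u is the major effect and v is the minor-within-major effect.

```latex
% Nested formulation, (1 | major/minor):
% one sigma shared by the minors of every major
u_j \sim \mathcal{N}(0, \sigma_{\text{major}}), \qquad
v_{jk} \sim \mathcal{N}(0, \sigma_{\text{major:minor}})

% Formulation I am asking about:
% a separate sigma for the minors of each major j
u_j \sim \mathcal{N}(0, \sigma_{\text{major}}), \qquad
v_{jk} \sim \mathcal{N}(0, \sigma_j)
```

The only difference is that the second scale parameter carries the major index j.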



Good question! I’d add the “brms” tag to this, as it really comes down to a question about how the brms formula syntax works.


I don’t think this is directly available in brms, but unless you have many versions, you can IMHO work around it by expanding the data: instead of a single version column you’ll have 0/1 numeric columns like is_version_5, is_version_6, etc. and columns holding the minor label like minor_version_5, minor_version_6, etc. Then your formula could be (1||is_version_5:minor_version_5) + (1||is_version_6:minor_version_6). Note that since is_version_5 is 0 for all version 6 rows, it doesn’t matter what you fill in as minor_version_5 for version 6 rows (but it can’t be NA).
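
To show what the expanded table could look like, here is a minimal sketch in plain Python (used only to illustrate the shape of the data; the same reshaping can be done in R before calling brms). The helper names `split_version` and `expand_versions` and the filler label `"none"` are hypothetical choices of mine, and the regular expression assumes labels shaped exactly like the ones in the question.

```python
import re


def split_version(version):
    """Split a label such as '6.0b3' into (major, minor) = ('6', 'b3')."""
    match = re.fullmatch(r"(\d+)\.\d+(b\d+)", version)
    if match is None:
        raise ValueError(f"unexpected version label: {version!r}")
    return match.group(1), match.group(2)


def expand_versions(versions, filler="none"):
    """Build one row per observation with is_version_* and minor_version_* columns.

    For rows that do not belong to a given major, the minor column gets an
    arbitrary non-missing filler value, because its indicator is 0 anyway.
    """
    majors = sorted({split_version(v)[0] for v in versions})
    rows = []
    for v in versions:
        major, minor = split_version(v)
        row = {"version": v}
        for m in majors:
            is_this_major = int(major == m)
            row[f"is_version_{m}"] = is_this_major
            row[f"minor_version_{m}"] = minor if is_this_major else filler
        rows.append(row)
    return rows


rows = expand_versions(["5.0b1", "5.0b2", "6.0b1", "6.0b2", "6.0b3"])
for row in rows:
    print(row)
```

Each row ends up with exactly one indicator equal to 1, and every minor column is filled, so nothing is missing when the table is handed to the model.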

Also note that by default, using a single |, brms will estimate a correlation matrix between the varying effects. Using || avoids this and makes the effects independently normally distributed. Theoretically, it should be possible to get a similar effect by using a suitable prior for the correlation matrix (assuming that minor versions of the same major version are more strongly correlated than minor versions of different major versions), but as of now, only the LKJ prior is available out of the box and that prior is completely symmetric, so you would have to write your own prior distribution to achieve that.

As I am not a brms expert, I’ll ask @paul.buerkner to confirm that there is not a better way and that my proposal should actually work.

Hope that helps.

I was under the impression that the grouping variable should have some minimum number of levels to infer the group variance. In the above formulation, the dummy variables would have 2 levels.

Not really - I’ve worked with models having just two levels and it worked fine - in terms of predictions, there was (in my experience) actually quite little difference between having things like x ~ binaryVariable and x ~ (1||binaryVariable) (but your posterior for group variance will be very wide).

Oh, I realized I wrote it wrong. What you want is more likely to be (0 + is_version_6||minor_version_6) etc. (the 0 + drops the varying intercept that the formula would otherwise add by default), so you get a single coefficient for each level of minor_version_6. That coefficient gets multiplied by 1 for rows that actually are version 6 and by 0 for all others, so it is effectively ignored for those rows and they do not contribute to its estimate.
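
A tiny numeric sketch of that multiplication, in plain Python. The coefficient values are made up for illustration and are not estimates from any model; the `"none"` level is a hypothetical filler for rows that are not version 6, given a deliberately large value to show that it never reaches the predictor.

```python
# Hypothetical per-level coefficients for the minors of version 6.
# "none" is the filler level used on rows that are not version 6.
coef_minor_6 = {"b1": 0.4, "b3": 0.3, "none": 9.9}

rows = [
    {"is_version_6": 0, "minor_version_6": "none"},  # a version 5 row
    {"is_version_6": 0, "minor_version_6": "none"},  # another version 5 row
    {"is_version_6": 1, "minor_version_6": "b1"},    # version 6, minor b1
    {"is_version_6": 1, "minor_version_6": "b3"},    # version 6, minor b3
]


def version_6_term(row, coefs):
    """Contribution of the version 6 minor effect to one row's linear predictor."""
    return row["is_version_6"] * coefs[row["minor_version_6"]]


terms = [version_6_term(row, coef_minor_6) for row in rows]
print(terms)
```

The version 5 rows get a contribution of zero whatever value the filler level has, which is why the choice of filler does not matter as long as it is not missing.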

Does that make sense?