I would like to model two response variables (y1, y2) using brms multivariate syntax so that I can (1) model each response with a different distribution family, and (2) model the random effects for g as correlated. The number of observations differs between y1 and y2.
According to the brms multivariate vignette, it seems that we need the same number of observations for each component in the formula (in the vignette tarsus and back are two columns in the dataset). Can this assumption be relaxed? A previous user raised the same question here, but the suggested solution does not make specifying different families possible.
Sorry, short on time, so just a quick note: if I understand the problem correctly, the subset() term should let you use only some rows for each outcome (see the help for brmsformula for details).
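For what it's worth, a minimal sketch of the subset() idea might look like the following. The data are simulated, and has_y1 / has_y2 are hypothetical indicator columns that are not part of the original question. bf(), subset() and set_rescor() are brms functions, but whether this gives exactly the model asked for is an assumption to check. Residual correlations are only available when all responses are gaussian or student, so they are switched off here.

```r
library(brms)

set.seed(1)
n <- 200

# Simulated data; every column name here is illustrative only.
dat <- data.frame(
  g  = factor(sample(1:20, n, replace = TRUE)),
  x  = rnorm(n),
  y1 = rnorm(n),      # continuous response
  y2 = rpois(n, 3)    # count response
)

# Hypothetical logical indicators: which rows feed which sub-model.
# y1 uses 150 rows and y2 uses 150 rows, overlapping only partly.
dat$has_y1 <- seq_len(n) <= 150
dat$has_y2 <- seq_len(n) > 50

# Each response gets its own family and its own subset of rows.
# The shared ID "p" in (1 | p | g) asks brms to correlate the
# group-level intercepts of g across the two responses.
form <-
  bf(y1 | subset(has_y1) ~ x + (1 | p | g), family = gaussian()) +
  bf(y2 | subset(has_y2) ~ x + (1 | p | g), family = poisson()) +
  set_rescor(FALSE)

# Not run here, because it compiles and samples a Stan model:
# fit <- brm(form, data = dat)
```

This sketch keeps placeholder values in the rows that a sub-model does not use; whether NA is tolerated in those rows is something to verify against your brms version.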
I was hoping that I could have a different number of observations for two different outcomes and still fit the models in brms using the multivariate normal. But, rescor still requires the outcomes to have the same number of observations :( I need rescor = TRUE so I can model the covariances between predictors and residuals.
For now, I'll keep trying to do this in Stan... but it's still beyond my skill level... I might be ready to give it another few-month break...
Haha, sorry to be the bearer of bad news! Well, if you figure out a solution for RStan, be sure to let me know! :) (or find a vignette... but I've searched high and low for one...)
Ya, I've thought of that, but the data aren't truly "missing". I do behavioral experiments where the imbalance arises as part of the experiment, e.g., I'm typically only interested in correct-trial data (response times), and the number of errors is never constant across participants...
But does that matter if it is a dependent variable? When you extend the model as in the example there, my understanding is that the "missing values" are treated as parameters, so for missing outcome variables it is similar to doing a prediction for new data. If you want, you could fit a model with these parameters included so that your uneven outcomes fit, and then ignore these parameters when interpreting your results... I don't think it can affect your coefficients other than by allowing you to include all the data.
I have unequal observations because I look at response time data based on response accuracy (and only trials where a participant actually responds).
For example, one person could have 300 correct trials and 20 error trials and another could have 290 correct trials and 25 error trials (with 5 discarded non-response trials).
My hope is to get output identical to what I get from
bf(congruent ~ 0 + (1+trial|p|id)) +
bf(incongruent ~ 0 + (1+trial|q|id)) + set_rescor(TRUE)
In this case it would be something like
bf(correct ~ 0 + (1+trial|p|id)) +
bf(error ~ 0 + (1+trial|q|id)) + set_rescor(TRUE)
The output of this brms fit has all of the (co)variance components that I need, and none that I don't :) I don't know if that helps to clarify what I'm after... I can only get it to work if I have the same number of correct and error trials across participants.
@jroon That is an interesting idea... maybe the issue is that I don't understand what's happening when mi() is used, as in... bf(congruent | mi() ~ 0 + (1+trial|p|id))
I couldn't figure out how to get the covariance between residual variances for the two model types... :)
I want the sd(correct_Intercept), sd(error_Intercept), sd(correct_trial), sd(error_trial) and the covariances among them. I also need the sigma_correct, sigma_error, and rescor(correct, error).
If I understand you correctly, you mean make one long-format outcome variable plus an indicator variable that says which outcome in the wide format each row corresponds to. Then, when you fit that model, all outcomes are considered to come from the same univariate distribution (e.g. normal) with the indicator having a moderating effect, whereas in the multivariate version the outcomes are considered to come from a multivariate normal?
I've seen this done in e.g. JAGS models in the past, but I've never understood whether these approaches are truly equivalent.
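To make the long-format idea concrete, here is a small base R sketch of the reshaping step only. The toy data frame and the column names (rt, outcome) are made up for illustration; it says nothing about whether the resulting univariate model is equivalent to the multivariate one.

```r
# Toy wide-format data: each trial has a response time for either a
# correct or an error response, never both. All values are invented.
wide <- data.frame(
  id      = c(1, 1, 2, 2, 2),
  trial   = c(1, 2, 1, 2, 3),
  correct = c(0.51, NA, 0.47, 0.62, NA),
  error   = c(NA, 0.71, NA, NA, 0.66)
)

# Stack the two outcome columns into one response column and record
# which outcome each row came from.
long <- rbind(
  data.frame(id = wide$id, trial = wide$trial,
             outcome = "correct", rt = wide$correct),
  data.frame(id = wide$id, trial = wide$trial,
             outcome = "error", rt = wide$error)
)

# Keep only rows that were actually observed, so the two outcomes
# can contribute different numbers of rows.
long <- long[!is.na(long$rt), ]
long$outcome <- factor(long$outcome, levels = c("correct", "error"))

print(table(long$outcome))
```

With data in this shape, one possible (untested here) univariate formulation would let the group-level terms and sigma vary by outcome, e.g. bf(rt ~ 0 + outcome + (0 + outcome + outcome:trial | id), sigma ~ 0 + outcome). That would give outcome-specific SDs, their correlations, and two residual SDs, but no residual correlation between outcomes.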
You can only define a correlation between two variables when there is a one-to-one correspondence between their values. This is not a brms limitation but how correlations work. So I am unsure how you would define this residual correlation in the first place.
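A quick base R illustration of that point, using simulated numbers and the trial counts from the example above (300 correct, 20 error); the variable names are hypothetical:

```r
set.seed(1)

# Simulated response times for one participant.
correct_rt <- rnorm(300)
error_rt   <- rnorm(20)

# With unequal lengths there are no pairs to correlate, so cor()
# signals an error; we capture its message instead of stopping.
res <- tryCatch(
  cor(correct_rt, error_rt),
  error = function(e) conditionMessage(e)
)
print(res)

# A correlation only exists once values are paired. This pairing
# (first 20 correct trials with the 20 error trials) is arbitrary,
# which is exactly the problem: nothing in the design says which
# correct trial belongs with which error trial.
paired <- cor(correct_rt[1:20], error_rt)
print(paired)
```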
Hm, all participants have observations for each event type, so it's not as if any participant lacks a person-specific estimate for correct or error trials. I'm interested in the group-level effects, so I figured as long as everyone had at least a few good observations of each trial type it should be doable...