Hey folks, over the holidays I managed to make substantial progress on a side project I’ve had on the go for years, but now I’m back to work and don’t have the bandwidth for the last bit of work needed to publish it. So I’m hoping to catch the interest of someone to jump on as first author, run some last checks/demos, and do the writing. I’ll post a description of the project below; please DM me if you’re interested.
The project seeks to improve analysis of human choice response-time experiments, specifically scenarios where one isn’t collecting enough data for a full process model (diffusion, LBA, etc.) but still wants at least a “descriptive” model. With response time and response accuracy as observables in each of many human participants, it’s trivial to model the influence of predictor variables on accuracy using a hierarchical model of log-odds, and similarly straightforward to use a hierarchical Gaussian model of RT (usually log-transformed RT). In the latter, one can model only the influence of predictor variables on the location of the Gaussian, treating any variation in its scale as a nuisance (i.e. fully pooled or unpooled), but there’s good theoretical motivation for taking the effort to explicitly model the influence of predictor variables on scale in the same way as for location.
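To make that concrete, here’s a schematic of the kind of joint model I mean, simplified to a single predictor and with the participant effects left uncorrelated for the moment (placeholder names, not the project’s actual interface):

```stan
data {
  int<lower=1> N;                              // trials
  int<lower=1> n_subj;                         // participants
  array[N] int<lower=1, upper=n_subj> subj;    // participant index per trial
  vector[N] x;                                 // a single (e.g. sum-coded) predictor
  vector<lower=0>[N] rt;                       // response time
  array[N] int<lower=0, upper=1> acc;          // response accuracy
}
parameters {
  // population-level coefficients: [loc_int, loc_x, scale_int, scale_x, acc_int, acc_x]
  vector[6] mu;
  vector<lower=0>[6] tau;     // between-participant SDs
  matrix[6, n_subj] z;        // non-centered participant deviations
}
transformed parameters {
  matrix[6, n_subj] beta;     // per-participant coefficients
  for (k in 1:6) {
    beta[k] = mu[k] + tau[k] * z[k];
  }
}
model {
  mu ~ normal(0, 1);
  tau ~ normal(0, 1);          // half-normal via the lower=0 constraint
  to_vector(z) ~ std_normal();
  {
    vector[N] loc;
    vector[N] scale;
    vector[N] theta;
    for (i in 1:N) {
      loc[i]   = beta[1, subj[i]] + beta[2, subj[i]] * x[i];
      scale[i] = exp(beta[3, subj[i]] + beta[4, subj[i]] * x[i]);  // scale gets predictors too, on a log link
      theta[i] = beta[5, subj[i]] + beta[6, subj[i]] * x[i];       // log-odds for the accuracy outcome
    }
    rt ~ lognormal(loc, scale);   // i.e. Gaussian on log(RT)
    acc ~ bernoulli_logit(theta);
  }
}
```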
It’s pretty easy to do all of the above with an extension of the SUG 1.13 model to include simultaneous inference for the effects of predictors on log-odds, RT-location & RT-scale, but with anything but the most trivial experimental design this quickly leads to pathology. Specifically, using the multivariate normal to express/achieve mutual information between parameters works acceptably when the latents truly are multivariate-normal-sampled, but it’s hard to justify a monolithic MVN across the three classes of latents in this scenario. It can be demonstrated that when the correlations are misspecified in this way, they are substantially biased toward zero, which makes the multivariate normal self-defeating as a way to let parameters mutually inform: it gives you wrong answers for inference on the correlations and leaves information on the table when it comes to reducing uncertainty in the other parameters.
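For contrast, the monolithic version looks roughly like this (same toy setup and placeholder names as above; only the participant-effect structure changes, so I’ve omitted the data block and likelihood):

```stan
parameters {
  vector[6] mu;
  vector<lower=0>[6] tau;
  cholesky_factor_corr[6] L;   // one big correlation matrix spanning all three submodels
  matrix[6, n_subj] z;
}
transformed parameters {
  // per-participant coefficients with a full 6x6 correlation structure
  matrix[6, n_subj] beta = rep_matrix(mu, n_subj) + diag_pre_multiply(tau, L) * z;
}
model {
  mu ~ normal(0, 1);
  tau ~ normal(0, 1);
  L ~ lkj_corr_cholesky(2);
  to_vector(z) ~ std_normal();
  // ... likelihood exactly as in the previous sketch
}
```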
My solution is to use the multivariate normal only for those correlations that don’t have a strong theoretical prior expectation of being non-zero, and to encode the ones that do separately via a simple structural equation model. Specifically, I model the individual-level effects on RT-location with a multivariate normal, as usual, then model the RT-scale and log-odds-of-error effects as an SEM related to the RT-location effects (so, intercept_rt_scale ~ sem(intercept_rt_loc, r1); effectA_rt_scale ~ sem(effectA_rt_loc, r2); ...). A nice corollary of this framing is that it then becomes trivial to repeat the SEM structure to achieve explicit inference on reliability in the context of test-retest style data ([intercept_rt_loc_test1, intercept_rt_loc_test2] ~ sem(intercept_rt_loc, rel1)).
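In case the sem() shorthand is opaque, here’s roughly how the location→scale paths could be written out (a sketch under my placeholder names and simplified to one predictor, not the actual project code; reading each path coefficient r as the induced correlation):

```stan
parameters {
  // RT-location effects: the usual multivariate normal (intercept + effectA)
  vector[2] mu_loc;
  vector<lower=0>[2] tau_loc;
  cholesky_factor_corr[2] L_loc;
  matrix[2, n_subj] z_loc;
  // RT-scale effects: tied to their location counterparts via path coefficients r
  vector[2] mu_scale;
  vector<lower=0>[2] tau_scale;
  vector<lower=-1, upper=1>[2] r;   // r[1] ~ "r1", r[2] ~ "r2" in the shorthand above
  matrix[2, n_subj] z_scale;        // independent residuals
}
transformed parameters {
  // standardized, correlated participant-level location effects
  matrix[2, n_subj] loc_std = L_loc * z_loc;
  matrix[2, n_subj] beta_loc = rep_matrix(mu_loc, n_subj)
                               + diag_pre_multiply(tau_loc, loc_std);
  // intercept_rt_scale ~ sem(intercept_rt_loc, r1), etc.:
  // each scale effect = path from its location counterpart + independent residual,
  // so corr(beta_loc[k], beta_scale[k]) = r[k]
  matrix[2, n_subj] beta_scale;
  for (k in 1:2) {
    beta_scale[k] = mu_scale[k]
                    + tau_scale[k] * (r[k] * loc_std[k]
                                      + sqrt(1 - square(r[k])) * z_scale[k]);
  }
}
model {
  mu_loc ~ normal(0, 1);
  tau_loc ~ normal(0, 1);
  L_loc ~ lkj_corr_cholesky(2);
  to_vector(z_loc) ~ std_normal();
  mu_scale ~ normal(0, 1);
  tau_scale ~ normal(0, 1);
  r ~ uniform(-1, 1);          // or something more informative, given the prior expectation
  to_vector(z_scale) ~ std_normal();
  // ... log-odds effects handled the same way, then the likelihood as before
}
```

The test-retest corollary is just this same path structure repeated, with the two sessions’ effects treated as indicators of a common latent effect and rel1 as the path coefficient.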
I have decently-validated/performant code for this (both Stan and use-from-R demos), and I’m getting expected results when applied to real data. Other than the writing, what’s left is to double-check the implementation with a parameter-recovery test on hand-generated data, put together some examples comparing the performance of this new approach against the standard monolithic multivariate normal, and maybe run it through SBC to double-check that this model structure doesn’t have inherent topological pathologies that could be expected to trip up HMC.
Anyway, DM me if you’re interested!