Removing outliers when fitting shifted_normal distributions to reaction time data

Alasdair_Clarke · November 24, 2020, 11:37am

Hello,

I thought I’d branch this off from the other topics asking specific questions about fitting shifted_normal distributions (i.e. here).

A related topic is: what is the current best practice for dealing with outliers? If I understand these models correctly, any reaction time shorter than the fitted “shift” parameter is impossible. In my current dataset, there are a few very short RTs (< 50 ms) that are very likely random button presses. For the time being, I’m excluding all RTs falling below the 1% quantile or above the 99% quantile.
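A minimal sketch of this kind of quantile trimming, in plain Python for illustration. The function name `trim_by_quantiles` and the example RTs are hypothetical, not taken from the dataset discussed here.

```python
import statistics

def trim_by_quantiles(rts, lower_pct=1, upper_pct=99):
    """Keep only RTs between the lower and upper percentile cut points."""
    # With n=100, statistics.quantiles returns the 99 percentile cut points,
    # so index p - 1 holds the p-th percentile.
    cuts = statistics.quantiles(rts, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [rt for rt in rts if lo <= rt <= hi]

# Hypothetical RTs in seconds: two anticipations, 200 plausible RTs, one lapse.
rts = [0.03, 0.04] + [0.30 + 0.002 * i for i in range(200)] + [4.5]
kept = trim_by_quantiles(rts)
print(len(rts), len(kept), min(kept), max(kept))
```

Note that a rule like this always discards roughly 2% of the trials, whether or not they are errors, so some plausible RTs at the edges are removed along with the implausible ones.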

Does anybody have any better advice? I was wondering if it would be a good idea to fit a mixture of distributions: for example, on any given trial there is a small chance that the RT will be drawn from a uniform(0, xmax) distribution for some value xmax.
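Written out, such a mixture could have a density along the following lines. The symbols are chosen here for illustration: $\theta$ is the contaminant probability, $\delta$ the shift, and $\mu$, $\sigma$ the lognormal parameters on the log scale.

$$
p(t \mid \theta, \delta, \mu, \sigma) = \theta \, \frac{1}{x_{\max}} \, \mathbb{1}[0 \le t \le x_{\max}] + (1 - \theta) \, \mathrm{LogNormal}(t - \delta \mid \mu, \sigma) \, \mathbb{1}[t > \delta]
$$

Under this density an RT below $\delta$ is no longer impossible: it simply has to come from the uniform component.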

Has anybody had much success with this approach?

Thanks

martinmodrak · December 5, 2020, 9:37pm

Sorry for not getting to you earlier.

One way to think about outliers is that they are evidence that your model does not capture the full process that generated your data. So you are facing a choice here: either you believe those short / long RTs are 100% errors and discard them, or you need to admit that the shifted lognormal on its own is not a good fit for your data.

The mixture you describe is exactly what I would suggest. I don’t think the heuristic you use is necessarily bad (it is certainly easy to implement), but I find the mixture approach theoretically more appealing. The bonus is that the mixing probability will give you an estimate of the rate of errors. Obviously you may believe that a different distribution is more appropriate for the errors (e.g. an exponential). In any case, what were “outliers” before are now well within the model.
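To make that last point concrete, here is a rough sketch in plain Python of how a uniform + shifted lognormal mixture assigns each RT a probability of being an error. All parameter values are hypothetical and held fixed for illustration; in a real fit they would be estimated jointly with the rest of the model, and the function names are made up for this example.

```python
import math

def shifted_lognormal_logpdf(t, shift, mu, sigma):
    """Log density of a lognormal(mu, sigma) shifted right by `shift`."""
    x = t - shift
    if x <= 0:
        return -math.inf  # RTs at or below the shift are impossible here
    z = (math.log(x) - mu) / sigma
    return -math.log(x * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z

def uniform_logpdf(t, xmax):
    """Log density of uniform(0, xmax)."""
    return -math.log(xmax) if 0 <= t <= xmax else -math.inf

def contaminant_probability(t, theta, shift, mu, sigma, xmax):
    """Posterior probability that RT `t` came from the uniform component."""
    log_err = math.log(theta) + uniform_logpdf(t, xmax)
    log_ok = math.log1p(-theta) + shifted_lognormal_logpdf(t, shift, mu, sigma)
    top = max(log_err, log_ok)
    if top == -math.inf:
        raise ValueError("t has zero density under both components")
    # log-sum-exp of the two weighted component densities
    log_total = top + math.log(math.exp(log_err - top) + math.exp(log_ok - top))
    return math.exp(log_err - log_total)

# Hypothetical values: 3% errors, 0.2 s shift, RTs measured in seconds.
params = dict(theta=0.03, shift=0.2, mu=-1.0, sigma=0.5, xmax=5.0)
for t in (0.04, 0.55, 4.5):
    print(t, contaminant_probability(t, **params))
```

With these made-up values, an RT below the shift can only come from the uniform component, a typical RT is attributed almost entirely to the shifted lognormal, and a very long RT is again attributed mostly to the uniform component. No trial has to be discarded by hand.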

And you are in luck: brms supports mixture models. The bad news is that mixtures can sometimes be problematic to fit.

Best of luck with your model!

Alasdair_Clarke · December 10, 2020, 11:33am

This pretty much sums up my feelings and fears! I’d like to do the mixture model, but I worry that it would be a time sink and would probably have little impact on my overall results. I enjoy modelling and analysis, but it is very easy to go down rabbit holes for days rather than just getting on with writing up the paper!

:)

mingqian.guo · December 13, 2020, 7:33pm

In the drift diffusion modelling literature, Ratcliff and Tuerlinckx (2002) use the same method (they assume each RT is drawn from a mixture of a uniform distribution and a Wiener process), and they found that 1%-5% of responses are outliers (they call them ‘contaminants’).

Ratcliff, R., & Tuerlinckx, F. (2002). Estimating the parameters of the diffusion model: Approaches to dealing with contaminant reaction times and parameter variability. Psychonomic Bulletin & Review, 9, 438–481.
