I have some data that appears to be distributed as a mixture of many exponentials. I’d like to build a simple 3-component model for the purpose of simulating realistic data.
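For concreteness, this is roughly what simulating from such a model looks like. It is only a sketch: the function name, weights and rates are made-up placeholders, not fitted values.

```python
import random

def sample_exponential_mixture(n, weights, rates, seed=None):
    """Draw n values from a mixture of exponentials.

    weights: mixing proportions (assumed to sum to 1)
    rates:   rate parameter of each exponential component
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Pick a component according to the mixing proportions...
        k = rng.choices(range(len(rates)), weights=weights)[0]
        # ...then draw from that component's exponential distribution.
        draws.append(rng.expovariate(rates[k]))
    return draws

# Hypothetical 3-component parameters, for illustration only.
simulated = sample_exponential_mixture(
    10_000, weights=[0.6, 0.3, 0.1], rates=[5.0, 1.0, 0.1], seed=42
)
```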
Stan fits this model easily, but I’m not satisfied with the right tail: the fitted model assigns too low a probability to extreme values. Specifically, its 99th percentile is too low relative to the actual data.
Is there any good way to place emphasis on the accuracy of modeling these extreme values within the framework of Stan? Does it make sense to just naively weight the likelihood based on the data values?
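To make the "naive weighting" idea concrete, here is a sketch of what I mean, written in plain Python rather than Stan. The weighting scheme (`tail_weights`, proportional to a power of the data value) is just one hypothetical choice, and all names are mine.

```python
import math

def mixture_log_density(x, weights, rates):
    """log( sum_k w_k * r_k * exp(-r_k * x) ), computed via log-sum-exp."""
    terms = [math.log(w) + math.log(r) - r * x for w, r in zip(weights, rates)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def tail_weights(data, power=1.0):
    """Hypothetical observation weights proportional to x**power, scaled to mean 1."""
    raw = [x ** power for x in data]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

def weighted_log_likelihood(data, obs_weights, weights, rates):
    """Sum of per-observation log-densities, each scaled by its weight."""
    return sum(
        c * mixture_log_density(x, weights, rates)
        for x, c in zip(data, obs_weights)
    )
```

In Stan I assume this would amount to multiplying each observation's log-density by its weight before adding it to `target`. My worry is that the result is then no longer the posterior of the generative model, so I'm not sure how to interpret the uncertainty.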
An alternative I’ve tried is defining a loss function over hand-chosen percentiles and fitting the model using black-box optimization (I guess this is called quantile matching?). I get a better tail fit, but convergence is poor and requires a lot of tuning, so I’d rather stick with Stan MCMC.
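For reference, the quantile-matching loss is along these lines. This is a simplified sketch, not my exact code: the percentiles, the squared error on the log scale, and the function names are illustrative assumptions, and it assumes strictly positive data.

```python
import math

def mixture_quantile(p, weights, rates, tol=1e-10):
    """Solve P(X <= q) = p by bisection (the mixture CDF has no closed-form inverse)."""
    def survival(x):
        return sum(w * math.exp(-r * x) for w, r in zip(weights, rates))

    target = 1.0 - p
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the survival probability drops to the target.
    while survival(hi) > target:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if survival(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def empirical_quantile(data, p):
    """Sample quantile with linear interpolation between order statistics."""
    xs = sorted(data)
    pos = p * (len(xs) - 1)
    i = int(math.floor(pos))
    j = min(i + 1, len(xs) - 1)
    return xs[i] + (pos - i) * (xs[j] - xs[i])

def quantile_matching_loss(data, weights, rates, probs=(0.5, 0.9, 0.99, 0.999)):
    """Sum of squared log-ratios between model and empirical quantiles."""
    loss = 0.0
    for p in probs:
        q_model = mixture_quantile(p, weights, rates)
        q_data = empirical_quantile(data, p)
        loss += (math.log(q_model) - math.log(q_data)) ** 2
    return loss
```

The loss is then handed to a black-box optimizer over the weights and rates, which is where the tuning trouble comes in.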