Measurement error model with a large sample?

Ilan_Strauss · April 10, 2019, 4:44pm

Hi,

I have a large sample of data, around 360,000 data points. I am trying to fit a simple measurement error model to start, though ideally I would like to fit this measurement error model within a hierarchical model structure.

Because the Bayesian measurement error model treats each data point as an unknown parameter, it becomes untenable to estimate at our sample size (i.e. it takes days to compute).

What would you recommend?

Thank you,

Ilan

martinmodrak · April 24, 2019, 6:32pm

This is very large indeed. In my experience, however, it is OK to split such large datasets randomly into smaller chunks that are manageable for Stan. You just need to check that the estimates of the shared parameters are similar across the chunks (which they will be, unless there is some problem with your model).
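A minimal sketch of the chunk-and-compare check, in Python with NumPy only. Everything here is an assumption for illustration and not from the thread: the simulated data, the known measurement error SD `tau`, and the helper names `corrected_slope` and `chunk_estimates`. The per-chunk "fit" is a simple method-of-moments slope corrected for attenuation, standing in for a full Stan fit of each chunk.

```python
import numpy as np

def corrected_slope(x_obs, y, tau):
    """Stand-in for a per-chunk model fit (hypothetical helper).

    Method-of-moments slope for y ~ x_true when only x_obs = x_true + noise
    is observed and the noise SD `tau` is assumed known.
    """
    cov_xy = np.cov(x_obs, y)[0, 1]
    var_x_true = np.var(x_obs, ddof=1) - tau**2  # remove the noise variance
    return cov_xy / var_x_true

def chunk_estimates(x_obs, y, tau, n_chunks, seed=0):
    """Randomly split the rows into n_chunks and estimate the slope in each."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))  # random split, not a split by row order
    return np.array([
        corrected_slope(x_obs[idx], y[idx], tau)
        for idx in np.array_split(perm, n_chunks)
    ])

# Simulated data of roughly the size mentioned in the question.
rng = np.random.default_rng(1)
n, tau, true_slope = 360_000, 0.5, 2.0
x_true = rng.normal(size=n)
x_obs = x_true + rng.normal(scale=tau, size=n)        # noisy covariate
y = 1.0 + true_slope * x_true + rng.normal(size=n)    # outcome

estimates = chunk_estimates(x_obs, y, tau, n_chunks=10)
spread = estimates.max() - estimates.min()
print("per-chunk slope estimates:", np.round(estimates, 3))
print("spread across chunks:", round(float(spread), 3))
```

With a real Stan model, `corrected_slope` would be replaced by a fit of one chunk, and the comparison would be between the posterior intervals of the shared parameters. A spread that is large relative to the per-chunk uncertainty would be the warning sign.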

With the latest Stan additions, you might just be able to push your model into a feasible region with all the data points, using map_rect (with MPI and/or threading) or by offloading some of the computation to the GPU (available since 2.19, not yet in rstan). However, I fear that this might be non-trivial to get working, as the documentation is not quite there yet.

Alternatively, you could check whether you can express your model in INLA. It seems this should be possible (a good starting point might be http://www.r-inla.org/examples/case-studies/muff-etal-2014), and you can easily get an order-of-magnitude speedup. Although INLA does not have support as extensive as the Stan ecosystem, it is still mature software and I can recommend it. The solution is, however, only approximate, so if you can afford it, it is useful to test your model with Simulation-Based Calibration (I have had one occasion where INLA was moderately miscalibrated for a hyperparameter).
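For readers unfamiliar with Simulation-Based Calibration, here is a rough Python sketch of the idea. It assumes a toy conjugate normal model (prior `mu ~ N(0, 1)`, known `sigma`) whose exact posterior stands in for whatever INLA or Stan would return; the function name `sbc_ranks` and all settings are made up for illustration. If the inference is calibrated, the rank of the true parameter among the posterior draws should be roughly uniform.

```python
import numpy as np

def sbc_ranks(n_sims=1000, n_obs=20, n_draws=99, sigma=1.0, seed=0):
    """Rank statistics for a toy model: mu ~ N(0, 1), y_i ~ N(mu, sigma)."""
    rng = np.random.default_rng(seed)
    ranks = np.empty(n_sims, dtype=int)
    for s in range(n_sims):
        mu = rng.normal(0.0, 1.0)                  # 1. draw parameter from the prior
        y = rng.normal(mu, sigma, size=n_obs)      # 2. simulate data given that draw
        # 3. "Fit" the model. Here this is the exact conjugate posterior;
        #    in practice this step would be the INLA or Stan fit under test.
        post_prec = 1.0 + n_obs / sigma**2
        post_mean = (y.sum() / sigma**2) / post_prec
        draws = rng.normal(post_mean, post_prec**-0.5, size=n_draws)
        # 4. Rank of the true value among the posterior draws (0..n_draws).
        ranks[s] = np.sum(draws < mu)
    return ranks

ranks = sbc_ranks()
# With 99 draws the ranks take values 0..99, so ten equal-width bins
# should each hold about n_sims / 10 ranks if the posterior is calibrated.
counts = np.bincount(ranks // 10, minlength=10)
print("rank histogram (10 bins):", counts)
```

A histogram that is clearly U-shaped, hump-shaped, or skewed would suggest that the approximate posterior is too narrow, too wide, or biased for that parameter.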

Ilan_Strauss · May 1, 2019, 8:35pm

Thanks for this. Very helpful. Is there any way to use INLA through the RStan / brms interface, I wonder?

martinmodrak · May 2, 2019, 7:31am

Not really. INLA is a completely separate piece of software, with its own idiosyncrasies and learning curve, though the R-INLA interface is based on formulas and does not require you to learn a new language.

There is talk about using some of the ideas from INLA to get efficient approximate density computations in Stan, but AFAIK this has always been scheduled for “someday” and no real progress has happened in the past year or so.
