Suppose we have so much data that running a Stan model in one chain is not feasible. What about the approach of (randomly) splitting the data into, say n subsets of equal size that are amenable to simulation. For each subset we can then obtain a sub-posterior for the parameters of interest (using a number of independent chains). How would one then go on and combine these n sub-posteriors? Is this a valid approach, has someone used this with Stan before? Any experiences and thoughts?
A naive method would be to simply merge the parameter samples of the n sub-posteriors (assuming equal number of samples). However I think this seems to simple to be true (and some reweighting might be required?)