I have a similar problem and I think option 3 might make the most sense.
The reason is that changing the standard deviation/variance will actually implicitly weight each observation. To give an intuition, suppose you have one observation y and you put n different sampling statements on y.
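So you would have y ~ N(\mu_i, \sigma_i^2) for i = 1, 2, ..., n, where \sigma_i^2 is the variance of the i^{th} sampling statement (keep in mind that Stan's normal() takes the standard deviation \sigma_i, not the variance). Let g_i = \sigma_1^2 * \sigma_2^2 * ... * \sigma_{i-1}^2 * \sigma_{i+1}^2 * ... * \sigma_n^2, i.e. the product of all the variances except the i^{th}. Then the relative weighting of the i^{th} sampling statement is \frac{g_i}{g_1 + g_2 + ... + g_n}, which is the same thing as the precision weighting \frac{1/\sigma_i^2}{1/\sigma_1^2 + 1/\sigma_2^2 + ... + 1/\sigma_n^2}. Thus, increasing the variance of the i^{th} sampling statement increases the denominator (through every g_j with j \neq i) but leaves the numerator g_i unchanged, and hence reduces that statement's 'impact'. You can easily create toy examples in Stan that mimic this setup, or you can do the math yourself, and you will get this.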
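Here is a rough sketch of the weighting formula above in plain Python (not Stan), just to check the arithmetic. The function names are made up for illustration, and I am assuming the \mu_i are fixed numbers and that "impact" means how much each statement pulls the peak of the combined density, viewed as a function of y.

```python
import math


def statement_weights(variances):
    """Relative weight of each sampling statement: g_i / (g_1 + ... + g_n),
    where g_i is the product of every variance except the i-th."""
    g = [
        math.prod(v for j, v in enumerate(variances) if j != i)
        for i in range(len(variances))
    ]
    total = sum(g)
    return [g_i / total for g_i in g]


def precision_weights(variances):
    """The same weights written as normalized precisions (1 / variance)."""
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    return [p / total for p in precisions]


def combined_mean(means, variances):
    """Location where the product of the normal densities peaks in y
    (assumption: the means are fixed, known numbers)."""
    weights = statement_weights(variances)
    return sum(w * m for w, m in zip(weights, means))


if __name__ == "__main__":
    means = [0.0, 10.0]
    # Equal variances: both statements pull equally.
    print(statement_weights([1.0, 1.0]), combined_mean(means, [1.0, 1.0]))
    # Larger variance on the second statement: its weight drops.
    print(statement_weights([1.0, 4.0]), combined_mean(means, [1.0, 4.0]))
```

With two statements, raising the second variance from 1 to 4 shifts the combined peak toward the first statement's mean, which is the "reduced impact" described above.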
I’m sure the dynamics/exact formula will be slightly different with n different observations, but the same logic should still hold.