Weighted regression using number of observations

stefank · December 4, 2017, 2:06pm

Hi all,

I am running a model predicting mean_y of an individual, calculated from multiple observations y of that individual.

for (i in 1:N_obs) {
    pred[i] = a[i] + b[i] * x[i];
    mean_y[i] ~ normal(pred[i], sigma[i]);
}

Every individual is measured a certain number of times. Now, I would like to weight each individual according to its number of observations, using some kind of weight[i]. I was thinking of adapting the model like this:

for (i in 1:N_obs) {
    pred[i] = a[i] + b[i] * x[i];
    target += normal_lpdf(mean_y[i] | pred[i], sigma[i]) * weight[i];
}

However, I read in the linked discussion that using weights is generally not recommended, as it ‘is not a generative model’:
https://groups.google.com/forum/#!topic/stan-users/v4CoBWUehwU

In the same discussion, it is said that this can be modeled if variances vary between observations, which, if I understand it correctly, is the case for my individuals: when there are more observations per individual, the variance generally decreases.

Now, I was wondering:

1. whether I understood the discussion correctly, and whether what I want to model is possible and appropriate;
2. if so, how to implement this in my model.

It is important to note that my individuals are indeed measured repeatedly over time, but I don’t want to weight more recent observations more heavily than earlier ones. I just want to weight according to the number of observations, or some measure related to it.

Thank you for your help!

Dalton · December 4, 2017, 10:58pm

Do you actually have the observed measurement from each occasion on which an individual was measured? Or do you only have the observed mean of multiple measurements, plus the number of measurements that went into it? If you have the individual measurements, there is no need to use weighting. Rather, a multilevel model would be more appropriate.
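
For the case where every raw measurement is available, a multilevel model could look roughly like the sketch below. This is only an illustration, not the model from the original post: the names N, J, id, mu_a, tau_a, a_raw and the prior scales are all assumptions, the slope b is shared across individuals for simplicity, and the array[...] declaration syntax requires a recent Stan version.

```stan
data {
  int<lower=1> N;                      // total number of raw measurements
  int<lower=1> J;                      // number of individuals
  array[N] int<lower=1, upper=J> id;   // which individual each measurement belongs to
  vector[N] x;                         // predictor value for each measurement
  vector[N] y;                         // raw measurements, not per-individual means
}
parameters {
  real mu_a;                 // population-level intercept
  real<lower=0> tau_a;       // between-individual sd of the intercepts
  vector[J] a_raw;           // standardized individual offsets
  real b;                    // slope shared by all individuals (assumption)
  real<lower=0> sigma;       // within-individual measurement sd
}
transformed parameters {
  // non-centered parameterization of the individual intercepts
  vector[J] a = mu_a + tau_a * a_raw;
}
model {
  // placeholder priors; choose scales that suit the data
  mu_a ~ normal(0, 5);
  tau_a ~ normal(0, 2);
  a_raw ~ std_normal();
  b ~ normal(0, 5);
  sigma ~ normal(0, 2);

  // every raw measurement contributes one likelihood term, so individuals
  // with more measurements are automatically estimated more precisely
  y ~ normal(a[id] + b * x, sigma);
}
```

No explicit weight appears anywhere: an individual measured ten times simply contributes ten terms to the likelihood, and partial pooling through tau_a handles individuals with few measurements.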

stefank · December 22, 2017, 12:35pm

Thanks for your answer. You were right: I did not need weighting in the end, and I used a multilevel model instead.

Bob_Carpenter · January 9, 2018, 6:20am

Yes, that’s right. This collects the sufficient statistics (number of observations) and is more efficient than just iterating over all the observations. There’s a section on this in the efficiency chapter of the manual.

That’s when they’re part of the model and not part of the generative story. Here, they’re just part of an efficiency improvement for the implementation.
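
If only the per-individual means and counts are available, the number of observations can enter through the scale of the normal distribution rather than through a multiplier on the log density. The sketch below assumes the measurements within an individual are independent with a common sd sigma, and that x is constant within an individual; the names J and n and the prior scales are hypothetical.

```stan
data {
  int<lower=1> J;             // number of individuals
  vector[J] x;                // one predictor value per individual
  vector[J] mean_y;           // observed mean of each individual's measurements
  vector<lower=1>[J] n;       // number of measurements behind each mean
}
parameters {
  real a;                     // intercept
  real b;                     // slope
  real<lower=0> sigma;        // sd of a single measurement
}
model {
  // placeholder priors; choose scales that suit the data
  a ~ normal(0, 5);
  b ~ normal(0, 5);
  sigma ~ normal(0, 2);

  // the mean of n[j] independent draws with sd sigma has sd sigma / sqrt(n[j])
  mean_y ~ normal(a + b * x, sigma * inv_sqrt(n));
}
```

Compared with multiplying normal_lpdf by n, the quadratic term is the same in both versions, n * (mean_y - pred)^2 / (2 * sigma^2), but the term in sigma differs: the weighted version contributes -n * log(sigma) per individual, while this version contributes -log(sigma). So the two agree on how strongly each mean pulls on a and b, but not on what they imply about sigma.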
