It is quite common for providers of official statistics (at least in Sweden) to publish standard errors (or, more often, margins of error) for proportions. I like to play around with that kind of data, e.g. estimating time trends and such. I would like to discuss how to model data of this kind in a good way, mainly with regard to the restricted domain of proportions.
Let's say I have a small dataset like this:
data_object <- tibble::tribble(
  ~year, ~proportion, ~se,
  2009L, 0.249, 0.0168367346938776,
  2011L, 0.306, 0.0178571428571429,
  2013L, 0.369, 0.0168367346938776,
  2015L, 0.418, 0.0178571428571429,
  2017L, 0.495, 0.0214285714285714
)
(The standard errors were calculated from the margins of error by dividing by 1.96.)
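As a minimal sketch of that conversion: the margins of error below are not published figures, they are simply back-calculated from the standard errors in the table (se * 1.96), and the variable names `moe` and `se` are only illustrative.

```r
# Margins of error back-calculated from the table above (se * 1.96);
# illustrative inputs, not published figures.
moe <- c(0.033, 0.035, 0.033, 0.035, 0.042)

# A 95% margin of error is roughly 1.96 standard errors under a
# normal approximation, so dividing by 1.96 recovers the standard error.
se <- moe / 1.96
```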
What I have had in mind recently is a Gaussian family with a logit link, which is not supported by default in brms, so I have to resort to a non-linear formula:
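library(brms)

# Gaussian likelihood with known standard errors; the inverse logit inside the
# non-linear formula keeps the fitted mean within (0, 1).
meta_proportion_model <- brm(
  data = data_object,
  formula = bf(proportion | se(se) ~ inv_logit(eta),
               eta ~ 1 + s(year, k = 4),
               nl = TRUE),
  family = gaussian(),
  control = list(adapt_delta = 0.99)
)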
This approach seems to work with regard to the domain of a proportion (between 0 and 1): fitted values far outside the range of the data stay between 0 and 1 (although predictions do not, of course). Still, I fear it is somewhat sub-optimal compared to a Beta regression model. I don't know the Beta distribution parametrization used in brms by heart, but I have modeled data of this kind before by deriving the dispersion parameter from the standard error and using an offset to fix it. I am also a bit unsure whether this is the same thing as using a logit link, but I suspect it is.
How would you approach "meta-data" like this?
EDIT: The discussion is of course extendable to other domain-restricted statistics with standard errors.
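A minimal sketch of how such a derivation could look, assuming the mean-precision parametrization of the Beta distribution (shape1 = mu * phi, shape2 = (1 - mu) * phi, so that the variance is mu * (1 - mu) / (1 + phi)). The helper name `beta_precision` is hypothetical, and plugging in the observed proportion for mu is an approximation, since the true mean is unknown.

```r
# Same data as above, as a base R data frame so the sketch is self-contained.
data_object <- data.frame(
  year = c(2009L, 2011L, 2013L, 2015L, 2017L),
  proportion = c(0.249, 0.306, 0.369, 0.418, 0.495),
  se = c(0.0168367346938776, 0.0178571428571429, 0.0168367346938776,
         0.0178571428571429, 0.0214285714285714)
)

# Hypothetical helper: solve se^2 = mu * (1 - mu) / (1 + phi) for phi.
# Assumes the mean-precision parametrization described in the lead-in.
beta_precision <- function(mu, se) {
  mu * (1 - mu) / se^2 - 1
}

data_object$phi <- beta_precision(data_object$proportion, data_object$se)

# Sanity check: a Beta with these shape parameters should have a standard
# deviation equal to the reported standard error.
shape1 <- data_object$proportion * data_object$phi
shape2 <- (1 - data_object$proportion) * data_object$phi
implied_sd <- sqrt(shape1 * shape2 /
                   ((shape1 + shape2)^2 * (shape1 + shape2 + 1)))
all.equal(implied_sd, data_object$se)
```

The resulting `phi` values are what would then be held fixed per observation in a Beta model, rather than estimated.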