Hi, I have some information about my data that I do not quite get how I can introduce into my model. The procedure that would seem simplest is to use varying lower bounds on my parameters, but I am not finding a way to do it, with the solutions I read seeming not applicable.
I am working with COVID data, and each data row represents a country. For each country/row, I have an estimate of the proportion of people that had COVID, obtained from seroprevalence studies. I also have a lower bound on the number of people that got COVID in each location, coming from the number of reported cases (called tested^i)
To introduce the estimated prevalence from the serosurveys, I use as input data the values \alpha^i and \beta^i that define a Gamma distribution over prevalence values for each location i, which I obtain from each serosurvey.
prevalence^i \sim Gamma(\alpha^{i}, \beta^{i})
But due to the large uncertainty in seroprevalence studies, I often have that the distribution over prevalence^i goes below tested^i.
This is all related to estimating the lethality of COVID-19 using the number of deaths for each location deaths^i. A very simplified version of the model would look like this
data {
int<lower=0> N; // number of observations
vector<lower=0>[N] seroprevShape; // Prevalence gamma shape
vector<lower=0>[N] seroprevRate; // Prevalence gamma rate
vector<lower=0>[N] deaths; // outcome count
vector<lower=0>[N] nPop; // population at the country
}
parameters {
// Countries outcome fit
real<lower=0, upper=1> lethality; // mean lethality
real<lower=0, upper=1> lethalitySd;
vector<lower=0, upper=1>[N] prevalence_raw;
}
model {
lethality ~ normal(0.5, 0.1);
lethalitySd ~ normal(0.5, 0.1);
prevalence_raw ~ gamma(seroprevShape, seroprevRate);
for (n in 1:N) {
deaths[n] ~ normal(nPop[n] * prevalence_raw[n] * lethality, lethalitySd);
}
}
Now, what I would like to do is to be able to introduce tested^i for each location as a lower bound to prevalence\_raw^i. The method I have seen suggested elsewhere is to just introduce a vector of lower bounds and add it to prevalence_raw. But this seems to me like it would completely change the estimated prevalences I obtained from data.
Another thought I have, is maybe using a distribution other than Gamma, and then to choose the distribution parameters so that the new distribution have the intended distribution (i.e. same expected value and variance as the serosurveys) after adding the lower bounds tested^i.
I hope this is not too confusing. Any help or suggestions would be greatly appreciated.