Creating a rate from data and an estimated parameter

I am trying to model a rate as a function of a known numerator and a denominator measured with error. When I run my program, I get the same estimate of the rate for all of my observations, but I would prefer to sample a rate for each observation.

I’m trying to use the measurement error model:

x_{true_i} \sim \text{Normal}(x_{observed_i}, \sigma_{observed_i})

then create my rate as:

\text{rate}_i = \text{observed}_i / x_{true_i}

But I only get one estimate for \text{rate}_i, not one for each observation.

Please share your Stan program and accompanying data if possible.


When including Stan code in your post it really helps if you make it as readable as possible by using Stan code chunks (```stan) with clear spacing and indentation. For example, use

```stan
data {
  int Nw;
  vector[Nw] deaths;
  vector[Nw] n;
  vector[Nw] nsd;
}
parameters {
  //real mu_n;
  //real<lower=0> sig_n;
  real theta_n[Nw];
}
model {
  theta_n ~ normal(n, nsd);
}
generated quantities {
  real rate[Nw];
  for (i in 1:Nw) {
    rate[Nw] = deaths[Nw] ./ theta_n[Nw];
  }
}
```

```r
st.dat = list(Nw = 252L,
              deaths = c(536, 123, 727, 331, 78),
              n = c(42959, 15490, 68827, 21538, 8260),
              nsd = c(21669.0214651152, 5749.0795144234, 44115.5037814867,
                      11531.0423423287, 2115.87251751603))
```

I’m getting output of:

```
        mean         se_mean    sd        2.5%        25%         50%         75%
rate[3] -0.009848539 0.01888375 0.8445071 -0.06281141 0.005117066 0.007575042 0.01257936
rate[4] -0.009848539 0.01888375 0.8445071 -0.06281141 0.005117066 0.007575042 0.01257936
rate[5] -0.009848539 0.01888375 0.8445071 -0.06281141 0.005117066 0.007575042 0.01257936
rate[6] -0.009848539 0.01888375 0.8445071 -0.06281141 0.005117066 0.007575042 0.01257936
rate[7] -0.009848539 0.01888375 0.8445071 -0.06281141 0.005117066 0.007575042 0.01257936
rate[8] -0.009848539 0.01888375 0.8445071 -0.06281141 0.005117066 0.007575042 0.01257936
```

All the same. Any ideas would be most welcome.

I believe there is a bug in your code and that what you posted here isn’t actually the code you are running. I tried to run Stan with exactly the model and data you posted and I get:

```
Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) : 
  Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=deaths; position=0; dims declared=(252); dims found=(5)  (in 'model39ec689f2bc9_bbe6cfd48f2b12ab474cbd7d593d1798' at line 3)

failed to create the sampler; sampling not done
```

This is to be expected, as Nw does not match the length of the other input arrays as it should: the vectors have five elements, so Nw should be 5L, not 252L.

Further, your generated quantities block has a bug:

```stan
for (i in 1:Nw) {
  rate[Nw] = deaths[Nw] ./ theta_n[Nw];
}
```

should probably be

```stan
for (i in 1:Nw) {
  rate[i] = deaths[i] / theta_n[i];
}
```

which would explain why you get the same rate everywhere: your loop writes only the last element, rate[Nw], on every iteration, so all the other entries of rate are left uninitialized.

Alternatively, if you switch all your real arrays to vectors, you can use

```stan
generated quantities {
  vector[Nw] rate;
  rate = deaths ./ theta_n;
}
```

Note also that what is usually described as a measurement error model would be something like:

x_{observed_i} \sim \text{Normal}(x_{true_i}, \sigma_{observed_i})

x_{true_i} \sim \text{Some prior}

i.e. the dependency goes in the other direction: the observed values are modeled as noisy measurements of latent true values.
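For the data in this thread, a minimal sketch of that formulation could look like the following. Note that the hyperparameters mu and sigma, and the choice of a normal prior for the latent theta_n, are illustrative assumptions, not something from the original post:

```stan
data {
  int<lower=1> Nw;
  vector[Nw] deaths;
  vector[Nw] n;             // observed (noisy) denominators
  vector<lower=0>[Nw] nsd;  // known measurement-error SDs
}
parameters {
  vector[Nw] theta_n;       // latent true denominators
  real mu;                  // illustrative hyperparameters for the
  real<lower=0> sigma;      // prior on theta_n
}
model {
  // measurement model: observed values given the true values
  n ~ normal(theta_n, nsd);
  // prior on the true values (normal chosen for illustration;
  // priors on mu and sigma omitted here for brevity)
  theta_n ~ normal(mu, sigma);
}
generated quantities {
  vector[Nw] rate = deaths ./ theta_n;
}
```

Partially pooling theta_n through mu and sigma this way also lets the observations inform each other, rather than each theta_n depending only on its own noisy measurement.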


Thank you very much! I got the results I was expecting by changing everything to vectors:


```stan
data {
  int Nw;
  vector[Nw] deaths;
  vector[Nw] n;
  vector[Nw] nsd;
}
parameters {
  vector[Nw] theta_n;
}
model {
  theta_n ~ normal(n, nsd);
}
generated quantities {
  vector[Nw] rate;
  rate = deaths ./ theta_n;
}
```


-Corey