Sample size as a function of needed HDI width


#1

Hello, everybody.

I’ve done some exploratory studies on the weight perception error of fish consumers in my region. From these studies, I model the width of the 95% HDI of mu (HDIwid) as a function of sample size (N) using a power-law relationship ( HDIwid = beta0 * N ^ beta1 ). Below is the Stan model (note that HDIwid = Y and N = X):

data {
  int<lower=1> Ntotal;    // number of simulated samples
  real Y[Ntotal];         // HDI width
  int<lower=1> X[Ntotal]; // sample size (must be integer)
  real Ysd;               // scale of the Student-t likelihood for Y
  real beta0ct;           // intercept central tendency
  real beta0sd;           // intercept standard deviation
  real beta1ct;           // power central tendency
  real beta1sd;           // power standard deviation
}
parameters {
  real<lower=0> beta0 ; // intercept location
                              // unless beta1 is always an integer, beta0 > 0
  real<upper=0> beta1 ; // power location
                              // unless a larger sample size increases HDI width, beta1 < 0
  real<lower=0> nu ;    // degrees of freedom (prediction normality; just a precaution against outliers)
                              // scale parameters seem of little use
}
model {
  beta0 ~ normal( beta0ct , beta0sd ) ;
  beta1 ~ normal( beta1ct , beta1sd ) ;
  nu ~ exponential( 1/30.0 ) ;
  for ( i in 1:Ntotal ) {
    Y[i] ~ student_t( nu , beta0 * pow( X[i] , beta1 ) , Ysd ) ;
  }
}
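
Because the relationship is a power law, it is a straight line on the log-log scale: log(HDIwid) = log(beta0) + beta1 * log(N). The Python sketch below illustrates this with a plain least-squares fit, which could give rough starting values for beta0ct and beta1ct. The coefficients, sample sizes, and noise level are made up for illustration and are not taken from the studies described above.

```python
import math
import random

# Hypothetical coefficients, chosen only to generate illustrative data.
BETA0_TRUE, BETA1_TRUE = 4.0, -0.5

random.seed(1)
sample_sizes = [10, 20, 50, 100, 200, 500, 1000]
# Made-up HDI widths following the power law with small multiplicative noise.
hdi_widths = [BETA0_TRUE * n ** BETA1_TRUE * math.exp(random.gauss(0.0, 0.02))
              for n in sample_sizes]

# Ordinary least squares on the log-log scale:
# the slope estimates beta1, the intercept estimates log(beta0).
x = [math.log(n) for n in sample_sizes]
y = [math.log(w) for w in hdi_widths]
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = math.exp(y_bar - beta1_hat * x_bar)

print(f"beta0 ~ {beta0_hat:.2f}, beta1 ~ {beta1_hat:.2f}")
```

This is only a quick check of the functional form; it assumes multiplicative noise, whereas the Stan model above uses additive Student-t noise.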

If used directly, this model estimates the uncertainty of HDIwid for a given N. However, I think the inverse is more useful for future studies: estimating the uncertainty of N for a required HDIwid.

Question: What would be the Stan model for N ~ HDIwid?

Observations: Currently, I do this by inverting the model coefficients ( beta1Inv = 1/beta1 ; beta0Inv = (1/beta0)^beta1Inv ; N = beta0Inv * HDIwid ^ beta1Inv ) outside Stan. However, since N is a count, I think the resulting uncertainty is incorrect. I tried to invert the Stan code above and use a Poisson distribution for N, but it failed miserably.
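
One possible way to respect the fact that N is a count is to apply the inversion to each posterior draw of (beta0, beta1) and round up, so that every draw yields the smallest integer N whose predicted width is at most the target. The Python sketch below assumes beta0 > 0 and beta1 < 0, as in the model. The function name `required_n` and the stand-in draws are hypothetical; real draws would come from the Stan fit. Note that this propagates only the uncertainty in the coefficients, not the residual Student-t noise.

```python
import math
import random

def required_n(hdi_wid, beta0, beta1):
    """Smallest integer N with beta0 * N**beta1 <= hdi_wid.

    Assumes beta0 > 0 and beta1 < 0. Algebraically the same as
    beta0Inv * hdi_wid**beta1Inv with beta1Inv = 1/beta1 and
    beta0Inv = (1/beta0)**beta1Inv, followed by rounding up.
    """
    n_real = (hdi_wid / beta0) ** (1.0 / beta1)
    return max(1, math.ceil(n_real))

random.seed(2)
# Stand-in posterior draws (hypothetical values, not from a real fit).
draws = [(random.gauss(4.0, 0.1), random.gauss(-0.5, 0.01)) for _ in range(4000)]

target_wid = 0.3  # hypothetical required HDI width
n_draws = sorted(required_n(target_wid, b0, b1) for b0, b1 in draws)

# Simple empirical quantiles of the integer-valued N.
lo, mid, hi = (n_draws[int(q * (len(n_draws) - 1))] for q in (0.025, 0.5, 0.975))
print(f"N for HDIwid={target_wid}: median {mid}, 95% interval [{lo}, {hi}]")
```

Rounding up per draw keeps every value of N an integer, so the summary is a distribution over counts rather than over real numbers.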

Thanks in advance, and sorry for my bad English.


#2

That’s a strong prior on nu toward fat-tailed posteriors.

I don’t understand why the 95% intervals depend on sample size. They depend on the model fit. As you get more data, the MCMC standard error should shrink, but the posterior intervals (centered 95% or highest density) should converge.