Comparison of Treatment to Control for small sample size

Hello Stan Community!
I have two sets of data, one from a survey and one from a new instrument whose accuracy must be determined. The number of observations is very small (n=5). What is the best way to determine whether the data from the instrument are similar to those from the survey with such a small sample, using a Bayesian framework? (Somewhat similar to Dunnett's test in frequentist statistics.)
Thank you in advance!

Can you describe the data a bit more? You mention a survey; do you mean survey in the geographical sense of a single measurement at a single location, or in the polling sense of multiple questions and multiple-choice answers?

Thank you for replying @mike-lawrence
The data are the degree of horizontal curve of road sections. We first measured it by performing a detailed field survey using laser technology, which we treat as the accurate value of the degree of curve. After that, we measured the degree of curve using a new instrument, which is attached to a vehicle and is much faster and cheaper. We need to find out whether this new instrument gives satisfactory values compared to the laser survey.
Both the survey and instrument data were collected on 4 road sections, and each section has a single value of the degree of curve.
Here is my data:

Road_data.csv (87 Bytes)

I apologize that I didn’t make it clear in the beginning.
Thanks!

Ah, in that case there are a few different approaches you could take, but with only 4 observations you're probably not going to be able to distinguish among the options very well in terms of model performance. The simplest model would be one where you convert your observations to a "% error" value at each location. This model would implicitly express the assumption that as the curve magnitude increases, the absolute magnitude of error also increases, which seems to be supported by your data (at least, the largest true value, 23.3, has the largest error by an order of magnitude). I suspect the "% error" scale might also make it easier for you to come up with reasonable priors. Assuming further that errors are unbiased (i.e. that the device doesn't have any consistent under-estimation or over-estimation), in Stan you would code this as:

data{
  int N ; // number of observations
  vector[N] obs_div_true ; // observed divided by true, minus 1 (i.e. the % error at each section)
}
parameters{
  real<lower=0> sigma ; //scale of measurement noise
}
model{
  sigma ~ ... ; //your prior on the scale of measurement noise here
  obs_div_true ~ normal(0,sigma) ;
}
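
If you'd rather pass the raw measurements and let Stan do the conversion to % error itself, here is a minimal sketch of the same model (the data names obs and true_val are just placeholders for however you name the instrument and laser-survey values, and the weibull(2,1) prior is purely an example):

data{
  int N ; // number of road sections
  vector[N] obs ; // instrument measurements (placeholder name)
  vector[N] true_val ; // laser-survey values, treated as ground truth (placeholder name)
}
transformed data{
  // % error at each section: (observed - true)/true, i.e. observed/true - 1
  vector[N] obs_div_true = obs ./ true_val - 1 ;
}
parameters{
  real<lower=0> sigma ; // scale of measurement noise
}
model{
  sigma ~ weibull(2,1) ; // example prior; substitute your own
  obs_div_true ~ normal(0,sigma) ;
}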

Now, your data seem to have errors that are all under-estimates, in which case you might not want to assume unbiased measurement and might instead add a bias term:

data{
  int N ; // number of observations
  vector[N] obs_div_true ; // observed divided by true, minus 1 (i.e. the % error at each section)
}
parameters{
  real mu ; //measurement bias
  real<lower=0> sigma ; //scale of measurement noise
}
model{
  mu ~ ... ; //your prior for measurement bias here
  sigma ~ ... ; //your prior on the scale of measurement noise here
  obs_div_true ~ normal(mu,sigma) ;
}
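
If you fit this biased version, the posterior for mu directly addresses the under-estimation question; for example, you could append a generated quantities block like the following sketch (the indicator name is just illustrative), whose mean across posterior draws estimates the probability of systematic under-estimation:

generated quantities{
  // 1 for posterior draws where the bias is negative (systematic under-estimation), else 0;
  // the mean of this across draws estimates Pr(mu < 0)
  int bias_is_negative = mu < 0 ;
}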

But note that with only 4 data points, you shouldn’t expect the posterior to differ too dramatically from the priors you specify.
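
Given that, it can be worth simulating from any candidate prior before fitting, to see what error magnitudes it implies. A minimal prior-predictive sketch (using a weibull(2,1) prior purely as an example; since it has no parameters block, it would be run with the fixed_param sampler):

generated quantities{
  // one draw of the noise scale from the candidate prior,
  // and one % error that such a scale would imply
  real sigma_prior = weibull_rng(2, 1) ;
  real prior_pred_error = normal_rng(0, sigma_prior) ;
}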

Thanks @mike-lawrence, I am grateful for your efforts.
One last question: how do I draw a conclusion about whether the instrument is providing satisfactory results? Is it by looking at the 2.5% and 97.5% quantiles of sigma and checking whether zero falls between them?
I am new to Stan and Bayesian statistics, so I am sorry if this is a stupid question.
Thank You!

The first model allows you to do inference on the scale of the measurement errors. Zero is a possible value for that scale, but it lies at the edge of the parameter space, so a posterior quantile interval is very unlikely to actually include it. More typical for that scenario is to define at the outset an upper bound on the range of tolerable scale values, then see where the posterior samples for sigma fall with respect to that upper bound.
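
For example, you could pass that bound in as data and track an indicator in generated quantities; here is a sketch appended to the first model (max_tolerable_sigma is a hypothetical name, and weibull(2,1) is just an example prior):

data{
  int N ; // number of observations
  vector[N] obs_div_true ; // % errors, as in the first model
  real<lower=0> max_tolerable_sigma ; // hypothetical name: your upper bound on the acceptable noise scale
}
parameters{
  real<lower=0> sigma ; // scale of measurement noise
}
model{
  sigma ~ weibull(2,1) ; // example prior; substitute your own
  obs_div_true ~ normal(0,sigma) ;
}
generated quantities{
  // the mean of this indicator across posterior draws estimates Pr(sigma < max_tolerable_sigma),
  // i.e. the probability that the noise scale is within your tolerance
  int sigma_is_tolerable = sigma < max_tolerable_sigma ;
}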

If it's too difficult to come up with an upper bound for the scale parameter (a standard deviation), you can instead do posterior predictive inference: generate new simulated "observations" from the posterior, which are values on the % error scale, and then ask what proportion of those predicted observations exceed X% error, where X is the upper bound you choose for the tolerable % error of a given future observation. To do the latter, you would write:

data{
  int N ; //number of observations
  vector[N] obs_div_true ; // observed divided by true, minus 1 (i.e. the % error at each section)
  real<lower=0> max_tolerable_error ; // threshold on the absolute % error
  int num_post_replicates ; // make this something big like 1e4
}
parameters{
  real<lower=0> sigma ; //scale of measurement noise
}
model{
  sigma ~ weibull(2,1) ; //your prior on the scale of measurement noise here
  obs_div_true ~ normal(0,sigma) ;
}
generated quantities{
  real prop_pred_above_criterion ;
  { //local block to avoid saving intermediate values
    vector[num_post_replicates] above_criterion ;
    for(i in 1:num_post_replicates){
      // 1 if the simulated % error exceeds the tolerable threshold, else 0
      above_criterion[i] = abs(normal_rng(0,sigma)) > max_tolerable_error ;
    }
    prop_pred_above_criterion = mean(above_criterion) ;
  }
}

I will work with both models and try to find a reasonable tolerable error. Thank you @mike-lawrence, you are a wonderful person.
