ANOVA with varying effects

Hi,

I assessed the ecological condition (EC) of rivers located inside and outside of six estates (1,…6). EC values are between 0 and 10. These estates are under three management regimes (1,2,3). I also measured EC on two least disturbed areas for reference.

My goal is to compare EC values in estates under different management regimes to determine which regime deliver the best results. I’m also interested in comparing EC under these three regimes with EC on least disturbed areas.

I decided to start with a simple ANOVA type model.

data {
  int<lower=0> N;
   int<lower=0> N_regimes;
  vector[N] EC;
  int<lower=1,upper=N_regimes> regimes[N];
}
parameters {
real<lower=0> sigma;
vector[N_regimes] mu;
}
model {
EC~normal(mu[regimes],sigma);
sigma ~ normal(0,2);  
mu~normal(5,2);
}
generated quantities{
real mu_42= mu[4]-mu[2];
real mu_43= mu[4]-mu[3];
real mu_41= mu[4]-mu[1];
real mu_32= mu[3]-mu[2];
real mu_31= mu[3]-mu[1];
real mu_21= mu[2]-mu[1];    
}

Now I need to improve it. On each estate (both inside and outside) I obtained several EC samples. EC samples inside the same estate are likely to be more similar to each other, therefore I need to incorporate this dependence in the model. I guess I need to add hierarchical priors to account for this dependence, but I’m not sure how I should do it. I have an additional covariate called “Estate” which identifies the location where each EC value was obtained that seems to be a good candidate for inclusion as a varying effect.

Thanks.

Take a look here for lectures walking through various hierarchical models.

Oh, and I think you might get led astray if you think of your model, even the non-hierarchical one you include here, as “ANOVA”. You could in theory compute an F statistic in the generated quantities, but the model itself is not wed to the term “ANOVA”.

I’d recommend trying to fit this with rstanarm or brms instead of directly with Stan. The rstanarm package has a useful vignette for this, and it’s probably the easiest place to start.

Thanks @mike-lawrence @Christopher-Peterson. Before I saw your replies I found a possible solution here: http://www.flutterbys.com.au/stats/tut/tut9.2b.html#h2_23 . I modified my code accordingly and it seems to do what I want it to do.

data {
  int<lower=0> N;
  int<lower=0> N_regime;
  int<lower=0> N_estate;
  vector[N] svap;
  int<lower=1,upper=N_estate> estate_id[N];
  int<lower=1,upper=N_regime> regime_id[N];
}
parameters {
real<lower=0> sigma;
vector[N_regime] mu_r;
vector[N_estate] mu_e;
}

transformed parameters{
vector[N] mu;
mu= mu_r[regime_id]+mu_e[estate_id];
}

model {
  
  //priors
   mu_r~normal(5,2);
   mu_e~normal(5,2);
   sigma~normal(0,2);
   
   //Likelihood
   svap ~ normal( mu , sigma );
   
}
generated quantities{
real svap_pred[N] = normal_rng(mu,sigma);

real mu_r4_3 = mu_r[4] -mu_r[3];
real mu_r3_2 = mu_r[3] -mu_r[2];
real mu_r3_1 = mu_r[3] -mu_r[1];
real mu_r2_1 = mu_r[3] -mu_r[1];

}