# How to incorporate a indicator variable on a hierarchical model?

Hello community,

I am looking to model the effects of mesh size and leaf litter species on the decomposition process on streams. These three variables I want to incorporate into my model as indicator variables of group variables given I just have two states for each one, a kind of three ways ANOVA.
So, I am looking to estimate the decomposition rate (k) for each combination of factors (i.e. Stream, mesh-size, and leaflitter species). I am posting here how I started simulating some data, then my base stan model for just the first variable (mesh)
.
I am having issues figuring out the correct way to write the model.

Given the data I had, I will start with 6 group j of the same place as just replicates of same stream, mesh-size, and leaflitter species.
I will start fixing a temperature of the 6 groups with the following model:

log(k) ~ normal (log(c)+ (-0.65(\frac{1}{K_B \times T_j} - \frac{1}{K_B \times 283.15})), \sigma_j)

T is absolute temperature (K), c is a constant (set to 0.01), K_B is the Boltzmann constant.

temp<- c(5, 5,5,5,8,8,8, 8,13,13,13, 13,15,15,15, 15,20,20,20,20,22,22,22,22)+273.15
mesh <- c("fine","coarse", "fine","coarse","fine","coarse","fine", "coarse","fine", "coarse","fine","coarse", "fine","coarse", "fine","coarse","fine","coarse","fine", "coarse","fine", "coarse","fine","coarse")
mesh_integer <- c(0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1) ## dummy variable for mesh-size
plant <- c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2)
plant_integer <- c(0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1)
b1 <- -0.0033 ## mesh coefficient factor  to alter k

b2 <- - 0.0025 ## plant species coefficient factor to alter k

KB<-8.61e-5 ## Boltzmann constant
log.k.true<-   log (0.01)+ (-0.65 * ((1/(KB*temp)) - (1/(KB*283.15))))
log.k.true.err<-log.k.true+ rnorm(24, 0,0.25)

exp(log.k.true.err)
time<- c(0,0,0,0,5,5, 5,5, 10,10,10,10,20,20,20, 20,30,30,30,30,45,45,45,45,60,60,60,60,75,75,75, 75,90,90,90, 90,120,120,120, 120)

samp_n<-length(time)

stream<-rep(1:6, each=samp_n)
temp_stream <-rep(c(5,8,13,15,20,22)+273.15, each = samp_n)
Kdf<- data.frame(stream=stream, log.k.true.err=log.k.true.err, time= rep(time), mesh, mesh_integer,plant, plant_integer, temp = temp_stream )



Now, I will use that value of K_j to derive leaflitter mass based on the common use exponential decay model M_{t,j}=e^{-k_jt} (Barlocher & Graça, 2020).

Kdf$kstream<- Kdf$log.k.true.err + rnorm(240,0,0.1) #adding error in log, so multiplicative
Kdf$mass <- exp(-((exp(Kdf$kstream)+ (b1*Kdf$mesh_integer)+ (b2*Kdf$plant_integer) )*Kdf$time)) #Since kstream is logged we need to exponentiate first ggplot (Kdf, aes(x= time, y= mass, fill = stream ))+ geom_point()+ facet_wrap(~ stream )+ labs (title = "Temperature", x= "Days", y= "Remaining Mass")+ theme(legend.position="none") ggplot (Kdf, aes(x= time, y= mass, fill = stream ))+ geom_point()+ facet_wrap(~ mesh)+ labs (title = "Mesh size", x= "Days", y= "Remaining Mass")+ theme(legend.position="none") ggplot (Kdf, aes(x= time, y= mass, fill = stream ))+ geom_point()+ facet_wrap(~ plant)+ labs (title = "plant species", x= "Days", y= "Remaining Mass")+ theme(legend.position="none")  ## The Stan Model #taking of 0 days Kdf_1 <-Kdf[Kdf$time != 0,]


## beta distribution model from Bob Hall, 2022
sink("mult_decomp_beta.stan")
cat("
data {
int<lower=1> N; //number of data points
vector[N] time; // time to recover samples
vector[N] mass; // proportion of recovered mass since experiment begin
real KB; // boltzman constant
int<lower = 1> streamID[N]; // replicates id
int<lower = 1> nstreams; // replicates numbers
vector[N] temp; // mean water temperature
int  <lower=0> Mesh_no;  // number of mesh group
int <lower=0> Mesh[N]; // discreate Mesh_size indicator
//int  <lower=0> plant_no;  // number of mesh group
//int  <lower=0, upper = plant_no> Plant [N]; // discreate Mesh_size indicator

}

parameters {
vector <upper = 0> [nstreams] logk; // log(loss mass rate)
real int_k_T; // normalization constant
real slope_k_T; // equivalent to activation energy
real<lower = 0.1> phi; //  dispersion parameter
real <lower = 0> sigma_stream; // error through replicates
vector <lower=0.0000001> [nstreams] b_0;
vector [Mesh_no] b_1;

}

transformed parameters{

vector <lower=0.0000001> [nstreams] k;
//real [Plant] b_2;
vector <lower=0, upper=1>[N] yhat;
vector <lower=0 > [N] alpha;
vector <lower=0 > [N] B;

// varying k

logk = b_0 + b_1*Mesh;
k = exp(logk);
//b_0 = exp(logk)  + b_1[Mesh];
//for (i in 1:N){
//k=  b_0 + b_1[Mesh];
//}

for (i in 1:N){
yhat[i]= 1 * exp(-(k[streamID[i]]*time[i]));
alpha[i] = yhat[i]*phi;
B[i] = (1-yhat[i])*phi;
}
}
model {
// prior
int_k_T ~ normal(0, 0.1);
slope_k_T~normal(0,1);
sigma_stream~normal (0,2);
//k ~ gamma(0.36, 10);
b_0~ normal(-3.32,-2.81);

//likelihood
for (i in 1:N){
mass[i] ~ beta(alpha[i],B[i]); // likelihood
}
for (j in 1:nstreams){
logk[j]~normal( log(int_k_T) + (slope_k_T * ((1/(KB*temp[j])) - (1/(KB*283.15)))) , sigma_stream);
}

}

",fill=TRUE)

sink()