# Coding an interaction term between a categorical variable and a continuous variable in Stan, where the continuous variable is a vector of continuous data

Hello, I am new to Stan and I am trying to model the relationship between fish length and scale size using a measurement error model that accounts for individual-level variation in fish scale size (individual-level variation was measured externally and included in the model as “tau”). In this model I also include a random year intercept, and a categorical effect of fish age. I was able to run this model without interactions, but I am struggling to get the syntax right to code an interaction between the true average scale radius estimated by the model and the categorical variable of fish age. In particular, Stan seems to be having an issue with the fact that “x” (the estimated true mean scale radius) is a vector when I attempt to code the interaction between “x” and fish age.

I would like to know if a) the code below is correctly specifying a continuous X categorical interaction, and if so, b) how I can work around “x” being a vector (whether that requires transforming “x” or using a different coding approach…).

Thanks so much for your time!

``````
data {
int<lower=0> N;                  // number of cases
int j;                           // number of years
int k;                           // number of levels of ocean age
vector[N] fl;                    // log fish forklength
int o_age[N];                    // ocean age (categorical)
int year[N];                     // year ID (categorical)
real<lower=0> tau;               // measurement noise
}
parameters {
vector[N] x;             // unknown true value to be estimated
vector[j] u;             // year intercepts
vector[k] b;             // ocean age effects
vector[k] b2;            // scale radius X ocean age interaction effects
real mu_x;               // prior location
real<lower=0> sigma_x;   // prior scale
real alpha;              // intercept
real beta;               // slope
real<lower=0> sigma;     // outcome noise
real<lower=0> sigma_u;   // year noise
}
model {
x ~ normal(mu_x, sigma_x);     // prior
rad_scale ~ normal(x, tau);    // measured scale radius is a function of the true average scale radius with *tau* variation
fl ~ normal(beta * x + u[year] + b[o_age] + b2[x, o_age], sigma);  // Stan indicates there is a model misspecification here, specifically for the term "b2[x, o_age]"
alpha ~ normal(0, 5);
beta ~ normal(0, 5);
b ~ normal(0, 5);
b2 ~ normal(0, 5);
sigma ~ cauchy(0, 5);
mu_x ~ normal(0,1);   // prior for  estimated true average scale radius
sigma_x ~ cauchy(0,0.1);   // prior for  estimated true average scale radius error
sigma_u ~ cauchy(0, 5);      // prior for random year intercept error
u ~ normal(alpha, sigma_u);        //prior for year random intercept effects
}
generated quantities {
vector[N] nuY;
vector[N] nuX;
for(i in 1:N){
nuY[i] = normal_rng(beta * x[i] + u[year[i]] + b[o_age[i]] + b2[x[i], o_age[i]], sigma);
nuX[i] = normal_rng(x[i],sigma_x);
}

}

``````
1 Like

Very sorry your message fell through :)

Whenever I’m hesitant about raw Stan code I usually write my model in a more high-level language (e.g., `brms` or `rethinking`) and then look at the generated Stan code to see if I’ve understood things correctly. In `brms` you simply call `stancode()` to get the model specification.

Have you tried that?

1 Like

No worries. I got some help managed to get the model to compile and it is currently running. I am waiting to see if it works before posting the code. I learned that Stan does not like to multiply 2 vectors that are oriented the same (i.e., horizontal x horizontal or vertical x vertical). Sometimes all that is needed to make the model run is an apostrophe to transpose a vector.

2 Likes

If it’s of any help: you can do element-wise multiplication of two column vectors (or two row vecters) by using `u .* v`.

3 Likes