Does an individual-specific intercept adequately capture the effect of the reference category for a categorical predictor?

I’m running a simple linear model that has individual-specific intercepts and a categorical variable capturing the effect of the day of the week.

data {
  int<lower=1> N; // number of observations
  int<lower=1> K; // number of weekdays - 1
  matrix[N,K]  X; // design matrix including weekdays (except from the reference category Tuesday)
  vector[N]    y; // outcome
  int<lower=0> id[N];  // identifier of individuals
  int<lower = 1> H; // number of individuals
}
parameters {
  real alphai[H];  
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  for (n in 1:N){
    y[n] ~ normal(alphai[id[n]] + X[n]*beta, sigma_y);
  }
  // priors
}

I construct the design matrix X using the following code:

X <- model.matrix(outcome ~ weekday, data = da)[,-1]

Thus, I do not have an (additional) intercept in the predictor design matrix, and it looks like this (6 columns with dummary variables for all weekdays except from Tuesday, which is the reference category):

When I now make predictions from the posterior output of the model, the prediction for the reference category (in which I set all predictors included in X to zero) looks strange:
grafik

I get the predictions using (ex post in R):

tuesday = c(0,0,0,0,0,0)
y_tuesday = mean(alphai) + tuesday  %*% beta
monday = c(0,0,0,1,0,0)
y_monday = mean(alphai) + monday  %*% beta
etc.

My questions are:

  1. Do I need to add a global intercept to the model (capturing the effect of the reference category) to make correct predictions from the model for the reference category? The mean of the individual-specific intercepts may not adequately reflect the effect of the reference category?
  2. With my model only including the individual-specific intercept alphai, can I somehow get correct predictions for my reference category?

I checked that Tuesdays should not be so much different from other days summarizing the mean of my outcome of interest for different weekdays and running simple non-Bayesian models, so something is indeed wrong with my predictions.

Thanks in advance.

I think what this model is doing is effectively giving a different effect for Tuesday on your outcome for every individual. If you think of a basic model without an intercept for each id and just a single parameter alpha, then alpha as the Intercept, would be the coefficient for Tuesday. Since you have an intercept for every id in the model, I think you are parameterizing the model to have a different estimate for the effect of Tuesday for each person on your outcome, while all the other days have the same effect across individuals for your outcome. It seems like you should probably include Tuesday in the design matrix if you want to have a different intercept for every person and then additive effects for each day of the week. Note that this model doesn’t have any partial pooling for these intercepts. Did you mean to do that? Just checking, since I haven’t seen a model like this.

1 Like

Thank you, this makes totally sense. So the conclusion would be that I cannot isolate the effects of the weekdays with my current model since they are captured with the same parameters as the individual-specific intercept.

Regarding the partial pooling, I was unsure whether I need to do it. Imagine I now inlcude Tuesday in the design matrix (so it acts as an intercept), and then I use partial pooling for the individual-specific intercepts:

data {
  int<lower=1> N; // number of observations
  int<lower=1> K; // number of weekdays (including the reference category)
  matrix[N,K]  X; // design matrix including weekdays (including the reference category Tuesday)
  vector[N]    y; // outcome
  int<lower=0> id[N];  // identifier of individuals
  int<lower = 1> H; // number of individuals
}
parameters {
  real alpha0;
  real<lower = 0> sigma_alpha;
  vector[H] alphai;

  vector[K] beta;
  real<lower=0> sigma;
}
model {

  for (h in 1:H){
    alphai[h] ~ normal(alpha0, sigma_alpha);
  }
  
  alpha0 ~ normal(0, 1);
  sigma_alpha ~ normal(0, 1);

  for (n in 1:N){
    y[n] ~ normal(alphai[id[n]] + X[n]*beta, sigma_y);
  }
  

Wouldn’t then the parameter capturing the effect of Tuesday (let’s say beta[1]) and the mean of the individual-specific intercepts (alpha0) capture the same thing?

I think so. If that’s how you want the model, that’s ok, but I don’t think that you are going to get the Tuesday predictions that you think you get by averaging over them. I was specifically trying to address your predictions using the coefficients.

So let’s say you have (writing it out to make it explicit instead of using the X matrix):

y[n] ~ normal(alphai[id[n]] + mon[n]*beta1 + tues[n]*beta2 + wed[n]*beta3 + thurs[n]*beta4 + fri[n]*beta5 + sat[n]*beta6 + sun[n]*beta7, sigma_y);

Then the intercepts by the themselves have no true meaning, as you have a dummy variable for every day of the week. But, I think you would get predictions closer to what you want using your method of predicting with coefficients, because then each intercept represents the baseline outcome with each day of the week as an offset from that. If you want the intercept to have meaning, then you can make an intercept that is the reference level Tuesday:

y[n] ~ normal(alpha + alphai[id[n]] + mon[n]*beta1 + wed[n]*beta2 + thurs[n]*beta3 + fri[n]*beta4 + sat[n]*beta5 + sun[n]*beta6, sigma_y);

And then alpha is Tuesday and alpha[id] are the vector of offsets.
I don’t think averaging over the estimated effects of Tuesday (your individual intercepts) like you do in y_tuesday = mean(alphai) will get you the average Tuesday that you are looking for…

But try them out for yourself and make the predictions. I could be missing something.

1 Like

Thank you so much, this is very useful.