# Interpreting multinomial logit coefficients

Hi all,

I’m wondering how to interpret the coefficients of a multinomial logit regression and if there is a better way to parameterize the model to make it more interpretable. I have a 4-category outcome variable. I have three predictors: a dichotomous gender indicator (1=male), a 5-category variable that defines ranges for the number of books in the home, and a continuous scale of motivation to read. The code follows and runs just fine with no problems.

``````# Read in relevant packages #

library(rstan)
library(bayesplot)
library(magrittr)
library(dplyr)
library(loo)
options(mc.cores = 4)

# male:  (1=male)

# booksHome: # of books in home 1) 0-10, 2) 11-25, 3) 26-100, 4) 101-200 5) More than 200

# ASBGSMR:  Students motivated to read (4pt, strongly agree to strongly disagree)

# ASBR07A:  I do well in reading  (4pt, strongly agree to strongly disagree)

# Some recoding to reduce the sample size, rename variables, remove missing data (listwise) and
# recode sex

select(ASBR07A, Male, ASBG04,ASBGSMR) %>%
rename(DOWELL = ASBR07A,
Male=Male,
booksHome = ASBG04,

DOWELL <- as.factor(DOWELL)

f <- as.formula("DOWELL ~ Male + booksHome + motivRead")
data.list <- list(n=nrow(m),
d=ncol(m),
x=m,

modelString <- "
data {
int<lower = 2> k;
int<lower = 0> n;
int<lower = 1> d;
int <lower=1, upper=k> DOWELL[n];
matrix[n, d] x;
}
parameters {
matrix[d, k] beta;
}
transformed parameters {
matrix[n, k] x_beta= x * beta;
}
model {
to_vector(beta) ~ normal (0,2);
for (i in 1:n) {
DOWELL[i] ~ categorical_logit(x_beta[i]');
}
}
generated quantities {
int <lower=1, upper=k> DOWELL_rep[n];
vector[n] log_lik;
for (i in 1:n) {
DOWELL_rep[i] = categorical_logit_rng(x_beta[i]');
log_lik[i] = categorical_logit_lpmf(DOWELL[i] |x_beta[i]');
}
}
"
# Start estimation
nChains = 4
nIter= 10000
thinSteps = 10
burnInSteps = floor(nIter/2)

DOWELL = data.list\$DOWELL

MultiNomRegFit = stan(data=data.list,model_code=modelString,
iter=nIter,warmup=burnInSteps,thin=thinSteps)

# Stan plots

stan_plot(MultiNomRegFit, pars=c("beta"))
stan_trace(MultiNomRegFit,inc_warmup=T, pars=c("beta"))
stan_dens(MultiNomRegFit, pars=c("beta"))
stan_ac(MultiNomRegFit, pars=c("beta"))

# Stan output

print(MultiNomRegFit,pars=c("beta"))

# Posterior predictive checks

DOWELL_rep = rstan::extract(MultiNomRegFit,"DOWELL_rep ")\$DOWELL_rep
ppc_bars(y=DOWELL, yrep= DOWELL_rep)

log_lik1 = loo::extract_log_lik(MultiNomRegFit, merge_chains = FALSE)
loo_catLogit_1 = loo(log_lik1)
print(loo_catLogit_1)
``````

The output is as follows

``````Parameter Mean SEMean SD 95% PPI ESS rhat
beta[1,1]  -2.14 0.02 1.04 -4.16– -0.11 1867 1
beta[1,2]  1.09 0.02 1.04 -0.96– 3.16 1836 1
beta[1,3]  0.98 0.02 1.04 -1.08 – 3.09 1883 1
beta[1,4]  -0.07 0.02 1.06 -2.11 – 1.98 1865 1
beta[2,1]  0.54 0.03 1.04 -1.46 – 2.51 1665 1
beta[2,2]  0.02 0.03 1.04 -1.96 – 2.02 1706 1
beta[2,3] -0.28 0.03 1.05 -2.28 – 1.73 1700 1
beta[2,4] -0.23 0.03 1.05 -2.24 – 1.78 1670 1
beta[3,1] 0.19 0.02 0.99 -1.72 – 2.11 1880 1
beta[3,2] -0.04 0.02 0.99 -1.99 – 1.88 1873 1
beta[3,3] -0.05 0.02 0.99 -1.96 – 1.88 1896 1
beta[3,4] -0.38 0.02 0.99 -2.30 – 1.51 1865 1
beta[4,1] 0.17 0.02 1.01 -1.80 – 2.15 1774 1
beta[4,2] -0.10 0.02 1.01 -2.06 – 1.88 1767 1
beta[4,3] -0.12 0.02 1.01 -2.12 – 1.85 1770 1
beta[4,4] 0.10 0.02 1.01 -1.85 – 2.06 1773 1
``````

I understand the meaning of betas in multinomial logit regression, but in this case I am unsure what the betas indices (e.g. beta[1,1] and so on) are referring to. Is there a better way to parameterize this model so that the betas are more meaningful.

Thanks

David

First, I’d think about rearranging `x` and `beta` so you don’t have to transpose in the density. It’d be even better to collect an array of the `x_beta[i]` and of `DOWELL` and then vectorize the whole thing. But that’s just efficiency, not interpretability.

It sounds like you have discrete predictors, but you seem to be using something like a continuous encoding of them in `x`. The usual way to code your predictors for regression would be this way:

``````data {
int<lower=0> N;  // observations
// predictors
array[N] int<lower=0, upper=1> male;  // assuming 1 if male, 0 if female
array[N] int<lower= 1, upper=5> books;
vector[N] motivation;
``````

and then the parameters look like this:

``````parameters {
vector[4] alpha;
vector[4] beta_sex;
array[5] vector[4] beta_books;
vector[4] beta_motivation;
array[N] int<lower=1, upper=4> dowell;
``````

and then the likelihood looks like this:

``````for (n in 1:N)
dowell[n] ~ categorical_logit(alpha + beta_sex * sex[n] + beta_books[books[n]] + beta_motivation * x[n]);
``````

There are lots of choices here for parameterizations when you get into multiple outcomes. For more efficiency, you can pack the first three terms into a matrix indexed by sex and books.

Now the `alpha` is an intercept, `beta_sex` the male effect (female rolled into intercept), and `beta_books[j]` the factors for having j units of books, and so on.

The problem is that `categorical_logit` is only identified up to an additive constant. You can handle this by pinning one of the categories to 0 and having everything else be relative, or you can let the priors sort it out. Or you can set the coefficients for the random effects so that they sum to 0 and everything else gets pushed into the intercept.

1 Like

Thanks, Bob,

So, I’m trying reconcile the code you sent me with the discussion in the Stan Users guide
1.6 Multi-logit regression | Stan User’s Guide on identifiability in the multinomial logit regression. The code I’m trying is this

``````modelString <- "
data {
int<lower = 2> k;
int<lower = 0> n;
int<lower = 1> d;
int <lower=1, upper=k> DOWELL[n];
matrix[n, d] x;
}
transformed data {
row_vector[d] zeros = rep_row_vector(0, d);
}
parameters {
matrix[k - 1, d] beta_raw;
}
transformed parameters {
matrix[k, d] beta;
beta = append_row(beta_raw, zeros);
}
model {
to_vector(beta) ~ normal (0,2);
for (i in 1:n) {
DOWELL[i] ~ categorical_logit(x_beta[i]');
}
}

generated quantities {
int <lower=1, upper=k> DOWELL_rep[n];
vector[n] log_lik;
for (i in 1:n) {
DOWELL_rep[i] = categorical_logit_rng(x_beta[i]');
log_lik[i] = categorical_logit_lpmf(DOWELL[i] |x_beta[i]');
}
}
"
``````

The argument for the `categorical_logit` call is incorrect, an throwing an error message that x_beta[i]’ doesn’t exist. I’m not sure what I put into the argument.

David