Getting predictions for multinomial model using brms

I’m trying to take a choice-modeling example that works in a frequentist framework and move it to a Bayesian framework via the brms package. This is an example from a textbook co-authored by @eleafeit:

Background of frequentist approach

Here’s the data:

# Read in data
cbc.df <- read.csv("http://goo.gl/5xQObB",
                   colClasses = c(seat = "factor", price = "factor",
                                  choice="integer"))

# Format
cbc.df$eng <- factor(cbc.df$eng, levels=c("gas", "hyb", "elec"))

I next loaded a package to run a frequentist version of the choice model (mlogit package) and formatted the data for that package:

# Add library
library(mlogit)

# Format data
cbc.mlogit <- mlogit.data(data=cbc.df, choice="choice", shape="long",
                          varying=3:6, alt.levels=paste("pos",1:3), id.var="resp.id")

# Fit model
m1 <- mlogit(choice ~ 0 + seat + cargo + eng + price, data = cbc.mlogit)
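
For reference (my addition, not part of the book excerpt), the fitted part-worths can be inspected with summary(); these are the values the share-prediction function below pulls out via model$coef:

# Inspect the estimated part-worths (the model$coef values used below)
summary(m1)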

So what I’m trying to do now is figure out how to get the preference shares for different combinations of the features seat, cargo, eng, and price. This is the function that is used in the textbook I’m working from:

# Predicting shares
predict.mnl <- function(model, data) {
  # Function for predicting shares from a multinomial logit model 
  # model: mlogit object returned by mlogit()
  # data: a data frame containing the set of designs for which you want to 
  #       predict shares.  Same format as the data used to estimate model. 
  data.model <- model.matrix(update(model$formula, 0 ~ .), data = data)[ , -1]
  utility <- data.model%*%model$coef
  share <- exp(utility)/sum(exp(utility))
  cbind(share, data)
}

So I can take m1 and get predicted preference shares for a chosen set of product profiles:

# Define combinations of variables
attrib <- list(seat  = c("6", "7", "8"),
               cargo = c("2ft", "3ft"),
               eng   = c("gas", "hyb", "elec"), 
               price = c("30", "35", "40"))

# Select some of those combinations to get predictions for
(new.data <- expand.grid(attrib)[c(8, 1, 3, 41, 49, 26), ]) # find attrib at top

# Generate shares
predict.mnl(m1, new.data)
        share seat cargo  eng price
8  0.11273356    7   2ft  hyb    30
1  0.43336911    6   2ft  gas    30
3  0.31917819    8   2ft  gas    30
41 0.07281396    7   3ft  gas    40
49 0.01669280    6   2ft elec    40
26 0.04521237    7   2ft  hyb    35

So you can see that the shares sum to 1.0 and that each row predicts the proportion of people who would choose that option if presented with this set of alternatives.
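
As a quick sanity check (my addition, not from the book), the softmax construction guarantees this:

# The shares are a softmax over the utilities, so they sum to 1 by construction
sum(predict.mnl(m1, new.data)$share)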

My Question: How do I get shares if I fit the model in brms?

So I’m pretty sure that it’s straightforward to fit the Bayesian equivalent of model m1 in brms:

b1 <- brm(choice ~ 0 + seat + cargo + eng + price, data=cbc.df,
          family="categorical", chains = 2, cores = 2)

But can anyone help me determine how I can get shares from this model? How can I do the equivalent of what’s being accomplished in this function (from above)?

# Predicting shares
predict.mnl <- function(model, data) {
  # Function for predicting shares from a multinomial logit model 
  # model: mlogit object returned by mlogit()
  # data: a data frame containing the set of designs for which you want to 
  #       predict shares.  Same format as the data used to estimate model. 
  data.model <- model.matrix(update(model$formula, 0 ~ .), data = data)[ , -1]
  utility <- data.model%*%model$coef
  share <- exp(utility)/sum(exp(utility))
  cbind(share, data)
}

Hi Andrew,

I’m not a brms user, so I’m not familiar with the brm output and how you would use that to write a share prediction function. But @Kevin_Van_Horn and I wrote a tutorial showing how to run these models directly in Stan, which you might find useful. We show how to run the same model as
m1 <- mlogit(choice ~ 0 + seat + cargo + eng + price, data = cbc.mlogit)
in this file. There, we have a function for computing shares very similar to the one you quote from the book:

shares.mnl.point <- function(beta, # vector of parameters (part-worths)
                             X) {  # attribute matrix X for scenario (coded)
  if (length(beta) != ncol(X)) stop("length of beta doesn't match columns in X")
  V <- exp(X %*% beta)
  data.frame(shares=V/sum(V), X)
}

One of the nice things you can do in the Bayesian context is compute the shares for each posterior draw of the parameters. In the tutorial, we do this using another function that takes as input the draws for the model parameters (which come directly from Stan) and the matrix of new products (X).

shares.mnl.post <- function(draws, # matrix of draws (use extract())
                            X) {   # attribute matrix X for scenario
  shares <- matrix(NA, nrow=nrow(draws), ncol=nrow(X))
  for (draw in 1:nrow(draws)) {
    shares[draw,] <- shares.mnl.point(draws[draw,], X)[,1]
  }
  shares
}

As you can see above, this function loops over the draws and computes the shares for each draw. This gives you a set of posterior draws for the shares, which you can then summarize and use to make statements like, “The probability the new product will achieve share greater than 20% is 0.75.” (This is a bit trickier to do in the classical framework, which is why I left it out of the book.) In the tutorial, we compute the posterior quantiles of the shares for each product in the X matrix:

shares.draw <- shares.mnl.post(beta.draws, choc.standata$X[[1]])
apply(shares.draw, 2, quantile, probs=c(0.5, 0.025, 0.975))
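
To make the kind of probability statement mentioned above, you can simply count draws; a small sketch, assuming the shares.draw matrix from the line above (one column per product in X):

# Posterior probability that the first product's share exceeds 20%
mean(shares.draw[, 1] > 0.20)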

Sorry I can’t help you more specifically with the brms output, but I hope this gets you going in the right direction.

You can do this with pretty minimal changes to the function; you just need to change how the coefficients are extracted:

utility <- data.model %*% fixef(model)[,1]

This extracts the posterior mean of each parameter and uses it to construct the estimated probabilities.
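
For reference, a quick sketch (assuming your b1 fit above samples as written): fixef() with the default summary returns one row per coefficient, and column 1 is the posterior mean that the line above indexes.

# Posterior summaries of the coefficients; fixef(model)[, 1] picks out
# the "Estimate" (posterior mean) column used above
fixef(b1)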

However, one of the main draws of Bayesian analysis is the quantification of uncertainty, so you might also want to construct credible intervals for each predicted probability. You would do this via:

# Extract the coefficient draws from each posterior iteration
draws <- fixef(model, summary = FALSE)

probs <- apply(draws, 1,
               function(x){
                 # Construct the logits for each alternative at this iteration
                 logits <- data.model %*% x
                 # Convert logits to probabilities
                 exp(logits)/sum(exp(logits))
               })
cbind(
    # Calculate the mean probability for each alternative
    rowMeans(probs),
    # Calculate 95% intervals for each alternative
    t(apply(probs, 1, quantile, probs=c(0.025,0.975))),
    # Prepend to the new data
    data
)

Thanks for the response @andrjohns!

One question about this point in your response:

I tried updating the predicting shares function with your suggestion:

# Predicting shares
predict.mnl <- function(model, data) {
  # Function for predicting shares from a multinomial logit model 
  # model: brmsfit object returned by brm()
  # data: a data frame containing the set of designs for which you want to 
  #       predict shares.  Same format as the data used to estimate model. 
  data.model <- model.matrix(update(model$formula, 0 ~ .), data = data)[ , -1]
  utility <- data.model %*% fixef(model)[,1]
  share <- exp(utility)/sum(exp(utility))
  cbind(share, data)
}

However, I get the following error:
Error in terms.default(object, data = data) : no terms component nor attribute

Can you immediately see some other adjustment that I might need for the function?

Thanks so much for your response @eleafeit!

While I haven’t fully dug into the complexity of using Stan directly, your tutorials might be enough to nudge me over the edge. Thanks for sharing!

One of the biggest hurdles might be that I’m going to try to fit my choice model and apply MRP to adjust for some unavoidable differences between my sample and the population. Figuring out how to extend your Stan code to MRP might be a big ask, but I might be able to pull it off :)

Even if I can’t do full out MRP, do you have any advice on extending this code so I can include covariates as predictors?

Thanks again for your time!

Wow, I’d like to see a choice model with MRP. Keep me posted.

You wrote:

Even if I can’t do full out MRP, do you have any advice on extending this code so I can include covariates as predictors?

I’m not sure what you mean by “covariates”. Do you want to incorporate features of the decision makers? You might be looking for this chapter of the tutorial:

Will keep you posted. It might be a fool’s errand, but it would be handy, wouldn’t it? :)

Exactly, I mean features of the decision maker like age, sex, etc. I didn’t realize that chapter of the tutorial had info on how to include non-attribute predictors in the model. Will check it out!

Oh right, I forgot about the formula. Because brms works in a very general manner, you can’t (AFAIK) extract the model formula in a way that’s compatible with model.matrix. Instead, you’ll have to provide it when you call the function. Try:

# Predicting shares
predict.mnl <- function(model, data, formula_string) {
  # Function for predicting shares from a multinomial logit model 
  # model: brmsfit object returned by brm()
  # data: a data frame containing the set of designs for which you want to 
  #       predict shares.  Same format as the data used to estimate model. 
  # formula_string: the predictor part of the model formula, e.g. "~ 0 + ."
  data.model <- model.matrix(as.formula(formula_string), data = data)
  utility <- data.model %*% fixef(model)[,1]
  share <- exp(utility)/sum(exp(utility))
  cbind(share, data)
}

Which you would then call using:

predict.mnl(b1, new.data, "~ 0 + .")

Where ~ 0 + . indicates that you want to include all variables in the dataset as predictors, without an intercept.
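
For completeness, here is an untested sketch that folds the credible-interval code from my earlier reply into the same function; the helper name predict.mnl.post is just for illustration, and it assumes the columns of the design matrix line up with the coefficients returned by fixef().

# Predicting shares with posterior means and 95% intervals (illustrative sketch)
predict.mnl.post <- function(model, data, formula_string = "~ 0 + .") {
  # model: brmsfit object returned by brm()
  # data: a data frame of designs to predict shares for
  # formula_string: predictor part of the formula used to build the design matrix
  data.model <- model.matrix(as.formula(formula_string), data = data)
  # One row per posterior draw, one column per coefficient
  draws <- fixef(model, summary = FALSE)
  # Shares for every draw: softmax of the utilities
  probs <- apply(draws, 1, function(x) {
    logits <- data.model %*% x
    exp(logits) / sum(exp(logits))
  })
  cbind(share = rowMeans(probs),
        t(apply(probs, 1, quantile, probs = c(0.025, 0.975))),
        data)
}

# Called the same way as before
predict.mnl.post(b1, new.data, "~ 0 + .")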