Post on ranked random coefficients logit

I put together a post on ranked choice random coefficients logit, which comes up a lot in market research. Filing it here to make it searchable.



Thanks for sharing. Can this be used for MaxDiff? I use Sawtooth and R right now, but will read your post in greater detail and comeback with questions.


Yeah, you could implement Sawtooth’s version of MaxDiff by modifying the best choice code. Remember that softmax(V) describes the probability each element of the vector U = V + e (where e is Gumbel distributed) is the maximum. If we want the probability that an element of U is the worst, then you might think that we could take the softmax of the negative of V, but that’s actually wrong, because e is not symmetrically distributed. In the Sawtooth whitepaper they say that it “works well in practice” but I’d be cautious. This little code snippet will show you how wrong it can be.

rgumbel <- function(n, mu = 0, beta = 1) mu - beta * log(-log(runif(n)))
V <- rnorm(10)

n_rep <- 10000
U <- sapply(1:n_rep, function(i) V + rgumbel(10))
max_choice <- apply(U, 2, which.max)
min_choice <- apply(U, 2, which.min)
max_proportions <- unlist(lapply(1:10, function(i) sum(max_choice == i)))/n_rep
min_proportions <- unlist(lapply(1:10, function(i) sum(min_choice == i)))/n_rep

softmax <- function(x) exp(x)/sum(exp(x))

# Maximum probabilities 
plot(as.numeric(max_proportions), softmax(V))
abline(0, 1)

# The minimum probabilities given by softmax(-V) versus the (more correct) simulated minimums.
plot(as.numeric(min_proportions), softmax(-V))
abline(0, 1)

In any case, to implement the Sawtooth version of MaxDiff, which you shouldn’t, you can do the following. First, add another dummy vector for worst_choice describing worst choices in each task, then modify the model chunk:

model {
  // create a temporary holding vector
  vector[N] log_prob;
  vector[N] log_prob_worst;
  // priors on the parameters
  tau ~ normal(0, .5);
  beta ~ normal(0, .5);
  to_vector(z) ~ normal(0, 1);
  L_Omega ~ lkj_corr_cholesky(4);
  to_vector(Gamma) ~ normal(0, 1);
  // log probabilities of each choice in the dataset
  for(t in 1:T) {
    vector[K] utilities; // tmp vector holding the utilities for the task/individual combination

    // add utility from product attributes with individual part-worths/marginal utilities
    utilities = X[start[t]:end[t]]*beta_individual[task_individual[t]]';
    log_prob[start[t]:end[t]] = log_softmax(utilities);
    log_prob_worst[start[t]:end[t]] = log_softmax(-utilities);
  // use the likelihood derivation on slide 29
  target += log_prob' * choice;
  // increase the log probability for worst choices
  target += log_prob_worst' * worst_choice;


thanks a lot for the detailed response. I will examine and get back with any questions.