Help implementing Plackett-Luce model in Stan with ties, ranked predictors, and rater covariates

Zach · April 30, 2025, 2:31pm

Hi all,

I’m working on implementing a Plackett-Luce model in Stan and have run into a few issues, particularly around modeling ties in the rankings and including multiple ranked predictors and covariates for raters and participants.

I’ve used the following resources to get me this far:

github.com/bob-carpenter/case-studies, sushi-rating/plackett-luce.stan (master branch):

functions {
  real plackett_luce_lpmf(array[] int y, vector beta) {
    vector[size(y)] beta_y = beta[y];
    return sum(log(beta_y ./ cumulative_sum(beta_y)));
  }
}
data {
  int<lower=1> I;                       // # items
  int<lower=1> K;                       // # items ranked per rater
  int<lower=1> R;                       // # raters
  array[R, K] int<lower=1, upper=I> y;  // rankings (y[r, 1] > y[r, 2] > ...)
}
parameters {
  simplex[I] beta;                      // item quality
}
model {
  for (r in 1:R) {
    y[r] ~ plackett_luce(beta);
  }
}
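
For comparison, here is a minimal sketch of the same likelihood written on the log scale, assuming each ranking lists item indices best first. Under that convention the k-th factor divides by the total worth of the items at positions k through K, that is, the items not yet placed. A forward `cumulative_sum` produces those denominators when the list is read worst first. The names `pl_best_first_lpmf` and `log_worth`, and the normal prior, are illustrative assumptions and not part of the case study.

```stan
functions {
  // Log probability of one ranking under Plackett-Luce.
  // Assumes y lists item indices best first and log_worth holds log item worths.
  real pl_best_first_lpmf(array[] int y, vector log_worth) {
    int K = size(y);
    vector[K] lw = log_worth[y];
    real lp = 0;
    for (k in 1:K) {
      // the item at position k is chosen among the items at positions k..K
      lp += lw[k] - log_sum_exp(lw[k:K]);
    }
    return lp;
  }
}
data {
  int<lower=1> I;                       // # items
  int<lower=1> K;                       // # items ranked per rater
  int<lower=1> R;                       // # raters
  array[R, K] int<lower=1, upper=I> y;  // rankings, best first (assumed)
}
parameters {
  vector[I] log_worth;                  // log item quality
}
model {
  // assumed prior; it also softly fixes the additive constant that the
  // likelihood alone cannot identify
  log_worth ~ normal(0, 1);
  for (r in 1:R) {
    y[r] ~ pl_best_first(log_worth);
  }
}
```

Working with log-worths and `log_sum_exp` avoids the simplex constraint and tends to be more stable numerically than dividing raw worths.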

What I’m trying to do:

Each rater ranks a subset of participants on a set of traits (e.g., predictor_trait1, predictor_trait2, …, trait_DV).
The outcome is a ranking (trait_DV), and I want to model it as a function of rankings on other traits (predictor_trait1, etc.).
I also have rater-level covariates (e.g., rater_gender) and participant-level covariates (e.g., participant_gender) that I’d like to include as fixed effects, and optionally their interaction.
Some raters tied participants on some traits, meaning they assigned the same rank to multiple participants.

Data format:

The data is currently in long format like this:

rater_id	participant_id	trait	rank	rater_gender	participant_gender
r1	p1	predictor_trait1	1	male	female
r1	p2	predictor_trait1	2	male	male
r1	p3	predictor_trait1	2	male	female
r1	p1	trait_DV	1	male	female
…	…	…	…	…	…

# toy data
data <- tibble::tibble(
  participant_id = c(101, 102, 103, 104, 101, 102, 103, 104),
  trait          = c("predictor_trait1_IV", "predictor_trait1_IV", "predictor_trait1_IV", "predictor_trait1_IV",
                     "trait_DV", "trait_DV", "trait_DV", "trait_DV"),
  rater_id       = c(1, 1, 1, 1, 1, 1, 1, 1),
  rater_group    = c(1, 1, 1, 1, 1, 1, 1, 1),
  participant_group = c(0, 0, 1, 1, 0, 0, 1, 1),
  rank           = c(1, 2, 2, 4, 1, 3, 2, 4)  # ties allowed
)

What I’m not sure about:

Ties: Some raters assign the same rank to multiple participants. Can the Plackett-Luce model in Stan handle this natively? If not, is there a common workaround?
Predictors: I want to model trait_DV rankings as a function of the other ranked traits — all on the ranking scale. Does multiplying each trait’s rank by a corresponding beta coefficient make sense, or is there a better way to structure that part of the model?
General model structure: Does the following Stan structure make conceptual sense? I’m currently modeling the outcome as a categorical rank using trait-based rankings as predictors. I’d like to know if this approach reasonably approximates a Plackett-Luce model or if I should move toward a full ranking-based likelihood with support for ties.

Example model:

data {
  int<lower=1> n_obs;
  int<lower=1> n_traits;
  int<lower=1> n_raters;

  array[n_obs] int<lower=1> trait_DV_rank;               // outcome: ranking of DV trait
  array[n_obs] real rank;                                // predictors: ranks of other traits
  array[n_obs] int<lower=1, upper=n_traits> trait_id;    // which predictor trait this row is
  array[n_obs] int<lower=1, upper=n_raters> rater_id;    // rater identity
  array[n_obs] int<lower=0, upper=1> rater_gender;
  array[n_obs] int<lower=0, upper=1> participant_gender;
}

parameters {
  real alpha;
  vector[n_traits] beta;
  real beta_rater_gender;
  real beta_participant_gender;
  real beta_interaction;

  vector[n_raters] rater_intercepts;
  real<lower=0> sigma_rater;
}

model {
  vector[n_obs] eta;

  beta ~ normal(0, 1);
  alpha ~ normal(0, 1);
  beta_rater_gender ~ normal(0, 1);
  beta_participant_gender ~ normal(0, 1);
  beta_interaction ~ normal(0, 1);
  rater_intercepts ~ normal(0, sigma_rater);
  sigma_rater ~ cauchy(0, 1);

  for (i in 1:n_obs) {
    eta[i] = alpha +
             beta[trait_id[i]] * rank[i] +
             beta_rater_gender * rater_gender[i] +
             beta_participant_gender * participant_gender[i] +
             beta_interaction * rater_gender[i] * participant_gender[i] +
             rater_intercepts[rater_id[i]];
  }

  trait_DV_rank ~ categorical_logit(eta);  // Likelihood
}

Bob_Carpenter · May 2, 2025, 9:34pm

I think it makes sense when thinking about Plackett-Luce to back up and think about Bradley-Terry. If you can figure out what to do for ties there, then it should be easy to promote that from pairwise to K-wise rankings. There’s a literature on how to do this that I haven’t read.
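
To make the pairwise-to-K-wise link concrete, the block below writes out the Plackett-Luce probability of a best-first ordering with worths β, then its K = 2 special case, which is Bradley-Terry. The last line shows one tie extension from that literature, Davidson's model for Bradley-Terry, with a tie parameter ν ≥ 0. It is included only to illustrate the kind of approach available, not as a recommendation from this thread.

```latex
\begin{aligned}
\Pr(y_1 \succ y_2 \succ \dots \succ y_K)
  &= \prod_{k=1}^{K} \frac{\beta_{y_k}}{\sum_{j=k}^{K} \beta_{y_j}}
  && \text{(Plackett-Luce)} \\
\Pr(a \succ b)
  &= \frac{\beta_a}{\beta_a + \beta_b}
  && \text{(Bradley-Terry, } K = 2\text{)} \\
\Pr(a \succ b)
  &= \frac{\beta_a}{\beta_a + \beta_b + \nu\sqrt{\beta_a \beta_b}},
  \quad
  \Pr(a = b)
  = \frac{\nu\sqrt{\beta_a \beta_b}}{\beta_a + \beta_b + \nu\sqrt{\beta_a \beta_b}}
  && \text{(Davidson ties)}
\end{aligned}
```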

One way to think about ties is to treat them as possibly coming out either way. So if you have [a = b, c], then this could be [a, b, c] or [b, a, c].
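
One reading of that idea is to marginalize over the strict orderings consistent with a tied ranking, so that [a = b, c] contributes P([a, b, c]) + P([b, a, c]). The sketch below assumes those orderings are enumerated outside Stan and passed in as data, padded to a common length. The names `log_pl_order`, `log_pl_tied`, `orders`, `n_orders`, and `P` are hypothetical, and this is only one possible treatment of ties.

```stan
functions {
  // Log Plackett-Luce probability of one strict ordering (best first).
  real log_pl_order(array[] int y, vector log_worth) {
    int K = size(y);
    vector[K] lw = log_worth[y];
    real lp = 0;
    for (k in 1:K) {
      lp += lw[k] - log_sum_exp(lw[k:K]);
    }
    return lp;
  }
  // Log probability of a tied ranking: the sum, on the log scale, over
  // the strict orderings consistent with it (one ordering per row).
  real log_pl_tied(array[,] int orders, vector log_worth) {
    int P = size(orders);
    vector[P] lp;
    for (p in 1:P) {
      lp[p] = log_pl_order(orders[p], log_worth);
    }
    return log_sum_exp(lp);
  }
}
data {
  int<lower=1> I;                               // # items
  int<lower=1> K;                               // # items ranked per rater
  int<lower=1> R;                               // # raters
  int<lower=1> P;                               // max # consistent orderings over raters
  array[R] int<lower=1, upper=P> n_orders;      // # consistent orderings for rater r
  array[R, P, K] int<lower=0, upper=I> orders;  // rows past n_orders[r] are 0 padding
}
parameters {
  vector[I] log_worth;
}
model {
  log_worth ~ normal(0, 1);  // assumed prior
  for (r in 1:R) {
    // only the first n_orders[r] rows are real orderings
    target += log_pl_tied(orders[r, 1:n_orders[r]], log_worth);
  }
}
```

The number of consistent orderings is the product of the factorials of the tie-group sizes, so this enumeration is only practical when tie groups are small.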

I don’t know how you could model an interaction between item-level (i.e., participant trait level) and rater-level covariates.
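
One possible way to set this up, offered as a sketch rather than a settled answer: Plackett-Luce probabilities depend only on differences in log-worth among the items a rater ranks, so a rater-level main effect or rater intercept adds the same constant to every item and cancels out of the likelihood. A rater covariate can still enter by changing the coefficient on an item covariate, which is exactly the interaction term. The data layout and the names `theta`, `b_participant_gender`, and `b_interaction` are illustrative assumptions, and ties are ignored here.

```stan
functions {
  // Log Plackett-Luce probability of one strict ordering (best first).
  real log_pl_order(array[] int y, vector log_worth) {
    int K = size(y);
    vector[K] lw = log_worth[y];
    real lp = 0;
    for (k in 1:K) {
      lp += lw[k] - log_sum_exp(lw[k:K]);
    }
    return lp;
  }
}
data {
  int<lower=1> N;                                  // # participants (items)
  int<lower=2> K;                                  // # participants ranked per rater
  int<lower=1> R;                                  // # raters
  array[R, K] int<lower=1, upper=N> y;             // rater r's ranking, best first
  vector<lower=0, upper=1>[N] participant_gender;  // item-level covariate
  vector<lower=0, upper=1>[R] rater_gender;        // rater-level covariate
}
parameters {
  vector[N] theta;             // participant-specific log-worth deviation
  real b_participant_gender;   // item covariate effect when rater_gender = 0
  real b_interaction;          // shift in that effect when rater_gender = 1
}
model {
  theta ~ normal(0, 1);        // assumed priors throughout
  b_participant_gender ~ normal(0, 1);
  b_interaction ~ normal(0, 1);
  for (r in 1:R) {
    // no rater main effect or intercept: it would cancel within a ranking
    real slope = b_participant_gender + b_interaction * rater_gender[r];
    vector[N] log_worth = theta + slope * participant_gender;
    target += log_pl_order(y[r], log_worth);
  }
}
```

With a free `theta` per participant, `b_participant_gender` is separated from `theta` only through the prior, so a sum-to-zero constraint within each gender group or dropping `theta` may be preferable depending on the design.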

You mean the rank among the other participants? I would think that you would instead try to create a generative model for each item.

P.S. Weird looking at my face in a post!
