Projpred: modify labels in plot.vsel

Hi everyone,

Back again with another query on projpred - this time about modifying the plots produced by plot(<cv_proportions>), plot(<ranking>), and plot(<vsel>).

I would like to have more control over the plots, for example by modifying axis labels or titles. At the moment, axis labels are taken directly from the reference model and should be altered for presentation/publication purposes. This paper has done so, but I can’t replicate the code provided.

The solution for bayesplot plots described here doesn’t seem to work for these plots.

Thanks for any suggestions!

Here’s an example of modifying the plot that’s created in the Examples in the plot.vsel() documentation.

library(projpred)
dat_gauss <- data.frame(y = df_gaussian$y, df_gaussian$x)

# The `stanreg` fit which will be used as the reference model (with small
# values for `chains` and `iter`, but only for technical reasons in this
# example; this is not recommended in general):
fit <- rstanarm::stan_glm(
  y ~ X1 + X2 + X3 + X4 + X5, family = gaussian(), data = dat_gauss,
  QR = TRUE, chains = 2, iter = 500, refresh = 0, seed = 9876
)

# Run varsel() (here without cross-validation, with L1 search, and with small
# values for `nterms_max` and `nclusters_pred`, but only for the sake of
# speed in this example; this is not recommended in general):
vs <- varsel(fit, method = "L1", nterms_max = 3, nclusters_pred = 10,
             seed = 5555)
plot(vs)

That makes this plot, which I haven’t customized yet:

You can then use some functions from the ggplot2 package to modify the overall title, axis titles, the tick mark labels, etc.:

library(ggplot2)
p <- plot(vs)
p + 
  labs(
    # change the overall plot title and subtitle and the x- and y-axis titles
    title = "My title",
    subtitle = "My subtitle",
    x = "My x-axis",
    y = "My y-axis"
  ) + 
  scale_x_continuous(
    # change x-axis tick mark labels 
    labels = c("A", "B", "C", "D")
  )

That modifies the plot to this:

I think the same thing should work for the other plots too. If you need to modify the y-axis tick mark labels, use scale_y_continuous instead. Note that the labels vector supplied to scale_x_continuous or scale_y_continuous needs to be the same length as the number of tick marks (“breaks”), which might be different in the other plots.
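One way to find out how many labels a plot needs is to ask ggplot2 which breaks it has computed. This is only a sketch on a toy plot: the data frame `toy` and its columns are made up for illustration and are not projpred output. `layer_scales()` is a ggplot2 function, and I'm assuming it behaves the same way on the projpred plots, since they accept ggplot2 layers.

```r
library(ggplot2)

# Toy data standing in for a projpred plot (hypothetical, for illustration only)
toy <- data.frame(size = 0:8, diff = c(-9, -6, -4, -3, -2, -1.5, -1, -0.5, 0))
p_toy <- ggplot(toy, aes(x = size, y = diff)) +
  geom_point()

# Breaks that ggplot2 chose automatically for the x-axis
auto_breaks <- layer_scales(p_toy)$x$get_breaks()
length(auto_breaks)  # a `labels` vector given without `breaks` should match this

# With explicit breaks, the number of labels needed is known in advance
p_explicit <- p_toy + scale_x_continuous(breaks = 0:8)
explicit_breaks <- layer_scales(p_explicit)$x$get_breaks()
length(explicit_breaks)
```

If the automatic count is not what you expect, setting `breaks` yourself is probably the safer route.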

Does that work for your use case?

Thanks for your quick response.

My original, unmodified plot is shown below. Note that we have 9 breaks: the intercept plus eight predictors.

Now we use the ggplot functions to customise the tick mark labels, also with nine labels.
plot + ggplot2::scale_x_continuous(labels = c("(Intercept)", "Trust", "Voice", "Transparency", "Interpersonal treatment", "Accountability", "Decision control", "Correctability", "Neutrality"))

We get this error: Error in ggplot2::scale_x_continuous(): ! breaks and labels have different lengths.

When using scale_x_discrete with the same labels, the plot looks like this:

Apologies if I am missing something very obvious here.

That’s strange, you seem to be doing the same thing I did but getting an error where I didn’t. What happens if you add a breaks argument too? Something like this:

scale_x_continuous(
  breaks = 0:8,
  labels = c(
    "(Intercept)",
    "Trust",
    "Voice",
    "Transparency",
    "Interpersonal treatment",
    "Accountability",
    "Decision control",
    "Correctability",
    "Neutrality"
  )
)

Thanks but that solution didn’t work for me - I received the same error Error in ggplot2::scale_x_continuous(): ! breaks and labels have different lengths.

Solution: plot + ggplot2::scale_x_continuous(breaks = c(0,1,2,3,4,5,6,7,8), labels = c("(Intercept)", "Trust", "Voice", "Transparency", "Interpersonal treatment", "Accountability", "Decision control", "Correctability", "Neutrality")) (i.e., define breaks - all predictors plus intercept - and then the labels).
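For anyone who wants to try this without fitting a model, here is a toy reproduction of both the error and the working call. The data frame `toy` is hypothetical and only mimics an x-axis running from 0 to 8. Whether the projpred plot fails for exactly the same reason is an assumption on my part.

```r
library(ggplot2)

# Toy plot with x values 0 to 8 (hypothetical stand-in for the projpred plot)
toy <- data.frame(size = 0:8, diff = c(-9, -6, -4, -3, -2, -1.5, -1, -0.5, 0))
p_toy <- ggplot(toy, aes(x = size, y = diff)) +
  geom_point()

nice_labels <- c(
  "(Intercept)", "Trust", "Voice", "Transparency", "Interpersonal treatment",
  "Accountability", "Decision control", "Correctability", "Neutrality"
)

# Labels only: ggplot2 picks its own breaks, so nine labels need not match them.
# The error is caught and its message stored instead of stopping the script.
labels_only <- tryCatch(
  ggplotGrob(p_toy + scale_x_continuous(labels = nice_labels)),
  error = function(e) conditionMessage(e)
)

# Breaks and labels together: exactly one label per break (0, 1, ..., 8)
breaks_and_labels <- ggplotGrob(
  p_toy + scale_x_continuous(
    breaks = seq_along(nice_labels) - 1,
    labels = nice_labels
  )
)
```

Deriving the breaks from the labels vector with `seq_along()` keeps the two lengths in sync if a predictor is added or removed later.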

This produces:

If anyone else needs it, the titles and axis titles can be changed in the usual way with labs, like so:
plot + labs(title = "", subtitle = "", y = "Difference vs. reference model"), producing:

Thanks for your help @jonah!

Just wanted to update for anyone trying to modify labels on other plot objects from the projpred package.

For example, when we create plot(<cv_proportions>), we get:

To modify the labels on the y-axis and remove the y-axis title, you can use:
plot + ggplot2::scale_y_discrete(limits = rev, labels = c("Neutrality", "Correctability", "Decision control", "Accountability", "Interpersonal treatment", "Transparency", "Voice", "Trust")) + labs(y = ""), which gives:


For some reason, when you supply the labels in the order of the preferred predictors, the order of the matrix is reversed. However, it makes sense to keep the predictor preferred for submodel size 1 at the top, as in the original, hence the need for limits = rev and for supplying the labels in reverse order.
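A possible alternative, which I have only tried on a toy plot and not on the projpred one: discrete scales in ggplot2 accept a named labels vector, so each label is matched to its axis value by name rather than by position. In the sketch below, the data frame `toy` and the values `x1`, `x2`, `x3` are hypothetical. For a real plot, the names would have to match whatever values sit on the y-axis, which I'm assuming are the predictor names from the reference model.

```r
library(ggplot2)

# Toy stand-in for a tile plot with predictors on the y-axis (hypothetical data)
toy <- expand.grid(size = 1:3, predictor = c("x1", "x2", "x3"))
toy$prop <- c(1, 0, 0, 0, 1, 0, 0, 0, 1)

# Named vector: names are the values in the plot, entries are the display labels
pretty <- c(x1 = "Trust", x2 = "Voice", x3 = "Transparency")

p_toy <- ggplot(toy, aes(x = size, y = predictor, fill = prop)) +
  geom_tile() +
  scale_y_discrete(limits = rev, labels = pretty) +  # reverse order, relabel by name
  labs(y = NULL)                                      # drop the y-axis title

g_toy <- ggplotGrob(p_toy)  # builds the plot so that scale problems surface here
```

With names attached, the order in which the labels are written should no longer matter, so there is no need to type them in reverse.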
