Finally A Way to Model Discrete Parameters in Stan

howard · December 12, 2017, 3:47pm

Well, this is very interesting. I replicated the toy example from your paper, “Sparsity information and regularization in the horseshoe and other shrinkage priors”, by Juho Piironen and Aki Vehtari. I believe this paper describes the state of the art in Stan for feature selection in a sparsity problem.

To summarize the toy problem challenge for readers:

We have 100 data points.
Each point consists of a vector of 400 measurements or features.
20 of these measurements are signal with a distribution of Normal(A,1) and the rest are noise with a distribution of Normal(0,1), for some value A.

The goal is to estimate a 400-element vector, “beta”, consisting of the means of the noise and signal measures. Hopefully, 20 of the estimated beta values will be A and 380 will be zero.
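For readers who want to reproduce the setup, here is a minimal sketch of the toy data as summarized above, using only the Python standard library. The function names, the seed, and the choice to put the 20 signal features first are my own assumptions for illustration; the MSE is assumed to be averaged over the 400 elements of beta. The column-mean estimator is just an unshrunk baseline, not the horseshoe or Rebar model.

```python
import random


def simulate(A, n_points=100, n_features=400, n_signal=20, seed=1):
    """Return (true_beta, data) for the toy problem summarized above."""
    rng = random.Random(seed)
    # Assumption: the signal features are simply the first n_signal columns.
    true_beta = [float(A)] * n_signal + [0.0] * (n_features - n_signal)
    # Each data point is a vector of n_features draws from Normal(beta_j, 1).
    data = [[rng.gauss(mu, 1.0) for mu in true_beta] for _ in range(n_points)]
    return true_beta, data


def column_means(data):
    """Per-feature sample mean across all data points."""
    n = len(data)
    return [sum(col) / n for col in zip(*data)]


def mse(estimate, truth):
    """Mean squared error between an estimated and a true beta vector."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


true_beta, data = simulate(A=6.0)
naive_beta = column_means(data)  # unshrunk baseline, not horseshoe or Rebar
print(len(data), len(data[0]), mse(naive_beta, true_beta))
```

With 100 points per feature, the baseline MSE should land near 1/100, which gives a rough reference point for the numbers discussed below.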

The “Sparsity” paper uses a horseshoe prior to estimate the betas and calculates the MSE of the posterior mean beta vector against the true beta vector for different values of A. From the “Sparsity” paper, MSEs for different A are (the “tau 0” line represents the horseshoe prior results):

I duplicated the above chart using a Rebar model. The code is in the files “sparse_01.R” and “sparse_02.stan” in the GitHub repository: https://github.com/howardnewyork/rebar

The results using Rebar for the first six A values:

[Figure: MSE for the first six A values using Rebar]

This shows a reduction in MSE by two orders of magnitude! Given the size of the improvement, I am concerned I missed something in the setup of the toy problem, but if not, these are very promising results.

Here, for example, are the estimated beta values using the horseshoe prior and Rebar:

Horseshoe Prior: Estimated Beta from “Sparsity” Paper: A=6

[Figure: estimated beta under the horseshoe prior, A=6]
(red = true beta, solid black = estimated beta)

Rebar Method: Estimated Beta using Rebar: A=6

[Figure: estimated beta using Rebar, A=6]

The “Adjusted Beta” is beta .* one_hot. Rebar does better on both counts: it identifies the noise features and shrinks them to zero, and it leaves the signal features unshrunk.
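To make the “Adjusted Beta” concrete, here is a small sketch of the element-wise product in Python. The numbers are made up for illustration and are not output from the model; `adjusted_beta` is a hypothetical helper name, not something from the repository.

```python
def adjusted_beta(beta, one_hot):
    """Element-wise product, the analogue of Stan's `beta .* one_hot`."""
    if len(beta) != len(one_hot):
        raise ValueError("beta and one_hot must have the same length")
    return [b * z for b, z in zip(beta, one_hot)]


# Hypothetical values: two signal features and two noise features.
beta = [5.9, 0.4, 6.2, -0.3]
# Hypothetical relaxed indicators, close to 1 for signal and 0 for noise.
one_hot = [0.99, 0.01, 0.98, 0.02]

print(adjusted_beta(beta, one_hot))
```

When the indicator is near 1 the beta passes through almost unchanged, and when it is near 0 the beta is pushed to almost exactly zero, which is the behaviour seen in the Rebar plot.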

As can be seen, it looks like Rebar is a very promising tool for feature selection in Stan.

(With regard to how to choose a value of tau in Rebar, I do not have any particular scientific insight. I followed the heuristic of setting it small enough to force the Rebar parameters to be close to zero or 1. Too small a value will slow down the MCMC and require higher adapt_delta and tree depth values, so you have to pick a value that gets the parameters close enough to zero or 1 without making the model too slow.)
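As a rough illustration of the tau heuristic, the sketch below assumes the relaxation behaves like a binary Concrete (relaxed Bernoulli) variable, sigmoid((logit(p) + logistic noise) / tau). That form is my assumption for illustration only; it is not taken from the Stan code in the repository, and the function names are hypothetical. It only shows how a smaller tau pushes draws toward zero or 1, not the sampling cost.

```python
import math
import random


def relaxed_bernoulli(p, tau, rng):
    """One draw from a binary Concrete relaxation (assumed form, see lead-in)."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logs finite
    x = (math.log(p / (1.0 - p)) + math.log(u / (1.0 - u))) / tau
    # Numerically stable sigmoid.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fraction_near_binary(p, tau, n=5000, eps=0.05, seed=0):
    """Share of draws within eps of 0 or 1."""
    rng = random.Random(seed)
    draws = (relaxed_bernoulli(p, tau, rng) for _ in range(n))
    return sum(1 for d in draws if d < eps or d > 1.0 - eps) / n


for tau in (1.0, 0.5, 0.1):
    print(tau, fraction_near_binary(p=0.5, tau=tau))
```

Under this assumed form, lowering tau raises the share of draws that sit near zero or 1, which matches the heuristic above; the cost in MCMC speed has to be judged from the actual model.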
