Latent indicator variable for each regression coefficient

drjphughesjr · April 1, 2024, 6:15pm

Dear Stan community,

I would like to include a latent indicator variable for each independent variable in my regression model. The indicator variables have independent Bernoulli priors, where the success probabilities are iid standard uniform random variables, say. This is a way of doing variable selection, wherein one examines the posterior distributions of the success probabilities. How can I implement such a model in Stan?

My current implementation, which of course doesn’t work, includes the following sampling statement.

y ~ bernoulli_logit(beta0 + (X .* rep_matrix(indicators, n)) * beta);

Here ‘X’ is the design matrix, ‘indicators’ is the vector of indicator variables, and ‘beta’ are the regression coefficients. It seems sensible to compute the Hadamard product (X .* rep_matrix(indicators, n)) to select, for the current iteration, some subset of the explanatory variables.

The rest of my model is implemented satisfactorily, but I’m stuck on this part. Many thanks in advance for your guidance.

Warmly,

John

mhollanders · April 1, 2024, 10:12pm

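Hey John, those would be latent discrete parameters, which Stan doesn’t support: its sampler needs continuous parameters, so discrete ones have to be marginalized out of the likelihood. For variable selection, I think you could try model stacking with loo and/or implementing R2D2 priors.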
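
If the number of predictors is small, one way to keep the indicators is to sum them out by hand. Below is a minimal sketch of that idea, not a tested implementation. It enumerates all 2^p on/off configurations, so it is only practical for roughly a dozen predictors or fewer. The names `pow2`, `Z`, `K`, `theta` and `lp`, and the normal(0, 2.5) priors on the coefficients, are my own assumptions and not part of John's model. The linear predictor `X * (z .* beta)` is the same quantity as the Hadamard-product form in the question, just with the mask applied to `beta` instead of to every row of `X`. The `array[n] int` syntax assumes a reasonably recent Stan version.

```stan
functions {
  // Hypothetical helper: 2^p as an integer.
  int pow2(int p) {
    int k = 1;
    for (j in 1:p) {
      k *= 2;
    }
    return k;
  }
}
data {
  int<lower=1> n;                      // observations
  int<lower=1> p;                      // predictors (keep small: 2^p terms)
  matrix[n, p] X;                      // design matrix
  array[n] int<lower=0, upper=1> y;    // binary outcome
}
transformed data {
  int K = pow2(p);                     // number of indicator configurations
  matrix[K, p] Z;                      // row k = k-th on/off pattern
  for (k in 1:K) {
    int code = k - 1;                  // binary digits of k - 1 give the pattern
    for (j in 1:p) {
      Z[k, j] = code % 2;
      code = code %/% 2;
    }
  }
}
parameters {
  real beta0;
  vector[p] beta;
  vector<lower=0, upper=1>[p] theta;   // inclusion probabilities, implicit uniform(0, 1) prior
}
model {
  vector[K] lp;
  beta0 ~ normal(0, 2.5);              // assumed prior, adjust to your problem
  beta ~ normal(0, 2.5);               // assumed prior, adjust to your problem
  for (k in 1:K) {
    vector[p] z = Z[k]';
    // log prior of this configuration plus log likelihood given it
    lp[k] = dot_product(z, log(theta))
            + dot_product(1 - z, log1m(theta))
            + bernoulli_logit_lpmf(y | beta0 + X * (z .* beta));
  }
  target += log_sum_exp(lp);           // marginalize over all configurations
}
```

Posterior inclusion probabilities for each predictor could then be recovered in generated quantities by recomputing the same per-configuration terms and normalizing them with `softmax`. For more than a handful of predictors the enumeration blows up, which is where continuous shrinkage priors such as R2D2 become the more practical route.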
