Adjustment for testing multiple associations

Thank you so much for this elaborate and helpful advice! Upon reading the suggested work, I also came across the Iterative Supervised Principal Component (ISPC) approach, which could also resolve another issue I am having with PCA (see forum topic here).

Below are one comment/clarification and a quick follow-up question to check whether I understood your suggestion about the hierarchical model correctly.

They are highly correlated. But I am still hesitant about feature-selection approaches. Say I have two strongly correlated features with similar associations with Y. Without knowing the causal link, both associations seem valid. For instance, I have two variables that are both associated with the target: GDP and a government effectiveness indicator. The two are also highly correlated with each other (high-income countries usually have more effective governments). A minimal feature set would probably retain only GDP, yet my focus is not on prediction but on presenting both plausible associations with the target (with income and/or with government effectiveness). I feel an ISPC approach might be suitable for this, although presenting results from a sparse model to show the most relevant predictors could be a good additional analysis.
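To illustrate the situation with a toy example, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the variable names `gdp` and `gov_eff`, the shared `latent` driver, and the noise levels are hypothetical and not taken from my actual data.

```python
import numpy as np

# Simulated data only: all names and noise levels are hypothetical.
rng = np.random.default_rng(0)
n = 2000

latent = rng.normal(size=n)               # shared driver of both features
gdp = latent + 0.3 * rng.normal(size=n)   # stand-in for an income measure
gov_eff = latent + 0.3 * rng.normal(size=n)  # stand-in for government effectiveness
y = latent + rng.normal(size=n)           # target depends on the shared driver


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


r_features = corr(gdp, gov_eff)  # correlation between the two features
r_gdp = corr(gdp, y)             # marginal association of gdp with y
r_gov = corr(gov_eff, y)         # marginal association of gov_eff with y

print(f"feature-feature: {r_features:.2f}, gdp-y: {r_gdp:.2f}, gov_eff-y: {r_gov:.2f}")
```

In this setup both features show a similar marginal association with `y`, so dropping either one hides an association that looks equally plausible from the data alone.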

If I understand correctly, that is something I thought of as well. So basically the idea would be to form, e.g., two groups: 1) variables for which I expect positive associations, and 2) variables for which I expect negative associations. I would then estimate a hierarchical model that partially pools the estimates of the individual variables within groups 1) and 2). Is that what you meant?
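To check my understanding of partial pooling within groups, here is a rough sketch. It is not a full hierarchical model: it assumes I already have per-variable estimates with standard errors and applies a simple normal-normal shrinkage with a method-of-moments between-variable variance. The function name `partial_pool` and all numbers are hypothetical.

```python
import numpy as np


def partial_pool(estimates, std_errors, groups):
    """Shrink each estimate toward its group's precision-weighted mean.

    Approximation only: normal-normal model per group, with the
    between-variable variance tau^2 estimated by method of moments.
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    groups = np.asarray(groups)
    pooled = np.empty_like(estimates)

    for g in np.unique(groups):
        idx = groups == g
        est = estimates[idx]
        se2 = std_errors[idx] ** 2
        # Between-variable variance, floored at zero (zero means complete pooling).
        tau2 = max(est.var(ddof=1) - se2.mean(), 0.0) if est.size > 1 else 0.0
        weights = 1.0 / (se2 + tau2)
        group_mean = np.sum(weights * est) / np.sum(weights)
        shrink = se2 / (se2 + tau2)  # 1 = fully pooled, 0 = no pooling
        pooled[idx] = shrink * group_mean + (1.0 - shrink) * est

    return pooled


# Hypothetical inputs: three variables per expected-sign group.
estimates = [0.40, 0.55, 0.10, -0.30, -0.45, -0.05]
std_errors = [0.10] * 6
groups = ["pos", "pos", "pos", "neg", "neg", "neg"]

pooled = partial_pool(estimates, std_errors, groups)
print(np.round(pooled, 3))
```

Each pooled value moves from its raw estimate toward the mean of its own group, and noisier estimates (larger standard errors) would be pulled more strongly. A fully Bayesian version would estimate the group means and variances jointly with the coefficients instead of in two steps.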

Thanks once again for your time and help!
