CV Varsel Error: Infinite or missing values in 'x'

l-gorman · June 10, 2023, 8:42am

Hi again!

So… apologies for yet another query, and thanks for all the help you’ve provided before. I am running a hierarchical model with 30,000 observations and using projpred for variable selection. More details on the model can be found in this thread.

I was wondering whether anyone would be able to interpret what might be going on with this error (apologies again for my lack of understanding).

*** EDIT *** I tried to upload my reference model (an .RDA file) but was unable to on Discourse. If anyone would like access, I am happy to find another means of sending it. I am running projpred version 2.6.0, brms version 2.19.0, and R version 4.1 ***

The desired updates require recompiling the model
Start sampling
-----
-----
Running the search and the performance evaluation for each of the K = 5 CV folds separately ...
  |                                                                      |   0%Error in eigen(sigma, symmetric = TRUE) :
  infinite or missing values in 'x'
Calls: cv_varsel ... repair_re -> repair_re.merMod -> t -> <Anonymous> -> eigen
In addition: Warning messages:
1: Quick-TRANSfer stage steps exceeded maximum (= 800000)
2: Quick-TRANSfer stage steps exceeded maximum (= 800000)

This happens after the reference models are refit on each of the 5 folds. Please let me know if you need any more information/context!

Thanks again,

Léo

fweber144 · June 12, 2023, 7:14am

Hi Léo,

Sorry to hear that you are having more issues with projpred. The issue you mentioned might be related to the large dataset, but I’m not sure and would have to try it out. Since you said you were unable to upload the reference model object as an RDA file, could you send it to me via a personal message?

l-gorman · June 12, 2023, 7:48am

Thanks again for getting in touch! I’ve sent you a personal message!

fweber144 · June 12, 2023, 12:50pm

Thanks for sending. I am able to load the reference model fit into my R session, but even just calling get_refmodel() crashes my R session due to insufficient RAM. Are you also calling get_refmodel() on the HPC cluster you mentioned in Projpred: Fixing Group Effects in Search Terms and Tips for Speed??

fweber144 · June 12, 2023, 12:51pm

If yes, could you perhaps create a more lightweight reference model fit? I noticed this one has 16 000 posterior draws.

l-gorman · June 12, 2023, 1:01pm

Yes, I am also running this on the HPC cluster. I will rerun and send through a more lightweight model. Apologies!

l-gorman · June 12, 2023, 3:20pm

Rerun done and I’ve sent that through!

fweber144 · June 15, 2023, 11:52am

For those stumbling across this: With the reprex from GitHub - l-gorman/projpred_issue_reprex: A reproducible example for an issue being encountered in the ProjPred Package (which features a smaller dataset, a smaller reference model, and 2-fold CV), we received the error

Error in if (any(edgevals <- 0 < bdiff & bdiff < boundary.tol)) { :
  missing value where TRUE/FALSE needed

It turned out that this is due to #323 (for which I will add a warning in projpred) and can be avoided by setting nterms_max = 16. In this reference model, there are 17 predictor terms when not counting the intercept and when counting the two group-level terms as one (their inclusion is forced jointly by combining them via +). The full model therefore has 17 predictor terms, and we want to cut off the search at 17 - 1 = 16 terms.
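As a rough illustration of the counting rule above, here is a minimal base-R sketch. The formula, the predictor names (x1 to x16) and the grouping factors (g1, g2) are hypothetical stand-ins, not the actual model from the reprex, and the commented-out cv_varsel() call assumes a reference model object named refm that is not defined here.

```r
# Hypothetical predictors: 16 population-level terms plus two group-level terms.
pop_terms <- paste0("x", 1:16)
grp_terms <- c("(1 | g1)", "(1 | g2)")
f <- reformulate(c(pop_terms, grp_terms), response = "y")

# Term labels exclude the intercept; group-level terms contain a "|".
term_labels <- attr(terms(f), "term.labels")
is_group <- grepl("|", term_labels, fixed = TRUE)

# Count all group-level terms as a single term (assumption: they are forced
# into the search together), then stop the search one term short of the
# full model.
n_terms_full <- sum(!is_group) + as.integer(any(is_group))
nterms_max <- n_terms_full - 1L

# Not run: `refm` is a hypothetical reference model object.
# cvvs <- projpred::cv_varsel(refm, cv_method = "kfold", K = 5,
#                             nterms_max = nterms_max)
```

With these hypothetical terms, n_terms_full is 17 and nterms_max is 16, matching the arithmetic described above.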

Furthermore, issue #346 is probably one of the reasons why the computations take so long in this case (in projpred:::search_forward(), it is already the creation of the list of candidate models—for a given submodel size—which takes very long).

fweber144 · June 15, 2023, 11:53am

I couldn’t observe Error in eigen(sigma, symmetric = TRUE) : infinite or missing values in 'x' with the reprex from GitHub - l-gorman/projpred_issue_reprex: A reproducible example for an issue being encountered in the ProjPred Package (when using nterms_max = 16), but feel free to post here if it still occurs.

l-gorman · June 15, 2023, 12:04pm

Thanks again for all of the help @fweber144!

Ah I see, that makes sense. I will keep an eye out to see if there are any fixes on this in the future :)

I also could not reproduce this error in the smaller example. I am rerunning the model with nterms_max on the larger dataset, and I will let you know if I encounter this again (and if so will try to make a reprex that captures it).

Thanks again!

l-gorman · June 16, 2023, 1:14pm

Hi @fweber144!

A bit of progress, but apologies to again be the bearer of more queries!

So, now all models get through the search, i.e.:

Running the search and the performance evaluation for each of the K = 5 CV folds separately ...
  |======================================================================| 100%
-----

Except, immediately after this, I get the following Error:

Error in simplify2array(lapply(res_cv, "[[", "summaries_sub"), higher = FALSE,  :
  unused argument (except = NULL)
Calls: cv_varsel -> cv_varsel.refmodel -> kfold_varsel

The error seems to be occurring here. It only occurs when running the reprex on the HPC. I have done a bit of digging: the HPC has R version 4.1.0 (I can’t upgrade to 4.2, unfortunately) and my laptop has R version 4.2.0.

For simplify2array(), the except argument was only added in R version 4.2.0. I am happy to create my own fork, remove the except argument, and see how that works. I was just wondering whether you know what the implications might be?
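For reference, a minimal sketch of how the except argument could be detected at run time rather than removed outright. The helper name simplify2array_compat is hypothetical and not part of projpred, and this sketch says nothing about whether dropping except = NULL changes projpred's results on older R versions.

```r
# Hypothetical helper: pass `except` only if this R version's
# simplify2array() has that argument (added in R 4.2.0).
simplify2array_compat <- function(x, higher = FALSE, except = NULL) {
  if ("except" %in% names(formals(simplify2array))) {
    simplify2array(x, higher = higher, except = except)
  } else {
    simplify2array(x, higher = higher)
  }
}

# Toy input standing in for per-fold results: two 2 x 2 matrices.
res_cv_toy <- list(matrix(1:4, nrow = 2), matrix(5:8, nrow = 2))
arr <- simplify2array_compat(res_cv_toy, higher = FALSE)
```

With higher = FALSE, each list element becomes one column of the result, so the toy input gives a 4 x 2 matrix on either R version.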

Thanks again,

Léo

fweber144 · June 19, 2023, 5:35am

Hi Léo,

Thank you for reporting this! I will respond in Incompatability with R version <4.2.0 · Issue #423 · stan-dev/projpred.

Best,
Frank

fweber144 · September 6, 2023, 7:46pm

projpred’s GitHub issue #346 has now been fixed by the addition of a helper function called force_search_terms().
