Which rows of my dataset are used by the model?

Perhaps a stupid question, but I could not find an answer anywhere.

My dataset has over 2000 rows.
In the model, the number of observations is a little over 1000.
I get the warning message "Rows with NAs have been excluded".
How do I know exactly which rows have been used by the model?
I tried complete.cases on my dataset, but the result does not match.

(I am asking because I have both linear and categorical predictors, and to plot the output of fitted() I would need to know which observations belong to which category.)

Cheers

Hi,

Are NAs coded as NA only, or do you use empty cells or something else to signify missing values?

foo <- data[complete.cases(data), ]

This should give you only the complete rows. However, note what ?complete.cases tells you:

A current limitation of this function is that it uses low level functions to determine lengths and missingness, ignoring the class. This will lead to spurious errors when some columns have classes with length or is.na methods, for example "POSIXlt", as described in PR#16648.
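
One likely reason the counts do not match: complete.cases(data) checks every column, whereas a model usually drops only rows with NAs in the variables that appear in its formula. Below is a minimal sketch of that idea, assuming a plain lm() fit; the data frame dat, the formula, and all column names are made up for illustration. Other modelling packages may store the data they used differently, so check the fitted object's documentation.

```r
# Hypothetical data: x2 is NOT in the model but has NAs of its own.
set.seed(1)
dat <- data.frame(
  y   = rnorm(10),
  x1  = c(1, NA, 3, 4, 5, 6, NA, 8, 9, 10),
  grp = factor(rep(c("a", "b"), 5)),
  x2  = c(NA, NA, NA, 4, 5, 6, 7, 8, 9, 10)
)

form <- y ~ x1 + grp

# Restrict the completeness check to the variables named in the formula.
vars <- all.vars(form)
used <- complete.cases(dat[, vars])

sum(complete.cases(dat))  # rows complete on every column
sum(used)                 # rows complete on the model variables only

# Cross-check against what lm() actually kept and dropped.
fit <- lm(form, data = dat)
used_rows <- as.integer(rownames(model.frame(fit)))  # row names of kept rows
dropped   <- as.integer(na.action(fit))              # indices of excluded rows

# Attach fitted values to the matching rows, so the category is known.
plot_dat <- dat[used, ]
plot_dat$fitted <- fitted(fit)
```

With the rows aligned like this, plot_dat$grp tells you which category each fitted value belongs to. Note that the row-name trick assumes the default row names 1, 2, 3, ... on the original data frame.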
