How can I filter by NAs in R programming with Dplyr

Question

I'm trying to filter by NAs (keeping only the rows with NA in the specified columns) using dplyr's filter function. The code below returns just the column labels with no data. Am I writing the code correctly? Also, if it's possible (or easier) to do this without dplyr, that would be interesting to know as well. Thanks.

filter(tata4, CompleteSolution == "NA", KeptInformed == "NA")

Steven Beaupré · Accepted Answer

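You could use complete.cases(), which returns FALSE for every row that has an NA in any of the columns passed to it. Negating it keeps those rows: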

dplyr::filter(df, !complete.cases(col1, col2))

Which gives:

#  col1 col2 col3
#1   NA    5    5
#2   NA    6    6
#3    5   NA    7
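As a rough sketch of why the code in the question returns no rows, and of an alternative using is.na(): the data frame below is a copy of the example data from this answer, and the object names (no_rows, either_na, both_na) are made up for illustration.

```r
library(dplyr)

# Same example data as in the Data section of this answer
df <- data.frame(
  col1 = c(1, 2, 3, 4, NA, NA, 5),
  col2 = c(1, 2, 3, 4, 5, 6, NA),
  col3 = c(1, 2, 3, 4, 5, 6, 7)
)

# "NA" is a character string, not a missing value. NA == "NA" evaluates
# to NA, and filter() drops rows whose condition is NA or FALSE.
no_rows <- filter(df, col1 == "NA")

# Rows where col1 OR col2 is missing
# (the same rows that !complete.cases(col1, col2) keeps)
either_na <- filter(df, is.na(col1) | is.na(col2))

# Comma-separated conditions are combined with AND, so this keeps
# only rows where BOTH columns are missing
both_na <- filter(df, is.na(col1), is.na(col2))
```

Note that the comma in the question's filter() call means AND, so even with is.na() it would only keep rows where both columns are NA. Use | if a missing value in either column is enough.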

Benchmark

large_df <- df[rep(seq_len(nrow(df)), 10e5), ]

The results so far:

library(dplyr)           # provides filter()
library(microbenchmark)

# Compare three ways of keeping rows with an NA in col1 or col2
mbm <- microbenchmark(
  akrun1 = large_df[rowSums(is.na(large_df[1:2]))!=0, ],
  akrun2 = large_df[Reduce(`|`, lapply(large_df[1:2], is.na)), ],
  steven = filter(large_df, !complete.cases(col1, col2)),
  times = 10)

#Unit: milliseconds
#   expr      min       lq      mean    median        uq       max neval cld
# akrun1 814.0226 924.0837 1248.9911 1208.7924 1434.2415 2057.1338    10   c
# akrun2 499.3404 671.9900  736.2418  687.9194  861.4477 1068.1232    10  b 
# steven 112.9394 113.0604  214.1688  198.4542  299.7585  355.1795    10 a

Data

df <- structure(list(col1 = c(1, 2, 3, 4, NA, NA, 5), col2 = c(1, 2, 
3, 4, 5, 6, NA), col3 = c(1, 2, 3, 4, 5, 6, 7)), .Names = c("col1", 
"col2", "col3"), row.names = c(NA, -7L), class = "data.frame")

Tags:

r

na

dplyr

Stephertless
