I want to create a new data frame in R that contains all the rows that do not have identical values in two given columns, including rows that are NA in one column and not in the other.
So I want to subset this
person,opinion_1, opinion_2
a,agree,agree
b,disagree,agree
c,disagree,
d,disagree,
df_example <- structure(
  list(
    person = c("a", "b", "c", "d"),
    opinion_1 = c("agree", "disagree", "disagree", "disagree"),
    opinion_2 = c("agree", "agree", NA, NA)
  ),
  row.names = c(NA, -4L),
  class = c("data.table", "data.frame")
)
into this
person,opinion_1, opinion_2
b,disagree,agree
c,disagree,
d,disagree,
I have tried using x <- df[which(df$opinion_1 != df$opinion_2),] but this only returns
person,opinion_1, opinion_2
b,disagree,agree
Is there a solution so that the subset will include mismatched NAs?
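We can do this with logical subsetting. A comparison involving NA returns NA rather than TRUE, and which() drops those rows, so the missing values need an explicit is.na() check. The code below keeps every row in which the two opinions differ and additionally keeps rows in which at least one opinion is missing: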
dfn <- df[(df$opinion_1 != df$opinion_2) | is.na(df$opinion_1) | is.na(df$opinion_2),]
# results in:
person opinion_1 opinion_2
2 b disagree agree
3 c disagree <NA>
4 d disagree <NA>
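Note that the condition above would also keep a row in which both opinions are NA. If that is not wanted, a stricter variant could look like the sketch below. The extra row "e" and the names one_missing, differ and dfn_strict are made up for illustration and are not part of the original data:

```r
# Hypothetical example data: same as the question, plus a row "e"
# in which both opinions are missing
df <- data.frame(
  person    = c("a", "b", "c", "d", "e"),
  opinion_1 = c("agree", "disagree", "disagree", "disagree", NA),
  opinion_2 = c("agree", "agree", NA, NA, NA),
  stringsAsFactors = FALSE
)

# TRUE where exactly one of the two opinions is NA
one_missing <- xor(is.na(df$opinion_1), is.na(df$opinion_2))

# TRUE where both opinions are present and differ;
# the is.na() guards turn NA comparisons into FALSE
differ <- !is.na(df$opinion_1) & !is.na(df$opinion_2) &
  df$opinion_1 != df$opinion_2

# Keep mismatches and one-sided NAs, drop rows where both are NA
dfn_strict <- df[one_missing | differ, ]
```

With this data, dfn_strict should contain persons b, c and d, while row e (both NA) is dropped along with row a.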
Data
df <- read.table(text = "person opinion_1 opinion_2
a agree agree
b disagree agree
c disagree NA
d disagree NA", header = TRUE, stringsAsFactors = FALSE)