
R: more efficient solution than this for-loop

I wrote a working for loop, but it's slow over thousands of rows, and I'm looking for a more efficient alternative. Thanks in advance!

The task:

  • If column a matches column b, column d becomes NA.
  • If column a does not match b, but b matches c, then column e becomes NA.

The for loop:

for (i in 1:nrow(data)) {
     if (data$a[i] == data$b[i]) {data$d[i] <- NA}
     if (!(data$a[i] == data$b[i]) & data$b[i] == data$c[i])
        {data$e[i] <- NA}
}

An example:

a    b    c    d    e
F    G    G    1    10
F    G    F    5    10
F    F    F    2    8

Would become:

a    b    c    d    e
F    G    G    1    NA
F    G    F    5    10
F    F    F    NA    8
Jautis asked Nov 20 '25 15:11



2 Answers

If you're concerned about speed and efficiency, I'd recommend data.table, though vectorizing a plain data.frame, as recommended by @parfait, would probably speed things up more than enough:

library(data.table)

DT <- fread("a    b    c    d    e
             F    G    G    1    10
             F    G    F    5    10
             F    F    F    2    8")
print(DT)
#    a b c d  e
# 1: F G G 1 10
# 2: F G F 5 10
# 3: F F F 2  8

DT[a == b, d := NA]
DT[a != b & b == c, e := NA]

print(DT)
#    a b c  d  e
# 1: F G G  1 NA
# 2: F G F  5 10
# 3: F F F NA  8
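
One thing to keep in mind with the approach above: `:=` updates the table by reference, so the original DT is changed in place. Here is a minimal sketch, assuming you want to keep the untouched data around; it builds the example with `data.table()` instead of `fread()` so the column types are explicit, and the name `DT_masked` is just a placeholder for illustration.

```r
library(data.table)

# Example data from the question, built explicitly (character and integer columns).
DT <- data.table(
  a = c("F", "F", "F"),
  b = c("G", "G", "F"),
  c = c("G", "F", "F"),
  d = c(1L, 5L, 2L),
  e = c(10L, 10L, 8L)
)

# `:=` modifies by reference, so take a real copy first if DT must stay intact.
# (DT_masked is a hypothetical name used only for this sketch.)
DT_masked <- copy(DT)

# Where a matches b, blank out d.
DT_masked[a == b, d := NA_integer_]

# Where a differs from b but b matches c, blank out e.
DT_masked[a != b & b == c, e := NA_integer_]

print(DT_masked)
```

If you don't need the original values, skip the `copy()` and update DT directly, which avoids duplicating the data.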
Matt Summersgill answered Nov 22 '25 05:11



Suppose df is your data frame. Compute each comparison once as a logical vector, then use those vectors to index the columns:

ab <- with(df, a==b)
bc <- with(df, b==c)

df$d[ab] <- NA
df$e[!ab & bc] <- NA

which would result in:

#   a b c  d  e
# 1 F G G  1 NA
# 2 F G F  5 10
# 3 F F F NA  8
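
If you want to convince yourself that the vectorized version does the same thing as the original loop, here is a rough sketch that runs both on the example data and compares the results. The function names `mask_loop` and `mask_vectorized` are made up for this illustration, and the data frame is built by hand with character and integer columns, which is an assumption about your real data.

```r
# Example data from the question, built explicitly so the column types are known.
df <- data.frame(
  a = c("F", "F", "F"),
  b = c("G", "G", "F"),
  c = c("G", "F", "F"),
  d = c(1L, 5L, 2L),
  e = c(10L, 10L, 8L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the row-by-row loop from the question.
mask_loop <- function(data) {
  for (i in seq_len(nrow(data))) {
    if (data$a[i] == data$b[i]) data$d[i] <- NA
    if (data$a[i] != data$b[i] && data$b[i] == data$c[i]) data$e[i] <- NA
  }
  data
}

# Hypothetical helper: the vectorized version, one comparison per column pair.
mask_vectorized <- function(data) {
  ab <- data$a == data$b
  bc <- data$b == data$c
  data$d[ab] <- NA
  data$e[!ab & bc] <- NA
  data
}

res_loop <- mask_loop(df)
res_vec  <- mask_vectorized(df)

# Compare the two results column by column and as a whole.
print(res_vec)
print(identical(res_loop, res_vec))
```

Note that the loop will stop with an error if a, b, or c contains NA, because `if` cannot branch on a missing value, so this comparison only makes sense on data without missing values in those columns.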
989 answered Nov 22 '25 05:11



