
How to create a logical vector that indicates whether the values in two columns are the same across categorical factors in R?

Tags:

r

data.table

I'm trying my best to articulate this, so here goes.

I have a table of gene information. However, I am going to be using a generic example for the sake of this question.

> library(data.table)
> test_dt <- data.table(category = c("b", "a", "a", "b"), start = c(1, 4, 1, 5), end = c(4, 6, 4, 8))
> test_dt
   category start end
1:        b     1   4
2:        a     4   6
3:        a     1   4
4:        b     5   8

I want to append a column to this table that indicates whether each (start, end) pair appears under both category values (in my case, as in this example, I am only dealing with two categories):

   category start end in_both
1:        b     1   4    TRUE
2:        a     4   6   FALSE
3:        a     1   4    TRUE
4:        b     5   8   FALSE

I know this seems painfully basic but there are holes in my R knowledge that periodically need to be filled and paved over. How would I go about doing this?

asked Dec 05 '25 18:12 by CelineDion


1 Answer

One option could be to group by start and end and check whether both categories occur within each group:

test_dt[, in_both := uniqueN(category) == 2, by = c("start", "end")]

   category start end in_both
1:        b     1   4    TRUE
2:        a     4   6   FALSE
3:        a     1   4    TRUE
4:        b     5   8   FALSE
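
The comparison with 2 relies on there being exactly two categories. As a minimal sketch, assuming the goal is "this (start, end) pair occurs in every category present in the table", the count can be computed from the data instead of being hardcoded. The name n_categories is introduced here for illustration only and is not part of the original answer:

```r
library(data.table)

test_dt <- data.table(
  category = c("b", "a", "a", "b"),
  start    = c(1, 4, 1, 5),
  end      = c(4, 6, 4, 8)
)

# Total number of distinct categories in the whole table (2 in this example).
n_categories <- uniqueN(test_dt$category)

# For each (start, end) group, count the distinct categories it contains and
# compare with the total. The single TRUE/FALSE per group is recycled to
# every row of that group, and := adds the column by reference.
test_dt[, in_both := uniqueN(category) == n_categories, by = .(start, end)]

print(test_dt)
```

On the example data this should give the same in_both column as above (TRUE, FALSE, TRUE, FALSE). Because uniqueN counts distinct values, a pair that is repeated within a single category is still marked FALSE.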

answered Dec 08 '25 11:12 by tmfmnk


