Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R: check if the column value falls between a range

I have two data frames.

Frame-1

Dataframe One

Frame-2

Dataframe 2

  • Columns From and to of data.frame.2 are the ranges in which column Step of the data.frame.1 are present.

I want to combine these two data frames using the Steps with their range (From and to), and then assign them new_id. Both data frames have different numbers of rows.

Example

# DataFrame1
df1 <- data.frame(Step = c(1:10), id = paste0("id_", c(1:10)))

# DataFrame2
df2 <- data.frame(from = seq(1,10,2), to = seq(3,12,2), new_id = paste0("newLabel_", c(1:5)))

What I tried

df1$label <- ifelse(sapply(df1$Step, function(p) 
                 any(df2$from > p & df2$to < p)), 
                 df2$new_id, NA)

Error

I get NAs instead of labels.

enter image description here

like image 836
sp29 Avatar asked Dec 05 '25 15:12

sp29


1 Answers

With dplyr version >= 1.1.0, we can use non-equi join with join_by

library(dplyr)
left_join(df1, df2, by = join_by(closest(Step >= from), closest(Step <= to))) %>%
   select(names(df1), new_id)

-output

 Step    id     new_id
1     1  id_1 newLabel_1
2     2  id_2 newLabel_1
3     3  id_3 newLabel_2
4     4  id_4 newLabel_2
5     5  id_5 newLabel_3
6     6  id_6 newLabel_3
7     7  id_7 newLabel_4
8     8  id_8 newLabel_4
9     9  id_9 newLabel_5
10   10 id_10 newLabel_5

Or using data.table

library(data.table)
setDT(df1)[df2, new_id := i.new_id, on = .(Step >= from, Step <= to)]

-output

> df1
    Step    id     new_id
 1:    1  id_1 newLabel_1
 2:    2  id_2 newLabel_1
 3:    3  id_3 newLabel_2
 4:    4  id_4 newLabel_2
 5:    5  id_5 newLabel_3
 6:    6  id_6 newLabel_3
 7:    7  id_7 newLabel_4
 8:    8  id_8 newLabel_4
 9:    9  id_9 newLabel_5
10:   10 id_10 newLabel_5
like image 80
akrun Avatar answered Dec 08 '25 05:12

akrun



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!