
Inner Join with conditions in R

I want to do an inner join in which the Value column of the result is the difference between the Value columns of the two tables.

df1 = data.frame(Term = c("T1","T2","T3"), Sec = c("s1","s2","s3"), Value =c(10,30,30))

df2 = data.frame(Term = c("T1","T2","T3"), Sec = c("s1","s3","s2"), Value =c(40,20,10))

 df1
 Term Sec Value
  T1  s1    10
  T2  s2    30
  T3  s3    30

  df2
  Term  Sec Value
  T1  s1    40
  T2  s3    20
  T3  s2    10

The result I want is

  Term  Sec Value
   T1   s1   30
   T2   s2   20
   T3   s3   10

Basically I am joining two tables and for the column value I am taking

Value=  abs(df1$Value - df2$Value)

I have struggled but could not find any way to do this conditional merge in base R. If it is not possible in base R, dplyr should be able to do it with inner_join(), but I am not very familiar with that package.

So any suggestion using base R and/or dplyr will be appreciated.

EDIT

I have included my original data as asked. My data is here

https://jsfiddle.net/6z6smk80/1/

DF1 is the first table and DF2 is the second. DF2 starts at row 168.

The logic is the same: I want to join these two tables, which have 160 rows each, by ID and take the difference of the Value column from both tables. The resulting dataset should have the same number of rows (160), with an extra column diff.

user3050590 asked Dec 04 '25 09:12


2 Answers

Using data.table's binary join, you can modify columns while joining. Setting nomatch = 0L ensures that an inner join is performed.

library(data.table)
setkey(setDT(df2), Sec)
setkey(setDT(df1), Sec)[df2, .(Term, Sec, Value = abs(Value - i.Value)), nomatch = 0L]
#    Term Sec Value
# 1:   T1  s1    30
# 2:   T2  s2    20
# 3:   T3  s3    10
David Arenburg answered Dec 06 '25 01:12


Here is a "base R" solution using the merge() function on the Sec column shared by your original df1 and df2 data frames:

df_merged <- merge(df1, df2, by="Sec")
df_merged$Value <- abs(df_merged$Value.x - df_merged$Value.y)
df_merged <- df_merged[, c("Sec", "Term.x", "Value")]
names(df_merged)[2] <- "Term"

> df_merged
  Sec Term Value
1  s1   T1    30
2  s2   T2    20
3  s3   T3    10
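
Since the question also asks about dplyr, the same idea can be sketched with inner_join(). This is only a sketch under a few assumptions: dplyr is installed, the join key is Sec (as in the merge() call above), and Term is taken from df1. The Term.x, Value.x and Value.y names are the default suffixes dplyr adds to columns that appear in both tables but are not join keys.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(Term = c("T1", "T2", "T3"), Sec = c("s1", "s2", "s3"), Value = c(10, 30, 30))
df2 <- data.frame(Term = c("T1", "T2", "T3"), Sec = c("s1", "s3", "s2"), Value = c(40, 20, 10))

result <- df1 %>%
  # Inner join on Sec; the shared non-key columns get .x (df1) and .y (df2) suffixes
  inner_join(df2, by = "Sec") %>%
  # Absolute difference of the two Value columns
  mutate(Value = abs(Value.x - Value.y)) %>%
  # Keep df1's Term and restore the column layout from the question
  select(Term = Term.x, Sec, Value)

result
```

With the sample data this should give the three rows requested in the question (30, 20, 10 for s1, s2, s3). For the larger data set described in the edit, the same pattern would presumably apply with by = "ID" and the difference stored in a new diff column instead of overwriting Value.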
Tim Biegeleisen answered Dec 05 '25 23:12