merge dataframes based on common columns but keeping all rows from x [duplicate]

Question

I need to merge two dataframes x and y which have about 50 columns in common and some unique columns, and I need to keep all the rows from x.

It works if I run:

 NewDataframe <- merge(x, y, by = c("ColumnA", "ColumnB", "ColumnC"), all.x = TRUE)

The issue is that there are more than 50 common columns, and I would rather avoid typing the names of all the common columns.

I have tried with:

 NewDataframe <- merge(x, y, all.x=TRUE)

But the following error appears:

 Error in merge.data.table(x, y, all.x = TRUE) :
 Elements listed in `by` must be valid column names in x and y

Is there any way of using by with the common columns without typing all of them, but keeping all the rows from x?

Thank you.

zelite · Accepted Answer

You want to merge based on all common columns. So first you need to find out which column names are common between the two dataframes.

common_col_names <- intersect(names(x), names(y))

Then you pass this character vector as the by argument of the merge function.

merge(x, y, by=common_col_names, all.x=TRUE)
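As a minimal sketch of those two steps on made-up data (the toy data frames and the ValueX/ValueY columns are purely illustrative, not taken from the question):

```r
# Toy data: ColumnA and ColumnB are shared; ValueX and ValueY are unique.
x <- data.frame(ColumnA = c(1, 2, 3),
                ColumnB = c("a", "b", "c"),
                ValueX  = c(10, 20, 30))
y <- data.frame(ColumnA = c(1, 2),
                ColumnB = c("a", "b"),
                ValueY  = c(100, 200))

# Column names that appear in both data frames
common_col_names <- intersect(names(x), names(y))

# all.x = TRUE keeps every row of x; rows without a match in y
# get NA in the columns that come only from y (here ValueY)
NewDataframe <- merge(x, y, by = common_col_names, all.x = TRUE)
```

Here the row of x with ColumnA equal to 3 has no partner in y, so it is still present in NewDataframe, with NA in ValueY.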

Edit: after reading @Andrew Gustar's answer, I double-checked the documentation for the merge function, and this is exactly the default value of the by argument:

## S3 method for class 'data.frame'
merge(x, y, by = intersect(names(x), names(y)), # <-- Look here
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x",".y"),
      incomparables = NULL, ...)
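For plain data frames, that default can be checked with a small sketch like the one below (the toy columns id, grp, vx and vy are made up for illustration). Note that the error in the question names merge.data.table, so x there appears to be a data.table, and that method has its own default for by, which may be why leaving by out failed. Passing by explicitly, as above, does not depend on either default.

```r
# Plain data.frames, so merge() dispatches to merge.data.frame
x <- data.frame(id = c(1, 2, 3), grp = c("a", "b", "c"), vx = c(10, 20, 30))
y <- data.frame(id = c(1, 2), grp = c("a", "b"), vy = c(100, 200))

# Explicit list of common columns versus relying on the default
explicit <- merge(x, y, by = intersect(names(x), names(y)), all.x = TRUE)
implicit <- merge(x, y, all.x = TRUE)

# Compare the two results
same <- identical(explicit, implicit)
```

If x and y are data.tables, the same comparison is not guaranteed to hold, so supplying by is the safer choice there.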

Tags:

merge

dataframe

r

dede
