
Paste together columns but ignore NAs

Tags: dataframe, r, paste

I want to paste together multiple columns but ignore NAs.

Here's a basic working example of what the df looks like and what I'd like it to look like. Does anyone have any tips?

df <- data.frame("col1" = c("A", NA, "B", "C"),
                 "col2" = c(NA, NA, NA, "E"),
                 "col3" = c(NA, "D", NA, NA),
                 "col4" = c(NA, NA, NA, NA))

df_fixed <- data.frame("col" = c("A", "D", "B", "C,E"))

user9974638 asked Sep 07 '25 02:09


2 Answers

We can use unite from tidyr, which takes na.rm as an argument and drops the NAs before pasting.

library(tidyr)
library(dplyr)
df %>% 
   unite(col, everything(), na.rm = TRUE, sep=",")

-output

  col
1   A
2   D
3   B
4 C,E
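
If the original columns should stay next to the combined one, unite also has a remove argument. A minimal sketch, assuming tidyr is installed and reusing the df from the question (the name out is only illustrative):

```r
library(tidyr)

df <- data.frame(col1 = c("A", NA, "B", "C"),
                 col2 = c(NA, NA, NA, "E"),
                 col3 = c(NA, "D", NA, NA),
                 col4 = c(NA, NA, NA, NA))

# remove = FALSE keeps col1..col4 alongside the new united column
out <- unite(df, col, everything(), na.rm = TRUE, sep = ",", remove = FALSE)
out$col
```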

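Or in base R: paste the columns with do.call, then strip the leading and trailing NA entries with trimws (the whitespace argument needs R >= 3.6.0)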

data.frame(col = trimws(do.call(paste, c(df, sep = ",")),
      whitespace = "(?:,?NA,?)+"))

-output

  col
1   A
2   D
3   B
4 C,E
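
One thing to keep in mind with the trimws version: the pattern only strips NA entries at the start and end of each pasted string, and it matches the literal text NA. A small sketch with made-up data (df2, pasted and res are hypothetical names) showing two cases where that matters:

```r
# Hypothetical data: row 1 has an NA between two real values,
# row 2 has a real value that ends in the letters "NA".
df2 <- data.frame(col1 = c("A", "DNA"),
                  col2 = c(NA, NA),
                  col3 = c("B", NA))

pasted <- do.call(paste, c(df2, sep = ","))
res <- trimws(pasted, whitespace = "(?:,?NA,?)+")
res
# row 1 stays "A,NA,B" (the interior NA is not at either end)
# row 2 becomes "D" (the trailing "NA" of "DNA" is stripped too)
```

If the data can look like this, dropping the NAs before pasting (as unite with na.rm = TRUE does) is the safer route.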

akrun answered Sep 08 '25 15:09


Using paste.

data.frame(col1=sapply(apply(df, 1, \(x) x[!is.na(x)]), paste, collapse=','))
#   col1
# 1    A
# 2    D
# 3    B
# 4  C,E

Or without apply:

data.frame(col1=unname(as.list(as.data.frame(t(df))) |>
             (\(x) sapply(x, \(x) paste(x[!is.na(x)], collapse=',')))()))
#   col1
# 1    A
# 2    D
# 3    B
# 4  C,E

To add the result as a new column, use transform.

transform(df, colX=sapply(apply(df, 1, \(x) x[!is.na(x)]), paste, collapse=','))
#   col1 col2 col3 col4 colX
# 1    A <NA> <NA>   NA    A
# 2 <NA> <NA>    D   NA    D
# 3    B <NA> <NA>   NA    B
# 4    C    E <NA>   NA  C,E

Note: you could also replace \(x) x[!is.na(x)] with na.omit, since its attributes vanish when the values are pasted; see e.g. @G. Grothendieck's answer.
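
A side note on the apply/sapply combination: apply returns a list here only because the rows have different numbers of non-NA values. If every row had the same number of them (two or more), apply would simplify to a matrix and the sapply step would paste element by element. A minimal sketch that filters and pastes inside a single apply call avoids this; paste_non_na is a hypothetical helper name, not a base R function:

```r
# Hypothetical helper: drop NAs and paste within each row in one step,
# so every row yields exactly one string.
paste_non_na <- function(data, sep = ",") {
  unname(apply(data, 1, function(x) paste(x[!is.na(x)], collapse = sep)))
}

df <- data.frame(col1 = c("A", NA, "B", "C"),
                 col2 = c(NA, NA, NA, "E"),
                 col3 = c(NA, "D", NA, NA),
                 col4 = c(NA, NA, NA, NA))

paste_non_na(df)
# [1] "A"   "D"   "B"   "C,E"

# Every row has two non-NA values: still one string per row
df_even <- data.frame(col1 = c("A", "B"), col2 = c("C", "D"))
paste_non_na(df_even)
# [1] "A,C" "B,D"
```

It drops into transform the same way, e.g. transform(df, colX = paste_non_na(df)). A row that is entirely NA comes back as an empty string.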

jay.sf answered Sep 08 '25 14:09