Replace NA values in R dataframe across multiple columns using truncated names of other columns [duplicate]

Question

I have the following data frame (example):

myfile <- data.frame(C1=c(1,3,4,5),
                     C2=c(5,4,6,7),
                     C3=c(0,1,3,2),
                     C1_A=c(NA,NA,1,2),
                     C2_A=c(NA,9,8,7),
                     C3_A=c(NA,NA,NA,1))

I would like to replace all NA values in the last three "_A" columns with the value from the same row of the corresponding column C1 to C3. For example, C1_A should become 1, 3, 1, 2.

I tried the following line

myfile <- myfile %>% mutate(across(c(C1_A:C3_A), ~ if_else(is.na(.)==TRUE, eval(parse(text=str_replace(., "_A", ""))), .)))

but it does not work: it returns the bottom-row value of the _A columns. I also tried it with dplyr's rowwise option, still without success.

My real dataset has several columns like those in the example, so it doesn't make sense to mutate each one individually. What is the best way to resolve this?

tmfmnk · Accepted Answer

An option with tidyverse:

library(dplyr)
library(stringr)
myfile %>%
 mutate(across(ends_with("_A"), ~ if_else(is.na(.), get(str_remove(cur_column(), "_A")), .)))

  C1 C2 C3 C1_A C2_A C3_A
1  1  5  0    1    5    0
2  3  4  1    3    9    1
3  4  6  3    1    8    3
4  5  7  2    2    7    1
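
The key point in the answer above is that cur_column() gives the name of the column currently being processed, whereas `.` inside across() holds the column's values. As a hedged variation (not part of the original answer), the same idea can be sketched with coalesce() and the .data pronoun instead of if_else() and get(). The result name `filled` and the anchored pattern "_A$" are choices made here for illustration, and the sketch assumes every "_A" column has a same-typed partner column without the suffix.

```r
library(dplyr)
library(stringr)

# Example data from the question
myfile <- data.frame(C1 = c(1, 3, 4, 5),
                     C2 = c(5, 4, 6, 7),
                     C3 = c(0, 1, 3, 2),
                     C1_A = c(NA, NA, 1, 2),
                     C2_A = c(NA, 9, 8, 7),
                     C3_A = c(NA, NA, NA, 1))

# For each "_A" column, keep its own value where present and fall back
# to the partner column (same name minus the trailing "_A") where it is NA.
filled <- myfile %>%
  mutate(across(ends_with("_A"),
                ~ coalesce(.x, .data[[str_remove(cur_column(), "_A$")]])))

filled
```

Anchoring the pattern with `$` only strips a trailing "_A", which matters if a column name happens to contain "_A" elsewhere.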

margusl · Answer

If there's a set of complete columns followed by a matching set of incomplete columns in the same order, we could naively locate the NA indices (1), get the matching source / patch value indices by subtracting the number of columns in a set (3 here) from the column index (2), and update the NA locations (3):

myfile <- data.frame(C1=c(1,3,4,5),
                     C2=c(5,4,6,7),
                     C3=c(0,1,3,2),
                     C1_A=c(NA,NA,1,2),
                     C2_A=c(NA,9,8,7),
                     C3_A=c(NA,NA,NA,1))
# 1 - get NA locations
( na_idx <- src_idx <- which(is.na(myfile), arr.ind = TRUE) )
#>      row col
#> [1,]   1   4
#> [2,]   2   4
#> [3,]   1   5
#> [4,]   1   6
#> [5,]   2   6
#> [6,]   3   6

# 2 - update index col
src_idx[,2] <- src_idx[,2] - 3
src_idx
#>      row col
#> [1,]   1   1
#> [2,]   2   1
#> [3,]   1   2
#> [4,]   1   3
#> [5,]   2   3
#> [6,]   3   3

# 3 - update values
myfile[na_idx] <- myfile[src_idx]
myfile
#>   C1 C2 C3 C1_A C2_A C3_A
#> 1  1  5  0    1    5    0
#> 2  3  4  1    3    9    1
#> 3  4  6  3    1    8    3
#> 4  5  7  2    2    7    1

Created on 2025-10-08 with reprex v2.1.1
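
The index-shifting approach above depends on the two sets of columns sitting next to each other in the same order. If that cannot be guaranteed, a rough base R sketch that pairs columns by name instead might look like the following. It is not from the original answer; `patched`, `target_cols`, and `source_col` are names made up for this illustration, and it assumes every column ending in "_A" has a partner column named without the suffix.

```r
# Example data from the question
myfile <- data.frame(C1 = c(1, 3, 4, 5),
                     C2 = c(5, 4, 6, 7),
                     C3 = c(0, 1, 3, 2),
                     C1_A = c(NA, NA, 1, 2),
                     C2_A = c(NA, 9, 8, 7),
                     C3_A = c(NA, NA, NA, 1))

patched <- myfile

# Columns to patch: every name ending in "_A"
target_cols <- grep("_A$", names(patched), value = TRUE)

for (target in target_cols) {
  # Partner column: same name with the trailing "_A" removed
  source_col <- sub("_A$", "", target)
  stopifnot(source_col %in% names(patched))

  # Copy the partner's value into each NA position of the target column
  missing <- is.na(patched[[target]])
  patched[[target]][missing] <- patched[[source_col]][missing]
}

patched
```

Matching by name keeps working if columns are reordered or if unrelated columns sit between the two sets, at the cost of an explicit loop.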

Tags:

dataframe

r

na

dplyr

JohnPat
