at the beginning I'd like to note that I found multiple solutions on SO but none of them met my expectations.
I have to DF's:
1.
E F G H
chr1_100203723_100203724 NA NA NA
chr1_100212951_100212952 rs760764323 A,G, 0.000008,0.999992,
chr1_10032235_10032236 NA NA NA
chr1_100327060_100327061 NA NA NA
chr1_100346889_100346890 NA NA NA
chr1_100347237_100347238 rs749372877 C,G,T, 0.000008,0.000008,0.999983,
chr1_100357190_100357191 NA NA NA
chr1_100358057_100358058 NA NA NA
chr2_182852606_182852607 NA NA NA
chr2_202492077_202492078 NA NA NA
chr2_203760838_203760839 NA NA NA
chr2_215976351_215976352 NA NA NA
chr2_220354644_220354645 NA NA NA
chr2_234749403_234749404 NA NA NA
chr2_11802110_11802111 NA NA NA
chr2_31167747_31167748 NA NA NA
2.
E F G H
chr1_100203723_100203724 NA NA NA
chr1_100212951_100212952 NA NA NA
chr1_10032235_10032236 NA NA NA
chr1_100327060_100327061 NA NA NA
chr1_100346889_100346890 NA NA NA
chr1_100347237_100347238 NA NA NA
chr1_100357190_100357191 NA NA NA
chr1_100358057_100358058 NA NA NA
chr2_182852606_182852607 rs773426830 C,T, 0.999967,0.000033,
chr2_202492077_202492078 rs750583431 C,G, 0.000013,0.999987,
chr2_203760838_203760839 NA NA NA
chr2_215976351_215976352 rs113648834 C,T, 0.999934,0.000066,
chr2_220354644_220354645 NA NA NA
chr2_234749403_234749404 NA NA NA
chr2_11802110_11802111 rs371327070 A,G, 0.000044,0.999956,
chr2_31167747_31167748 rs201375957 A,C,T, 0.000008,0.999887,0.000105,
Desired output:
E F G H
chr1_100203723_100203724 NA NA NA
chr1_100212951_100212952 rs760764323 A,G, 0.000008,0.999992,
chr1_10032235_10032236 NA NA NA
chr1_100327060_100327061 NA NA NA
chr1_100346889_100346890 NA NA NA
chr1_100347237_100347238 rs749372877 C,G,T, 0.000008,0.000008,0.999983,
chr1_100357190_100357191 NA NA NA
chr1_100358057_100358058 NA NA NA
chr2_182852606_182852607 rs773426830 C,T, 0.999967,0.000033,
chr2_202492077_202492078 rs750583431 C,G, 0.000013,0.999987,
chr2_203760838_203760839 NA NA NA
chr2_215976351_215976352 rs113648834 C,T, 0.999934,0.000066,
chr2_220354644_220354645 NA NA NA
chr2_234749403_234749404 NA NA NA
chr2_11802110_11802111 rs371327070 A,G, 0.000044,0.999956,
chr2_31167747_31167748 rs201375957 A,C,T, 0.000008,0.999887,0.000105,
As you see DF1 is updated by DF2 columns F, G, H, where column E is my unique index. I tried to do merge() but this function didin't update my rows, it jus added columns of DF2 to DF1. I also tried updating with data.table and tidyverse, and my rows have been updated but other ones went to NAs... Finally I decided to do simple lapply() with nested ifelse(), however I don't know how to update all three columns simultaneously, what is more this is terrible slow for my over 50000 rows of data in each DF...
What I did so far:
DF1$F <- sapply(1:nrow(DF1), function(i) ifelse(DF1[i,1]==DF2[i,1] & is.na(DF1[i,1]), DF2[i,1], DF[i,1]))
You cand do this in base R:
as.data.frame(Map(function(x,y) ifelse(is.na(x),y,x),DF1,DF2))
with library purrr you can have a prettier more compact form (see Soto's answer for an even more compact one with dplyr):
library(purrr)
map2_df(DF1,DF2,~ifelse(is.na(.x),.y,.x))
In both cases (technically a data.frame in the first case and a tibble in the second case):
output
E F G H
1 chr1_100203723_100203724 <NA> <NA> <NA>
2 chr1_100212951_100212952 rs760764323 A,G, 0.000008,0.999992,
3 chr1_10032235_10032236 <NA> <NA> <NA>
4 chr1_100327060_100327061 <NA> <NA> <NA>
5 chr1_100346889_100346890 <NA> <NA> <NA>
6 chr1_100347237_100347238 rs749372877 C,G,T, 0.000008,0.000008,0.999983,
7 chr1_100357190_100357191 <NA> <NA> <NA>
8 chr1_100358057_100358058 <NA> <NA> <NA>
9 chr2_182852606_182852607 rs773426830 C,T, 0.999967,0.000033,
10 chr2_202492077_202492078 rs750583431 C,G, 0.000013,0.999987,
11 chr2_203760838_203760839 <NA> <NA> <NA>
12 chr2_215976351_215976352 rs113648834 C,T, 0.999934,0.000066,
13 chr2_220354644_220354645 <NA> <NA> <NA>
14 chr2_234749403_234749404 <NA> <NA> <NA>
15 chr2_11802110_11802111 rs371327070 A,G, 0.000044,0.999956,
16 chr2_31167747_31167748 rs201375957 A,C,T, 0.000008,0.999887,0.000105,
data
DF1 <- read.table(text="E F G H
chr1_100203723_100203724 NA NA NA
chr1_100212951_100212952 rs760764323 A,G, 0.000008,0.999992,
chr1_10032235_10032236 NA NA NA
chr1_100327060_100327061 NA NA NA
chr1_100346889_100346890 NA NA NA
chr1_100347237_100347238 rs749372877 C,G,T, 0.000008,0.000008,0.999983,
chr1_100357190_100357191 NA NA NA
chr1_100358057_100358058 NA NA NA
chr2_182852606_182852607 NA NA NA
chr2_202492077_202492078 NA NA NA
chr2_203760838_203760839 NA NA NA
chr2_215976351_215976352 NA NA NA
chr2_220354644_220354645 NA NA NA
chr2_234749403_234749404 NA NA NA
chr2_11802110_11802111 NA NA NA
chr2_31167747_31167748 NA NA NA",header=T,stringsAsFactors=F)
DF2 <- read.table(text="E F G H
chr1_100203723_100203724 NA NA NA
chr1_100212951_100212952 NA NA NA
chr1_10032235_10032236 NA NA NA
chr1_100327060_100327061 NA NA NA
chr1_100346889_100346890 NA NA NA
chr1_100347237_100347238 NA NA NA
chr1_100357190_100357191 NA NA NA
chr1_100358057_100358058 NA NA NA
chr2_182852606_182852607 rs773426830 C,T, 0.999967,0.000033,
chr2_202492077_202492078 rs750583431 C,G, 0.000013,0.999987,
chr2_203760838_203760839 NA NA NA
chr2_215976351_215976352 rs113648834 C,T, 0.999934,0.000066,
chr2_220354644_220354645 NA NA NA
chr2_234749403_234749404 NA NA NA
chr2_11802110_11802111 rs371327070 A,G, 0.000044,0.999956,
chr2_31167747_31167748 rs201375957 A,C,T, 0.000008,0.999887,0.000105,",header=T,stringsAsFactors=F)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With