I have multiple data frames with varying numbers of columns and many thousands of rows. Every data frame has Year and ISO columns. I want to merge all of them into a final data frame that has a single Year column, a single ISO column, and the value column(s) from each of the original data frames. The final output has to go back out in .xlsx format, so I'd like to minimize the number of rows that are populated mostly with NA.
Here is a reproducible example:
library(ISOcodes)
df1 = data.frame(Year = sample(2000:2020, 10),
                 ISO = sample(ISO_3166_1$Alpha_2, 10),
                 value1 = sample(1:100, 10))
df2 = data.frame(Year = sample(2000:2020, 10),
                 ISO = sample(ISO_3166_1$Alpha_2, 10),
                 value2 = sample(1:100, 10))
df3 = data.frame(Year = sample(2000:2020, 10),
                 ISO = sample(ISO_3166_1$Alpha_2, 10),
                 value3 = sample(1:100, 10))
df4 = data.frame(Year = sample(2000:2020, 10),
                 ISO = sample(ISO_3166_1$Alpha_2, 10),
                 value4 = sample(1:100, 10))
df5 = data.frame(Year = sample(2000:2020, 10),
                 ISO = sample(ISO_3166_1$Alpha_2, 10),
                 value5 = sample(1:100, 10))
full_df = merge(df1, df2, by = c('Year', 'ISO'), all = TRUE)
full_df = merge(full_df, df3, by = c('Year', 'ISO'), all = TRUE)
full_df = merge(full_df, df4, by = c('Year', 'ISO'), all = TRUE)
full_df = merge(full_df, df5, by = c('Year', 'ISO'), all = TRUE)
I have to specify all = TRUE so that I don't lose data. This solution feels very clunky, and I'm sure there must be a more elegant way to do it (possibly with data.table?).
Thanks!
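On the data.table part of the question, here is a minimal sketch, assuming the data.table package is installed. The small df1 to df3 below are made-up stand-ins with fixed values, not the sampled data from the question, so that the result is easy to check by eye. The idea is simply to convert each data frame to a data.table and fold a full outer join over the list with base R's Reduce.

```r
library(data.table)

# Hypothetical toy inputs standing in for the question's df1..df5
df1 <- data.frame(Year = c(2000, 2001), ISO = c("KZ", "NF"), value1 = c(10, 20))
df2 <- data.frame(Year = c(2000, 2002), ISO = c("KZ", "SD"), value2 = c(30, 40))
df3 <- data.frame(Year = c(2001, 2003), ISO = c("NF", "BR"), value3 = c(50, 60))

# Convert every data frame in the list to a data.table
dts <- lapply(list(df1, df2, df3), as.data.table)

# Fold a full outer join over the list; all = TRUE keeps unmatched rows,
# filling the missing value columns with NA
full_dt <- Reduce(
  function(x, y) merge(x, y, by = c("Year", "ISO"), all = TRUE),
  dts
)

print(full_dt)
```

The list can hold any number of tables, so adding a sixth data frame only means adding it to the list rather than writing another merge line.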
# Put the data frames in a list, then fold merge() over it pairwise
l <- list(df1, df2, df3, df4, df5)
purrr::reduce(.x = l, merge, by = c('Year', 'ISO'), all = TRUE)
   Year ISO value1 value2 value3 value4 value5
1  2000  KZ     NA     75     NA     NA     NA
2  2000  TF     NA     NA     NA     NA     34
3  2001  AD     NA     NA     NA     NA     31
4  2001  ML     NA     NA     87     NA     NA
5  2001  NF      8     NA     NA     NA     NA
6  2002  CC     NA     NA     NA     12     NA
7  2002  NF     NA     63     NA     NA     NA
8  2002  SD     42     NA     NA     NA     NA
9  2002  SY     NA     NA     NA     NA     45
10 2003  AW     NA     NA     41     NA     NA
11 2003  BR     NA     NA     NA     42     NA
12 2003  TT     96     NA     NA     NA     NA
13 2004  KE     NA    100     NA     NA     NA
14 2005  CG     NA     NA     NA     NA     67
15 2006  BV     NA     NA     67     NA     NA
16 2006  BW     NA      9     NA     NA     NA
17 2006  GU     18     NA     NA     NA     NA
18 2007  CG     NA     NA     NA     81     NA
19 2007  IM     NA     18     NA     NA     NA
20 2008  BH     NA     NA    100     NA     NA
21 2008  MD     28     NA     NA     NA     NA
22 2008  PY     NA     NA     NA     NA     96
23 2008  TR     NA     NA     NA     87     NA
24 2009  SM     NA     NA     NA     53     NA
25 2010  GF     NA     NA     56     NA     NA
26 2010  LU     NA     NA     NA     43     NA
27 2010  PM     NA     80     NA     NA     NA
28 2011  MF     NA     NA     38     NA     NA
29 2012  CF     NA     56     NA     NA     NA
30 2012  JO     NA     NA     NA     16     NA
31 2012  UG     NA     NA     63     NA     NA
32 2013  BY     68     NA     NA     NA     NA
33 2013  MT     NA     13     NA     NA     NA
34 2013  NO     NA     NA     NA     74     NA
35 2014  TR     NA     NA     NA     NA     98
36 2015  GM     NA     NA     NA     NA     27
37 2015  SC     71     NA     NA     NA     NA
38 2016  MM     NA     NA     65     NA     NA
39 2017  AX     50     NA     NA     NA     NA
40 2017  BA     NA     NA      8     NA     NA
41 2017  MF     NA     NA     NA     NA     20
42 2017  SH     NA     NA     NA      6     NA
43 2018  GU     NA     46     NA     NA     NA
44 2018  LR     NA     NA     NA     NA     56
45 2018  MK     33     NA     NA     NA     NA
46 2018  SN     NA     NA     82     NA     NA
47 2019  NA     26     NA     NA     NA     NA
48 2019  TW     NA     NA     NA     51     NA
49 2020  AQ     NA     28     NA     NA     NA
50 2020  FO     NA     NA     NA     NA     81
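When Year/ISO pairs rarely overlap across the inputs, a wide full join is mostly NA by construction. If the .xlsx consumer can accept it, a long layout (one row per Year, ISO, and variable) avoids storing those NA cells at all. The following is only a sketch of that alternative, assuming the data.table package is installed; the toy data frames and the names long_dt and wide_dt are hypothetical, chosen for illustration.

```r
library(data.table)

# Hypothetical toy inputs (not the sampled data from the question)
df1 <- data.frame(Year = c(2000, 2001), ISO = c("KZ", "NF"), value1 = c(10, 20))
df2 <- data.frame(Year = c(2000, 2002), ISO = c("KZ", "SD"), value2 = c(30, 40))

# Melt each table to long form: one row per Year/ISO/variable,
# dropping any NA values the inputs may already contain
long_list <- lapply(list(df1, df2), function(d) {
  melt(as.data.table(d),
       id.vars = c("Year", "ISO"),
       variable.name = "variable",
       value.name = "value",
       variable.factor = FALSE,
       na.rm = TRUE)
})

# Stack the long tables; no NA padding is needed in this layout
long_dt <- rbindlist(long_list)

# If the wide layout is still required, cast back at the end
wide_dt <- dcast(long_dt, Year + ISO ~ variable, value.var = "value")

print(long_dt)
print(wide_dt)
```

Either table could then be handed to an .xlsx writer, for example writexl::write_xlsx() if that package is available. Note that the wide result has the same NA cells as the merged output, so the long form is the only one of the two that actually reduces them.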