Here is a puzzle.
Assume you have a data frame and a list. The list has as many elements as the df has rows:
dd <- data.frame(ID=1:3, Name=LETTERS[1:3])
dl <- map(4:6, rnorm) %>% set_names(letters[1:3])
Is there a simple way (preferably with dplyr / tidyverse) to make a long format, such that the elements of the list are joined with the corresponding rows of the data frame? Here is what I have in mind illustrated with not-so-elegant way:
rows <- map(1:length(dl), ~ rep(., length(dl[[.]]))) %>% unlist()
dd <- dd[rows,]
dd$value <- unlist(dl)
As you can see, for each vector in dl, we replicated the corresponding row as many times as necessary to accommodate each value.
In base R, you can get your result with stack followed by merge:
res <- merge(stack(dl), dd, by.x="ind", by.y="Name")
head(res)
# ind values ID
#1 A -0.79616693 1
#2 A 0.37720953 1
#3 A 1.30273712 1
#4 A 0.19483859 1
#5 B 0.18770716 2
#6 B -0.02226917 2
NB: I supposed the names for dl were supposed to be in uppercases but if they are indeed lowercase, the following line needs to be pass instead:
res <- merge(stack(setNames(dl, toupper(names(dl)))), dd, by.x="ind", by.y="Name")
Since a dplyr solution has already been provided, another option is to subset dl for each Name value in dd using data.table grouping
library(data.table)
setDT(dd)
dd[, .(values = dl[[tolower(Name)]]), by = .(ID, Name)]
# ID Name values
# 1: 1 A -1.09633600
# 2: 1 A -1.26238190
# 3: 1 A 1.15220845
# 4: 1 A -1.45741071
# 5: 2 B -0.49318131
# 6: 2 B 0.59912670
# 7: 2 B -0.73117632
# 8: 2 B -1.09646143
# 9: 2 B -0.79409753
# 10: 3 C -0.08205888
# 11: 3 C 0.21503398
# 12: 3 C -1.17541571
# 13: 3 C -0.10020616
# 14: 3 C -1.01152362
# 15: 3 C -1.03693337
We can create a list column and unnest
library(tidyverse)
dd %>%
mutate(value = dl) %>%
unnest
# ID Name value
#1 1 A 1.57984385
#2 1 A 0.66831102
#3 1 A -0.45472145
#4 1 A 2.33807619
#5 2 B 1.56716709
#6 2 B 0.74982763
#7 2 B 0.07025534
#8 2 B 1.31174561
#9 2 B 0.57901536
#10 3 C -1.36629653
#11 3 C -0.66437155
#12 3 C 2.12506187
#13 3 C 1.20220402
#14 3 C 0.10687018
#15 3 C 0.15973401
Note that if the criteria is based on the compactness of code, if we remove the %>%
unnest(mutate(dd, value = dl))
Or another option is uncount and mutate
dd %>%
uncount(lengths(dl)) %>%
mutate(value = flatten_dbl(unname(dl)))
If it needs a join based on the names of the 'dl'
enframe(dl, name = 'Name') %>%
mutate(Name = toupper(Name)) %>%
left_join(dd) %>%
unnest
In base R, we can replicate the rows of 'dd' with lengths of 'dl' and transform to create the 'value' as unlisted 'dl'
transform(dd[rep(seq_len(nrow(dd)), lengths(dl)),], value = unlist(dl))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With