I have a data frame like this:
trans_id product_id
1 1 456
2 4 223
3 1 778
4 1 774
5 5 999
6 4 123
I need to convert it so that all trans_id are listed as a single row like this:
trans_id V1 V2 V3
1 456 778 774
4 223 123
5 999
You should add a secondary ID column. That's easy with getanID from my "splitstackshape" package. Since "splitstackshape" also loads "data.table", it's easy to then convert to a wide format using dcast.data.table:
library(splitstackshape)
dcast.data.table(
getanID(mydf, "trans_id"),
trans_id ~ .id, value.var = "product_id")
# trans_id 1 2 3
# 1: 1 456 778 774
# 2: 4 223 123 NA
# 3: 5 999 NA NA
The equivalent "dplyr"+"tidyr" approach would be something like:
library(dplyr)
library(tidyr)
mydf %>%
group_by(trans_id) %>%
mutate(id = sequence(n())) %>%
spread(id, product_id)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With