Customize names of columns created by dcast.data.table

I am new to reshape2 and data.table and trying to learn the syntax.

I have a data.table that I want to cast from multiple rows per grouping variable(s) to one row per grouping variable(s). For simplicity, let's make it a table of customers, some of whom share addresses.

library(data.table)

# Input table:
cust <- data.table(name=c("Betty","Joe","Frank","Wendy","Sally"),
                   address=c(rep("123 Sunny Rd",2), 
                             rep("456 Cloudy Ln",2),
                                 "789 Windy Dr"))

I want the output to have the following format:

# Desired output looks like this:
(out <- data.table(address=c("123 Sunny Rd","456 Cloudy Ln","789 Windy Dr"),
                   cust_1=c("Betty","Frank","Sally"),
                   cust_2=c("Joe","Wendy",NA)) )

#          address cust_1 cust_2
# 1:  123 Sunny Rd  Betty    Joe
# 2: 456 Cloudy Ln  Frank  Wendy
# 3:  789 Windy Dr  Sally     NA

I would like columns for cust_1...cust_n where n is the max customers per address. I don't really care about the order--whether Joe is cust_1 and Betty is cust_2 or vice versa.

C8H10N4O2 asked Dec 06 '25 07:12


1 Answer

I just pushed a commit to data.table v1.9.5. dcast now:

  • allows casting on multiple value.var columns and multiple fun.aggregate functions
  • understands undefined variables/expressions in formula

With this, we can do:

dcast(cust, address ~ paste0("cust", cust[, seq_len(.N), by=address]$V1),
      value.var="name")
#          address cust1 cust2
# 1:  123 Sunny Rd Betty   Joe
# 2: 456 Cloudy Ln Frank Wendy
# 3:  789 Windy Dr Sally    NA
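
One caveat with the call above: cust[, seq_len(.N), by=address]$V1 comes back in group order, so it only lines up with the rows of cust when each address occupies consecutive rows, as it does in the example. A minimal sketch of an alternative, assuming a data.table release that provides rowid() and its prefix argument (added after v1.9.5, so this is not part of the commit described above), numbers the customers row by row and also yields the cust_1 ... cust_n names asked for in the question:

```r
library(data.table)

# Same input table as in the question
cust <- data.table(name    = c("Betty", "Joe", "Frank", "Wendy", "Sally"),
                   address = c(rep("123 Sunny Rd", 2),
                               rep("456 Cloudy Ln", 2),
                               "789 Windy Dr"))

# rowid(address) counts occurrences of each address in row order, so the
# index stays aligned with cust even if the addresses are interleaved.
# prefix = "cust_" turns the counts 1, 2, ... into "cust_1", "cust_2", ...
wide <- dcast(cust, address ~ rowid(address, prefix = "cust_"),
              value.var = "name")
print(wide)
```

Addresses with fewer customers than the maximum are padded with NA, as in the desired output.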
Arun answered Dec 08 '25 22:12
