Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pivot using multiple columns

Tags:

r

aggregate

I have a data set with 5 columns:

store_id    year    event    item    units
123         2015     sale_2   abc      2
234         2015     sale_3   def      1
345         2015     sale_2   xyz      5

I'm trying to rotate out the items by store_id, year, and event to get the sum. For instance

store_id    year    event    abc     def   xyz 
123          2015    sale_2   2       0     0
234          2015    sale_3   0       1     0
345          2015    sale_2   0       0     5    

I'm having trouble figuring out the best method. Normally I'd use dummyVars in caret to do this but I need sums instead of flag. I've looked at tapply but it can't handle more than 2 grouping variables.

Any other suggestions?

like image 349
screechOwl Avatar asked Dec 13 '25 05:12

screechOwl


1 Answers

library(reshape2)
dcast(df, store_id + year + event ~ item, fun.aggregate = sum, value.var='units')
#    store_id year  event abc def xyz
# 1:      123 2015 sale_2   2   0   0
# 2:      234 2015 sale_3   0   1   0
# 3:      345 2015 sale_2   0   0   5

For large datasets consider

# uses dcast.data.table, much faster
library(data.table)
setDT(df)
dcast(df, store_id + year + event ~ item, fun.aggregate = sum, value.var='units') 
like image 141
C8H10N4O2 Avatar answered Dec 15 '25 23:12

C8H10N4O2



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!