I would like to summarise or aggregate tables without dropping empty levels. I wonder if anyone has any ideas on this?
As an example, here is a data frame:
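# stringsAsFactors = TRUE keeps Method and Type as factors in R >= 4.0.0,
# where the default changed to FALSE
df1 <- data.frame(Method = c(rep("A", 3), rep("B", 2), rep("C", 4)),
                  Type = c("Fast", "Fast", "Medium", "Fast", "Slow", "Fast", "Medium", "Slow", "Slow"),
                  Measure = c(1, 1, 2, 1, 3, 1, 1, 2, 2),
                  stringsAsFactors = TRUE)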
Here are two approaches, one using base R and one using the doBy package:
#base
aggregate(Measure~Method+Type,data=df1,FUN=length)
#doBy
library(doBy)
summaryBy(Measure~Method+Type,data=df1,FUN=length)
Both give the same results, just sorted differently. The issue is that I would like all combinations of Method and Type to be kept, with missing measures inserted as NA. In other words, all levels of both factors must be maintained.
df1$Type
df1$Method
Maybe plyr has something, but I don't know how that works.
Have a look at tapply:
with(df1, tapply(Measure, list(Method, Type), FUN = length))
#   Fast Medium Slow
# A    2      1   NA
# B    1     NA    1
# C    1      1    2
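If you need the result in long form (one row per combination) rather than as a matrix, one possible sketch is to pass the tapply() result through as.table() and as.data.frame(). The names m and long are just illustrative, and naming the list elements (Method = , Type = ) is my addition so that the columns come out labelled:

```r
df1 <- data.frame(Method = c(rep("A", 3), rep("B", 2), rep("C", 4)),
                  Type = c("Fast", "Fast", "Medium", "Fast", "Slow",
                           "Fast", "Medium", "Slow", "Slow"),
                  Measure = c(1, 1, 2, 1, 3, 1, 1, 2, 2))

# Matrix of counts; empty cells are NA (tapply's default fill value)
m <- with(df1, tapply(Measure, list(Method = Method, Type = Type), FUN = length))

# Reshape to one row per Method/Type combination, keeping the NA cells
long <- as.data.frame(as.table(m), responseName = "Measure")
long
```

This keeps all nine combinations, with NA in the rows where no measurements exist.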
Update for 2021
I think this can be accomplished now with stats::aggregate() using drop = FALSE. No extra packages needed. The result is a regular ole data frame where empty combinations of levels are NA.
aggregate(Measure ~ Method + Type, data = df1, FUN = length, drop = FALSE)
  Method   Type Measure
1      A   Fast       2
2      B   Fast       1
3      C   Fast       1
4      A Medium       1
5      B Medium      NA
6      C Medium       1
7      A   Slow      NA
8      B   Slow       1
9      C   Slow       2
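Since FUN = length is really a count here, you may prefer 0 over NA for the empty combinations. A minimal sketch, assuming that is what you want (the name res is just illustrative, and for other summaries such as mean, NA is probably the more honest value):

```r
df1 <- data.frame(Method = c(rep("A", 3), rep("B", 2), rep("C", 4)),
                  Type = c("Fast", "Fast", "Medium", "Fast", "Slow",
                           "Fast", "Medium", "Slow", "Slow"),
                  Measure = c(1, 1, 2, 1, 3, 1, 1, 2, 2))

# Keep every Method/Type combination, including the empty ones
res <- aggregate(Measure ~ Method + Type, data = df1, FUN = length, drop = FALSE)

# Treat "no observations" as a count of zero
res$Measure[is.na(res$Measure)] <- 0
res
```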