I have the following data prepared:

Timestamp  Weighted_Value  SumVal  Group
        1            1600     800      1
        2            1000    1000      2
        3            1000    1000      2
        4            1000    1000      2
        5             800     500      3
        6             400     500      3
        7            2000     800      4
        8            1200    1000      4
For each group I want to calculate sum(Weighted_Value) / sum(SumVal); for Group 3, for example, the result is (800 + 400) / (500 + 500) = 1.2.
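For reference, the example data can be reproduced with:

df <- data.frame(
  Timestamp      = 1:8,
  Weighted_Value = c(1600, 1000, 1000, 1000, 800, 400, 2000, 1200),
  SumVal         = c(800, 1000, 1000, 1000, 500, 500, 800, 1000),
  Group          = c(1, 2, 2, 2, 3, 3, 4, 4)
)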
I was using period.apply to do that:
period.apply(x4, intervalIndex, function(z) sum(z[,4])/sum(z[,2]))
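(For context, a minimal sketch of how x4 and intervalIndex might be set up with xts, built from the data frame above. The column layout of the real x4 is unknown, which is why the call above indexes columns 4 and 2 while this sketch uses 1 and 2:)

library(xts)

# Assumed setup: the two value columns in an xts object, indexed by the
# (here fictitious) timestamps; intervalIndex holds endpoints-style break
# points, one per group boundary
x4 <- xts(df[, c("Weighted_Value", "SumVal")],
          order.by = as.POSIXct(df$Timestamp, origin = "1970-01-01"))
intervalIndex <- c(0, which(diff(df$Group) != 0), nrow(df))

period.apply(x4, intervalIndex, function(z) sum(z[, 1]) / sum(z[, 2]))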
But it is too slow for my application, so I wanted to ask if someone knows a faster alternative. I already tried ave, but it seems to be even slower.
By the way, my goal is to calculate a time-weighted average, in order to turn an irregular time series into one with equidistant time intervals.
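To illustrate that goal, here is a minimal sketch of the bucketing step, assuming a hypothetical interval width of 2 (the width and the cut points are not from the question):

# Assign each irregular timestamp to a fixed-width interval, then apply
# the same sum/sum ratio per interval instead of per Group
width <- 2
df$Bucket <- findInterval(df$Timestamp, seq(1, max(df$Timestamp), by = width))
tapply(df$Weighted_Value, df$Bucket, sum) / tapply(df$SumVal, df$Bucket, sum)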
Thanks!
Using rowsum seems to be faster (at least for this small example dataset) than the data.table approach:
sgibb <- function(df) {
  # rowsum() sums each column per group in compiled code; groups come back
  # in sorted order, which matches unique(df$Group) for this sorted example
  data.frame(Group = unique(df$Group),
             Avg = rowsum(df$Weighted_Value, df$Group) /
                   rowsum(df$SumVal, df$Group))
}
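For the example data this returns:

sgibb(df)
#   Group      Avg
# 1     1 2.000000
# 2     2 1.000000
# 3     3 1.200000
# 4     4 1.777778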
Adding the rowsum approach to @plafort's benchmark:
library(microbenchmark)
library(dplyr)
library(data.table)

# weight.avg() is the function defined in @plafort's answer
microbenchmark(
  Nader = df %>%
    group_by(Group) %>%
    summarise(res = sum(Weighted_Value) / sum(SumVal)),
  Henk = setDT(df)[, sum(Weighted_Value) / sum(SumVal), by = Group],
  plafort = weight.avg(df),
  sgibb = sgibb(df)
)
# Unit: microseconds
#     expr      min       lq      mean    median        uq      max neval
#    Nader 2179.890 2280.462 2583.8798 2399.0885 2497.6000 6647.236   100
#     Henk  648.191  693.519  788.1421  726.0940  751.0810 2386.260   100
#  plafort 2638.967 2740.541 2935.4756 2785.7425 2909.4640 5000.652   100
#    sgibb  347.125  384.830  442.6447  409.2815  441.8935 2039.563   100