Given the following input, the goal is to group the values by hour for each Date, computing both the average and the sum. A solution for grouping by hour exists, but it does not take new days into account.
Date Time F1 F2 F3
21-01-16 8:11 5 2 4
21-01-16 9:25 9 8 2
21-01-16 9:39 7 3 2
21-01-16 9:53 6 5 1
21-01-16 10:07 4 6 7
21-01-16 10:21 7 3 1
21-01-16 10:35 5 6 7
21-01-16 11:49 1 2 1
21-01-16 12:03 3 3 1
22-01-16 9:45 6 5 1
22-01-16 9:20 4 6 7
22-01-16 12:10 7 3 1
Expected output:
Date,Time,SUM F1,SUM F2,SUM F3,AVG F1,AVG F2,AVG F3
21-01-16,8:00,5,2,4,5,2,4
21-01-16,9:00,22,16,5,7.3,5.3,1.6
21-01-16,10:00,16,15,15,5.3,5,5
21-01-16,11:00,1,2,1,1,2,1
21-01-16,12:00,3,3,1,3,3,1
22-01-16,9:00,10,11,8,5,5.5,4
22-01-16,12:00,7,3,1,7,3,1
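The answers below read a whitespace-separated file named f123_dates.csv. To reproduce them, the sample input can be written out first (a minimal sketch; the filename is taken from the answers, the layout from the sample above):
# Save the question's sample data as the file the answers read
sample = """\
Date Time F1 F2 F3
21-01-16 8:11 5 2 4
21-01-16 9:25 9 8 2
21-01-16 9:39 7 3 2
21-01-16 9:53 6 5 1
21-01-16 10:07 4 6 7
21-01-16 10:21 7 3 1
21-01-16 10:35 5 6 7
21-01-16 11:49 1 2 1
21-01-16 12:03 3 3 1
22-01-16 9:45 6 5 1
22-01-16 9:20 4 6 7
22-01-16 12:10 7 3 1
"""

with open('f123_dates.csv', 'w') as fh:
    fh.write(sample)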
You can parse the dates while reading the CSV file:
from __future__ import print_function  # make it work with Python 2 and 3
import pandas as pd

df = pd.read_csv('f123_dates.csv', index_col=0, parse_dates=[0, 1],
                 delim_whitespace=True)
print(df.groupby([df.index, df.Time.dt.hour]).agg(['mean','sum']))
Output:
F1 F2 F3
mean sum mean sum mean sum
Date Time
2016-01-21 8 5.000000 5 2.000000 2 4.000000 4
9 7.333333 22 5.333333 16 1.666667 5
10 5.333333 16 5.000000 15 5.000000 15
11 1.000000 1 2.000000 2 1.000000 1
12 3.000000 3 3.000000 3 1.000000 1
2016-01-22 9 5.000000 10 5.500000 11 4.000000 8
12 7.000000 7 3.000000 3 1.000000 1
To go all the way to CSV with the groupby approach above:
from __future__ import print_function
import pandas as pd

df = pd.read_csv('f123_dates.csv', index_col=0, parse_dates=[0, 1],
                 delim_whitespace=True)
df2 = df.groupby([df.index, df.Time.dt.hour]).agg(['mean','sum'])
df3 = df2.reset_index()
df3.columns = [' '.join(col).strip() for col in df3.columns.values]
print(df3.to_csv(columns=df3.columns, index=False))
Output:
Date,Time,F1 mean,F1 sum,F2 mean,F2 sum,F3 mean,F3 sum
2016-01-21,8,5.0,5,2.0,2,4.0,4
2016-01-21,9,7.333333333333333,22,5.333333333333333,16,1.6666666666666667,5
2016-01-21,10,5.333333333333333,16,5.0,15,5.0,15
2016-01-21,11,1.0,1,2.0,2,1.0,1
2016-01-21,12,3.0,3,3.0,3,1.0,1
2016-01-22,9,5.0,10,5.5,11,4.0,8
2016-01-22,12,7.0,7,3.0,3,1.0,1
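If the result should look exactly like the expected output in the question (Time as 9:00, SUM/AVG column names, one-decimal averages), a little post-processing of df3 gets there (a sketch; rounding to one decimal is an assumption, the question's 1.6 looks truncated rather than rounded):
# Reshape df3 from above into the layout the question asks for
df4 = df3.copy()
df4['Date'] = df4['Date'].dt.strftime('%d-%m-%y')    # 2016-01-21 -> 21-01-16
df4['Time'] = df4['Time'].astype(str) + ':00'        # 9 -> 9:00

rename = {}
for col in ['F1', 'F2', 'F3']:
    rename[col + ' sum'] = 'SUM ' + col
    rename[col + ' mean'] = 'AVG ' + col
    df4[col + ' mean'] = df4[col + ' mean'].round(1)  # 7.333... -> 7.3
df4 = df4.rename(columns=rename)

order = ['Date', 'Time', 'SUM F1', 'SUM F2', 'SUM F3', 'AVG F1', 'AVG F2', 'AVG F3']
print(df4[order].to_csv(index=False))
Apart from float formatting (5.0 instead of 5), this matches the expected output.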
You can convert Time to datetime with to_datetime and then use groupby with agg:
print(df)
Date Time F1 F2 F3
0 2016-01-21 8:11 5 2 4
1 2016-01-21 9:25 9 8 2
2 2016-01-21 9:39 7 3 2
3 2016-01-21 9:53 6 5 1
4 2016-01-21 10:07 4 6 7
5 2016-01-21 10:21 7 3 1
6 2016-01-21 10:35 5 6 7
7 2016-01-21 11:49 1 2 1
8 2016-01-21 12:03 3 3 1
9 2016-01-22 9:45 6 5 1
10 2016-01-22 9:20 4 6 7
11 2016-01-22 12:10 7 3 1
import pandas as pd

df['Time'] = pd.to_datetime(df['Time'], format="%H:%M")
print(df)
Date Time F1 F2 F3
0 2016-01-21 1900-01-01 08:11:00 5 2 4
1 2016-01-21 1900-01-01 09:25:00 9 8 2
2 2016-01-21 1900-01-01 09:39:00 7 3 2
3 2016-01-21 1900-01-01 09:53:00 6 5 1
4 2016-01-21 1900-01-01 10:07:00 4 6 7
5 2016-01-21 1900-01-01 10:21:00 7 3 1
6 2016-01-21 1900-01-01 10:35:00 5 6 7
7 2016-01-21 1900-01-01 11:49:00 1 2 1
8 2016-01-21 1900-01-01 12:03:00 3 3 1
9 2016-01-22 1900-01-01 09:45:00 6 5 1
10 2016-01-22 1900-01-01 09:20:00 4 6 7
11 2016-01-22 1900-01-01 12:10:00 7 3 1
df = df.groupby([df['Date'], df['Time'].dt.hour]).agg(['mean','sum']).reset_index()
print(df)
Date Time F1 F2 F3
mean sum mean sum mean sum
0 2016-01-21 8 5.000000 5 2.000000 2 4.000000 4
1 2016-01-21 9 7.333333 22 5.333333 16 1.666667 5
2 2016-01-21 10 5.333333 16 5.000000 15 5.000000 15
3 2016-01-21 11 1.000000 1 2.000000 2 1.000000 1
4 2016-01-21 12 3.000000 3 3.000000 3 1.000000 1
5 2016-01-22 9 5.000000 10 5.500000 11 4.000000 8
6 2016-01-22 12 7.000000 7 3.000000 3 1.000000 1
Then you can flatten the column names with a list comprehension over the MultiIndex tuples:
df.columns = [(x + ' ' + y).strip() for x, y in df.columns]
print(df)
Date Time F1 mean F1 sum F2 mean F2 sum F3 mean F3 sum
0 2016-01-21 8 5.000000 5 2.000000 2 4.000000 4
1 2016-01-21 9 7.333333 22 5.333333 16 1.666667 5
2 2016-01-21 10 5.333333 16 5.000000 15 5.000000 15
3 2016-01-21 11 1.000000 1 2.000000 2 1.000000 1
4 2016-01-21 12 3.000000 3 3.000000 3 1.000000 1
5 2016-01-22 9 5.000000 10 5.500000 11 4.000000 8
6 2016-01-22 12 7.000000 7 3.000000 3 1.000000 1
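On newer pandas (0.25 or later, an assumption about the installed version), named aggregation yields flat column names directly, so the renaming step is not needed. Starting again with df as it was right after the to_datetime conversion (before it was overwritten by the groupby above):
out = (df.groupby([df['Date'], df['Time'].dt.hour])
         .agg(F1_mean=('F1', 'mean'), F1_sum=('F1', 'sum'),
              F2_mean=('F2', 'mean'), F2_sum=('F2', 'sum'),
              F3_mean=('F3', 'mean'), F3_sum=('F3', 'sum'))
         .reset_index())
print(out)
The names use underscores here because they are passed as keyword arguments.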