I am looking for a way to add rows to a dataframe. Here is the data I have: a grouped object, obtained by grouping a dataframe on month and year (i.e. in this grouped object the key is [month, year] and the value is all the rows / dates in that month and year).
I want to extract all the month, year combinations and put them in a new dataframe. Issue: when I iterate over the grouped object, the key (month, year) is a tuple, so I converted the tuple into a list and added it to a dataframe using the append command. Instead of getting added as rows:
1 2014
2 2014
3 2014
it got added in one column:
   0
0  1
1  2014
0  2
1  2014
0  3
1  2014
...
I want to store these values in a new dataframe. Here is how I want the new dataframe to look:
month year
1 2014
2 2014
3 2014
I tried converting the tuple to a list, and then I tried various other things like pivoting. Any input would be really helpful.
Here is the sample code :
df=df.groupby(['month','year'])
df = pd.DataFrame()
for key, value in df:
    print "type of key is:",type(key)
    print "type of list(key) is:",type(list(key))
    df = df.append(list(key))
print df
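The single-column result described above comes from how pandas reads a flat list: each element becomes its own row in one column. The sketch below illustrates this with the DataFrame constructor rather than append; the variable names are made up for the example.

```python
import pandas as pd

# Example group key, like the one yielded when iterating over the groupby.
key = (1, 2014)

# A flat list is read as one column, one value per row.
as_column = pd.DataFrame(list(key))

# A list holding one row (here the tuple itself) gives one row with two columns.
as_row = pd.DataFrame([key], columns=["month", "year"])

print(as_column.shape)  # (2, 1)
print(as_row.shape)     # (1, 2)
```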
When you do the groupby the resulting MultiIndex is available as:
In [11]: df = pd.DataFrame([[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]], columns=['month', 'year', 'val'])
In [12]: df
Out[12]:
   month  year  val
0      1  2014   42
1      1  2014   44
2      2  2014   23
In [13]: g = df.groupby(['month', 'year'])
In [14]: g.grouper.result_index
Out[14]:
MultiIndex(levels=[[1, 2], [2014]],
           labels=[[0, 1], [0, 0]],
           names=['month', 'year'])
Often this will be sufficient, and you won't need a DataFrame. If you do, one way is the following:
In [21]: pd.DataFrame(index=g.grouper.result_index).reset_index()
Out[21]:
   month  year
0      1  2014
1      2  2014
I thought there was a method to get this, but can't recall it.
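Two other routes that may be what was meant here, sketched on the same sample data: dropping duplicate (month, year) pairs directly, or aggregating with size() and resetting the index. The names combos, sizes and n_rows are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]],
    columns=["month", "year", "val"],
)

# Unique (month, year) pairs without going through groupby at all.
combos = df[["month", "year"]].drop_duplicates().reset_index(drop=True)

# Any aggregation followed by reset_index() also turns the group keys
# back into ordinary columns; size() adds the row count per group.
sizes = df.groupby(["month", "year"]).size().reset_index(name="n_rows")

print(combos)
print(sizes)
```

The second form is handy when the number of rows per month is wanted as well.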
If you really want the tuples you can use .values or to_series:
In [31]: g.grouper.result_index.values
Out[31]: array([(1, 2014), (2, 2014)], dtype=object)
In [32]: g.grouper.result_index.to_series()
Out[32]:
month  year
1      2014    (1, 2014)
2      2014    (2, 2014)
dtype: object
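Once you have the tuples, they can be passed straight to the DataFrame constructor as rows. A minimal sketch, assuming the same sample data and reading the keys from g.groups instead of the grouper; keys and months_years are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]],
    columns=["month", "year", "val"],
)
g = df.groupby(["month", "year"])

# g.groups maps each (month, year) key to the row labels in that group.
keys = list(g.groups.keys())

# A list of tuples is read as a list of rows, so each key becomes one row.
months_years = pd.DataFrame(keys, columns=["month", "year"])

print(months_years)
```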
You had initially assigned both the groupby and the empty dataframe to df. Here's a modified version of your code that lets you append a tuple as a dataframe row.
g = df.groupby(['month','year'])
df = pd.DataFrame()
for (key1, key2), value in g:
    # build a one-row Series labelled with the target column names
    row_series = pd.Series((key1, key2), index=['month','year'])
    # note: DataFrame.append was removed in pandas 2.0, so this needs an older pandas
    df = df.append(row_series, ignore_index=True)
print(df)
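On recent pandas versions, where DataFrame.append is no longer available, one way to keep the same loop is to collect the rows in a plain list and build the dataframe once at the end. This is a sketch on made-up sample data, not the only way to do it; rows and result are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2014, 42], [1, 2014, 44], [2, 2014, 23]],
    columns=["month", "year", "val"],
)
g = df.groupby(["month", "year"])

# Collect one dict per group key, then construct the dataframe in one go.
rows = []
for (month, year), group in g:
    rows.append({"month": month, "year": year})

result = pd.DataFrame(rows, columns=["month", "year"])
print(result)
```

Building the frame once also avoids copying the whole dataframe on every iteration, which is what repeated appends do.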