
Python Groupby statement

I am trying to group the following list of details:

import itertools
import operator

details = [('20130325','B'), ('20130320','A'), ('20130325','B'), ('20130320','A')]

for k, v in itertools.groupby(details, key=operator.itemgetter(0)):
    print(k, list(v))

And this is the output with the above groupby statement:

20130325 [('20130325', 'B')]
20130320 [('20130320', 'A')]
20130325 [('20130325', 'B')]
20130320 [('20130320', 'A')]

But my expected output was:

20130325 [('20130325', 'B'), ('20130325', 'B')]
20130320 [('20130320', 'A'), ('20130320', 'A')]

Am I doing something wrong?

maruthi reddy asked Dec 31 '25 13:12


2 Answers

You have to sort your details first:

details.sort(key=operator.itemgetter(0))

or

fst = operator.itemgetter(0)
itertools.groupby(sorted(details, key=fst), key=fst)

 

itertools.groupby only groups consecutive records whose keys match, so equal keys that are not adjacent end up in separate groups.

From the documentation:

The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.
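
To make the consecutive-run behavior concrete, here is a minimal sketch that runs groupby on the question's list both unsorted and sorted. The names by_date, unsorted_groups and sorted_groups are chosen only for this illustration.

```python
from itertools import groupby
from operator import itemgetter

details = [('20130325', 'B'), ('20130320', 'A'), ('20130325', 'B'), ('20130320', 'A')]
by_date = itemgetter(0)  # key function: the date string in position 0

# Unsorted input: the key changes on every record, so every run has length 1.
unsorted_groups = [(k, list(g)) for k, g in groupby(details, key=by_date)]

# Sorted input: equal keys become adjacent, so each date forms a single group.
sorted_groups = [(k, list(g)) for k, g in groupby(sorted(details, key=by_date), key=by_date)]

print(len(unsorted_groups))  # number of groups without sorting
for k, g in sorted_groups:
    print(k, g)
```

Note that sorting also changes the order of the groups: they come out in ascending key order ('20130320' before '20130325') rather than in order of first appearance.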

Pavel Anossov answered Jan 02 '26 05:01


The toolz project offers a non-streaming groupby that returns a dict of lists, so the input does not need to be sorted first:

$ pip install toolz
$ ipython

In [1]: from toolz import groupby, first

In [2]: details = [('20130325','B'), ('20130320','A'), ('20130325','B'), ('20130320','A')]

In [3]: groupby(first, details)
Out[3]: 
{'20130320': [('20130320', 'A'), ('20130320', 'A')],
 '20130325': [('20130325', 'B'), ('20130325', 'B')]}
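
If installing a third-party package is not an option, a similar non-streaming grouping can be sketched with the standard library's collections.defaultdict. The helper name group_by_key is hypothetical and made up for this example; it is not part of toolz or itertools.

```python
from collections import defaultdict
from operator import itemgetter


def group_by_key(records, key):
    """Collect records into a dict of lists keyed by key(record).

    Hypothetical helper for illustration; unlike itertools.groupby it
    does not require equal keys to be adjacent in the input.
    """
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return dict(groups)  # plain dict, so missing keys raise KeyError again


details = [('20130325', 'B'), ('20130320', 'A'), ('20130325', 'B'), ('20130320', 'A')]
grouped = group_by_key(details, itemgetter(0))

for date, records in grouped.items():
    print(date, records)
```

On Python 3.7 and later, dicts keep insertion order, so the groups appear in order of first occurrence ('20130325' first here), which matches the order in the question's expected output.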

MRocklin answered Jan 02 '26 03:01


