I am testing itertools.groupby() and trying to get the groups as lists, but I can't figure out how to make it work.
I'm using the example from "How do I use Python's itertools.groupby()?":
from itertools import groupby
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
("vehicle", "speed boat"), ("vehicle", "school bus")]
I tried (Python 3.5):
g = groupby(things, lambda x: x[0])
ll = list(g)
list(tuple(ll[0])[1])
I thought I would get the first group ("animal") as the list ['bear', 'duck'], but I just get an empty list in the REPL.
What am I doing wrong?
How should I extract all three groups as lists?
If you just want the groups, without the keys, you need to realize the group generators as you go, per the docs:
Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.
This means that if you listify the groupby generator first with ll = list(g), before converting the individual group generators, the outer generator has already advanced past every group, so every group generator except possibly the last will be invalid/empty.
(Note that list is just one option; a tuple or any other container works too.)
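To make the difference concrete, here is a small sketch that contrasts the two orders of conversion using the data from the question. The names stale, fresh, stale_first and fresh_first are only illustrative and do not come from the original post.

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Materializing the outer iterator first advances past every group,
# so the first group's iterator has nothing left to yield.
stale = list(groupby(things, key=itemgetter(0)))
first_key, first_group = stale[0]
stale_first = list(first_group)
print(first_key, stale_first)  # animal []

# Converting each group while it is still the current one keeps its items.
fresh = [(key, list(group)) for key, group in groupby(things, key=itemgetter(0))]
fresh_first = fresh[0][1]
print(fresh_first)  # [('animal', 'bear'), ('animal', 'duck')]
```

The keys survive in both cases; only the group contents are lost when the outer iterator is exhausted first.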
So to do it properly, you'd make sure to listify each group generator before moving on to the next:
from itertools import groupby
from operator import itemgetter  # Nicer than ad-hoc lambdas
# Make the key, group generator
gen = groupby(things, key=itemgetter(0))
# Strip the keys; you only care about the group generators
# In Python 2, you'd use future_builtins.map, because a non-generator map would break
groups = map(itemgetter(1), gen)
# Convert them to list one by one before the next group is pulled
groups = map(list, groups)
# And listify the result (to actually run out the generator and get all your
# results, assuming you need them as a list)
groups = list(groups)
As a one-liner:
groups = list(map(list, map(itemgetter(1), groupby(things, key=itemgetter(0)))))
Or, because this many maps gets rather ugly and un-Pythonic, and list comprehensions let us do nifty stuff like unpacking to get named values, we can simplify to:
groups = [list(g) for k, g in groupby(things, key=itemgetter(0))]
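Note that each group still contains the full (key, value) tuples. If the goal is the bare names, such as ['bear', 'duck'] from the question, one possible variation is to unpack each tuple inside a nested comprehension. This is only a sketch; the variable name names is made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# The inner comprehension consumes each group immediately and keeps only
# the second element of every tuple; the key itself is discarded.
names = [[name for _, name in group]
         for _, group in groupby(things, key=itemgetter(0))]
print(names)
# [['bear', 'duck'], ['cactus'], ['speed boat', 'school bus']]
```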
You could use a list comprehension as follows:
from itertools import groupby
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
("vehicle", "speed boat"), ("vehicle", "school bus")]
g = groupby(things, lambda x: x[0])
answer = [list(group[1]) for group in g]
print(answer)
Output:
[[('animal', 'bear'), ('animal', 'duck')],
[('plant', 'cactus')],
[('vehicle', 'speed boat'), ('vehicle', 'school bus')]]
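If the keys are needed as well, a dict comprehension is one way to keep them next to their groups. The sketch below assumes that equal keys are adjacent in the input, as they are in things; groupby only groups consecutive items, so with unsorted input a later run of the same key would overwrite the earlier dictionary entry. The name by_kind is hypothetical.

```python
from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Assumption: items with the same key are already adjacent in `things`.
# Each group is turned into a list right away, before groupby advances.
by_kind = {kind: [name for _, name in group]
           for kind, group in groupby(things, key=lambda x: x[0])}
print(by_kind["animal"])  # ['bear', 'duck']
```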