If you have a pandas Series with unique index labels, pd.Series.to_dict() works as you would expect: it returns a Python dict that maps each index label to its value.
This gets complicated if you have non-unique indices. My expected behavior is that values with the same index will get grouped together into a list, and the dict will have the index as a key and the list as a value. What I observe instead is a dict with the index as a key and only a single value from the Series as the value in the dict.
Is there a way to achieve my expected behavior, built into pandas, or close to it? Presently, I manually curate values that match each index into the dict in a for loop, looping over the unique index values. Is there a better way to do this?
EDIT: Here's an example:
import pandas as pd
my_series = pd.Series(['val_1', 'val_2', 'val_3', 'val_4', 'val_5'])
my_series.index = ['1', '1', '2', '2', '2']
my_series
This yields:
1    val_1
1    val_2
2    val_3
2    val_4
2    val_5
dtype: object
Now, to_dict() keeps only one value per index label (here, the last one):
my_series.to_dict()
{'1': 'val_2', '2': 'val_5'}
What I would like to see instead is:
{'1': ['val_1', 'val_2'], '2': ['val_3', 'val_4', 'val_5']}
I can achieve this with:
{idx:list(my_series[idx]) for idx in set(my_series.index)}
{'2': ['val_3', 'val_4', 'val_5'], '1': ['val_1', 'val_2']}
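One caveat with this comprehension: it relies on every label being duplicated. When a label occurs only once, my_series[idx] returns a scalar string rather than a Series, so list() splits it into characters. A minimal sketch of the problem and one possible workaround, assuming a small made-up Series `s` (not the one above) in which label '2' appears once:

```python
import pandas as pd

# Hypothetical data: label '1' is duplicated, label '2' occurs once.
s = pd.Series(['val_1', 'val_2', 'val_3'], index=['1', '1', '2'])

# Original approach: s['2'] is the scalar 'val_3', so list() yields characters.
naive = {idx: list(s[idx]) for idx in set(s.index)}

# Passing a list of labels to .loc always returns a Series, even for one match.
safe = {idx: list(s.loc[[idx]]) for idx in s.index.unique()}

print(naive['2'])
print(safe)
```

Using s.index.unique() instead of set() also keeps the keys in order of first appearance.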
What I would like to know is if there is a more native way to do this in pandas, or if this is the best way to handle the problem.
Group the Series by its index, aggregate each group into a list, and convert the result to a dict:
my_series.groupby(level=0).agg(list).to_dict()
Out[358]: {'1': ['val_1', 'val_2'], '2': ['val_3', 'val_4', 'val_5']}
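For anyone who wants to run this end to end, here is a self-contained sketch of the same groupby approach. The extra label '3' with a single value is an addition for illustration (it is not in the question's data), included to show that single-occurrence labels still come back as one-element lists:

```python
import pandas as pd

# Question's data plus one hypothetical single-occurrence label, '3'.
my_series = pd.Series(
    ['val_1', 'val_2', 'val_3', 'val_4', 'val_5', 'val_6'],
    index=['1', '1', '2', '2', '2', '3'],
)

# level=0 groups by the index itself; agg(list) collects each group's values.
grouped = my_series.groupby(level=0).agg(list).to_dict()
print(grouped)
```

Note that groupby sorts the group keys by default, so the dict keys come out in sorted label order; pass sort=False to groupby if the order of first appearance matters.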