I am looking for an efficient way to remove zeros from a list of dictionaries created from a pd.DataFrame.
Take the following example:
import pandas as pd
df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])
df.to_dict('records')
[{'a': 1, 'b': 2}, {'a': 0, 'b': 4}]
What I would like is:
[{'a': 1, 'b': 2}, {'b': 4}]
I have a very large, sparse dataframe, so storing all of the zeros is inefficient. Because the dataframe is large, I am looking for something faster than looping through the list of dictionaries and removing the zeros. For instance, the following works but is very slow and uses a large amount of memory:
new_records = []
for record in df.to_dict('records'):
new_records.append(dict((k, v) for k, v in record.items() if v))
Is there a more efficient method or approach to this?
Use a list comprehension that filters the zeros out of each row before converting it to a dict:
[r[r != 0].to_dict() for _, r in df.iterrows()]
[{'a': 1, 'b': 2}, {'b': 4}]
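One caveat: iterrows builds a new Series for every row, which tends to be slow on large frames. As a rough sketch of an alternative (not benchmarked here, and not part of the answer above), the rows can be walked as plain tuples with itertuples and filtered in a dict comprehension. The names cols and records are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [0, 4]], columns=['a', 'b'], index=['x', 'y'])

# Column labels are looked up once, outside the loop.
cols = df.columns.tolist()

# itertuples(index=False, name=None) yields one plain tuple per row,
# which avoids constructing a Series for each row.
records = [
    {k: v for k, v in zip(cols, row) if v != 0}
    for row in df.itertuples(index=False, name=None)
]

print(records)  # [{'a': 1, 'b': 2}, {'b': 4}]
```

This still produces one dict per row, so memory use grows with the number of non-zero cells. Whether it is fast enough depends on the size and density of the real data.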