How can I get the value_counts above a threshold? I tried
df[df[col].value_counts(dropna=False) > 3]
to get all counts greater than 3, but I am getting
IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
Any hint? Thanks
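The mask returned by value_counts is indexed by the unique values of the column, not by the row labels of df, so pandas cannot align it with df. Broadcast the group size back to each row instead: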
df[df.groupby(col)[col].transform('size')>3]
Or with value_counts:
counts = df[col].value_counts(dropna=False)
valids = counts[counts>3].index
df[df[col].isin(valids)]
Another approach with value_counts and map:
counts = df[col].value_counts(dropna=False)
df[df[col].map(counts)>3]
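As a quick check that the three approaches above select the same rows, here is a minimal sketch on made-up data. The column name `city` and its values are assumptions for illustration only and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data: "a" appears 5 times, "b" 4 times, "c" once.
df = pd.DataFrame({"city": ["a", "a", "a", "a", "b", "b", "c", "a", "b", "b"]})
col = "city"

# Count of each distinct value, indexed by the value itself.
counts = df[col].value_counts(dropna=False)

# 1) Broadcast the group size to every row, then compare.
by_transform = df[df.groupby(col)[col].transform("size") > 3]

# 2) Collect the values whose count exceeds the threshold, then test membership.
by_isin = df[df[col].isin(counts[counts > 3].index)]

# 3) Look up each row's count through the counts Series, then compare.
by_map = df[df[col].map(counts) > 3]

print(by_isin)
```

In each case the boolean mask is built from `df` itself, so it has one entry per row and aligns with the frame. That is what the original attempt was missing.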
You can also use isin, chained with your original value_counts:
out = df[df[col].isin(df[col].value_counts(dropna=False).loc[lambda x : x>3].index)].copy()
Or use groupby with filter:
out = df.groupby(col).filter(lambda x : len(x)>3)
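One difference worth knowing: value_counts(dropna=False) counts missing values, while groupby drops NaN keys by default, so the two families of answers can disagree when the column contains NaN. The sketch below illustrates this on made-up data; the column name `score` and its values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1.0 appears 4 times, NaN 4 times, 2.0 once.
df = pd.DataFrame({"score": [1.0] * 4 + [np.nan] * 4 + [2.0]})
col = "score"

# value_counts with dropna=False treats NaN as a value with its own count,
# so the NaN rows pass the threshold and are kept.
counts = df[col].value_counts(dropna=False)
by_counts = df[df[col].isin(counts[counts > 3].index)]

# groupby excludes NaN keys by default, so the NaN rows never form a group
# and are absent from the filtered result.
by_filter = df.groupby(col).filter(lambda g: len(g) > 3)

print(len(by_counts), len(by_filter))
```

If the NaN rows matter, the value_counts-based variants are the ones that match the `dropna=False` in the question.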