python count dataframe column values meeting condition

Question

What would be a more elegant way to writing:

df[df['income'] > 0].count()['income']

I would like to simply count the number of column values meeting a condition (in this example, the condition is just being larger than zero, but I would like a way applicable to any arbitrary condition or set of conditions). Obviously more elegant if the column name would not need to show up twice in the expression. Should be hopefully easy.

fokkerplanck · Accepted Answer

df = pd.DataFrame([0, 30000, 75000, -300, 23000], columns=['income'])
print(df)
   income
0       0
1   30000
2   75000
3    -300
4   23000

If you would like to count values in a column meeting a slightly more complex condition than just being positive, for example "value is in the range from 5000 to 25000", you can use two methods.

First, using boolean indexing,

((df['income'] > 5000) & (df['income'] < 25000)).sum()

Second, applying a function on every row of the series,

df['income'].map(lambda x: 5000 < x < 25000).sum()

Note that the second approach allows arbitrarily complex condition but is much slower than the first approach which is using low-level operations on the underlying arrays. See the documentation on boolean indexing for more information.

python count dataframe column values meeting condition

Tags:

python

python-3.x

pandas

matanster

1 Answers

fokkerplanck

Recent Activity

Donate For Us

python count dataframe column values meeting condition

Tags:

python

python-3.x

pandas

matanster

1 Answers

fokkerplanck

Related questions

Recent Activity

Donate For Us