Let's say I have the following dataframe:
df = pd.DataFrame({'a': [10, 20, 30, 40, 50], 'b': [0, 10, 40, 45, 50]}, columns = ['a', 'b'])
I would like to make a list of indices where:
a [i - 1] < b[i] and a[i] >= b[i]
in order to detect when a value, in a timeseries, is crossing another one
is there a Pandas idiomatic way to achieve this without iterating through all the elements?
I tried to create a new column with flags to indicate a crossing by doing this:
df['t'] = (df['a'].shift(1).values < df['b'].values and di['a'].values >= df['b']).astype(bool)
but that won't compile. I'm not sure how to approach this problem, short of doing a loop through all the elements.
You can use the Series.shift
with Series.lt
which is "less than", same as <
and Series.ge
which is "greater than or equal" and is same as >=
:
mask = df['a'].shift().lt(df['b']) & df['a'].ge(df['b'])
# same as (df['A'].shift() < df['b']) & (df['a'] >= df['b'])
0 False
1 False
2 False
3 False
4 True
dtype: bool
Notice, we don't have to specify astype(bool)
, pandas works with boolean indexing
and returns booleans
when defining conditions.
To get the indices
of the rows with True
, use:
idx = df[mask].index.tolist()
print(idx)
[4]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With