I have a dataframe in which one of the columns is a date, and I need to drop the rows that fall on leap days. The data spans a range of years, so I was hoping to drop every row whose date matches a 02-29 filter.
The one way I found is to add extra columns, extract the year and the month-day part separately, and then filter on the month-day column as shown below. It serves the purpose, but it is clearly not efficient.
df['Yr'], df['Mth-Dte'] = zip(*df['Date'].apply(lambda x: (x[:4], x[5:])))
df = df[df['Mth-Dte'] != '02-29']
Is there a better way to implement this by directly applying the filter on the column in the dataframe?
Sample data:
ID Date
22398 IDM00096087 1/1/2005
22586 IDM00096087 1/1/2005
21790 IDM00096087 1/2/2005
21791 IDM00096087 1/2/2005
14727 IDM00096087 1/3/2005
Thanks in advance
Convert the column to datetime and use a boolean mask.
import pandas as pd
data = {'Date': {14727: '1/3/2005',
21790: '1/2/2005',
21791: '1/2/2005',
22398: '1/1/2005',
22586: '2/29/2008'},  # leap day, written month/day/year like the other rows
'ID': {14727: 'IDM00096087',
21790: 'IDM00096087',
21791: 'IDM00096087',
22398: 'IDM00096087',
22586: 'IDM00096087'}}
df = pd.DataFrame(data)
Option 1, convert + dt:
df.Date = pd.to_datetime(df.Date)
# Filter out February 29
df[~((df.Date.dt.month == 2) & (df.Date.dt.day == 29))] # ~ negates the boolean mask
Option 2, convert + strftime:
df.Date = pd.to_datetime(df.Date)
# Filter out February 29
df[df.Date.dt.strftime('%m%d') != '0229']
Option 3, without overwriting the Date column (the conversion only happens inside the mask):
mask = pd.to_datetime(df.Date).dt.strftime('%m%d') != '0229'
df[mask]
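Note that the options above only display the filtered frame; to keep the result it has to be assigned back. A minimal end-to-end sketch follows, assuming the dates are month/day/year strings as in the question's sample. The helper name drop_leap_days, its parameters, and the three sample rows are made up for illustration and are not part of pandas or of the original data.

```python
import pandas as pd


def drop_leap_days(frame, column="Date", fmt="%m/%d/%Y"):
    """Return a copy of `frame` without the rows dated February 29.

    `fmt` is an assumption about how the date strings are written;
    adjust it if the real data uses a different layout.
    """
    # Parse into a temporary Series so the original column stays untouched.
    dates = pd.to_datetime(frame[column], format=fmt)
    is_leap_day = (dates.dt.month == 2) & (dates.dt.day == 29)
    return frame.loc[~is_leap_day].copy()


# Hypothetical sample rows, including one leap day.
sample = pd.DataFrame(
    {
        "ID": ["IDM00096087"] * 3,
        "Date": ["1/1/2005", "2/29/2008", "3/1/2008"],
    },
    index=[22398, 22586, 21790],
)

cleaned = drop_leap_days(sample)
print(cleaned)
```

Passing an explicit format avoids any month/day ambiguity while parsing, and returning a copy means later assignments to the result do not touch the original frame.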