
Compare two date columns and validate if DateA is earlier than DateB

Tags:

python

I have two columns with dates (mm/dd/yy). I need to validate that DateColumn_A comes before DateColumn_B.

I used the following code and got this error: "TypeError: must be string, not Series". Please help a beginner.

Code:

Column_A = datetime.strptime(df['DateColumn_A'], '%m %d %y')
Column_B = datetime.strptime(df['DateColumn_B'], '%m %d %y')

for index, row in dataframe.iterrows():
    if row[Column_A] < row[Column_B]
        print (index,row[Column_A])
    else:
        pass

2 Answers

You can compare them like this:

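from datetime import datetime

# strptime parses one string at a time, so go row by row instead of passing a whole column.
for index, row in df.iterrows():
    # '%m/%d/%y' matches the mm/dd/yy format described in the question.
    date_a = datetime.strptime(row['DateColumn_A'], '%m/%d/%y').date()
    date_b = datetime.strptime(row['DateColumn_B'], '%m/%d/%y').date()
    if date_a < date_b:
        print(index, 'DateColumn_A is earlier than DateColumn_B')
    else:
        print(index, 'DateColumn_A is not earlier than DateColumn_B')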
answered Sep 10 '25 06:09

Coder the Great


Elaborating on my comment above with an example.

First make sure the date columns you are comparing are actually dates. You can do that with the pandas to_datetime function, like so (errors='ignore' leaves any column that cannot be parsed unchanged):

>>> df = df.apply(pd.to_datetime, errors='ignore')
>>> df.DateColumnA
0   2018-01-01
1   2018-05-01
Name: DateColumnA, dtype: datetime64[ns]
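
If the columns hold mm/dd/yy strings as in the question, the conversion can also be done per column with an explicit format. This is only a minimal sketch: the sample data is made up, and the column names are taken from the question. It uses errors='coerce', which turns unparseable values into NaT so they can be inspected. As far as I know, errors='ignore' is deprecated in recent pandas releases, which is another reason to prefer an explicit conversion.

```python
import pandas as pd

# Hypothetical sample data in the mm/dd/yy format described in the question.
df = pd.DataFrame({
    'DateColumn_A': ['01/01/18', '05/01/18', 'not a date'],
    'DateColumn_B': ['02/01/18', '01/01/18', '03/01/18'],
})

for col in ['DateColumn_A', 'DateColumn_B']:
    # Unparseable values become NaT instead of raising an error.
    df[col] = pd.to_datetime(df[col], format='%m/%d/%y', errors='coerce')

# Rows where at least one of the two dates failed to parse.
bad_rows = df[df[['DateColumn_A', 'DateColumn_B']].isna().any(axis=1)]

print(df.dtypes)
print(bad_rows)
```

Checking bad_rows before comparing makes it clear which rows could not be validated at all.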

The snippet below uses boolean indexing. df['DateColumnA'] < df['DateColumnB'] compares the two columns element by element and returns a Series of True and False values. df.loc[df['DateColumnA'] < df['DateColumnB']] then says, in effect, "give me the subset of the DataFrame where this condition is True".

>>> df
  DateColumnA DateColumnB
0  2018-01-01  2018-02-01
1  2018-05-01  2018-01-01
>>> df['DateColumnA'] < df['DateColumnB']
0     True
1    False
dtype: bool
>>> df.loc[df['DateColumnA'] < df['DateColumnB']]
  DateColumnA DateColumnB
0  2018-01-01  2018-02-01
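
For the validation asked about in the question, the same mask can be inverted to list the rows that fail the check. This is a sketch that assumes both columns are already datetime64; the helper name rows_where_a_not_before_b is hypothetical and the data mirrors the example above.

```python
import pandas as pd

def rows_where_a_not_before_b(df, col_a, col_b):
    """Return the rows where col_a is NOT strictly earlier than col_b."""
    is_earlier = df[col_a] < df[col_b]  # element-wise comparison, boolean Series
    return df.loc[~is_earlier]          # ~ inverts the mask

df = pd.DataFrame({
    'DateColumnA': pd.to_datetime(['2018-01-01', '2018-05-01']),
    'DateColumnB': pd.to_datetime(['2018-02-01', '2018-01-01']),
})

invalid = rows_where_a_not_before_b(df, 'DateColumnA', 'DateColumnB')
all_valid = invalid.empty  # True only when every row passes the check

print(invalid)
print(all_valid)
```

Comparisons involving NaT evaluate to False, so rows with a missing date end up in the invalid subset rather than passing silently.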
answered Sep 10 '25 05:09

Orenshi