
Filtering and comparing dates with Pandas

I would like to know how to filter dates at every level of time resolution, i.e. find dates by year, month, day, hour, minute and/or second. For example, how do I find all dates that fall in 2014, or in January 2014, or only on 2nd January 2014, and so on down to the second?

I have a dataframe with a date and time column generated by pd.to_datetime:

df
    timeStamp
0   2014-01-02 21:03:04
1   2014-02-02 21:03:05
2   2016-02-04 18:03:10

So if I filter by the year 2014 then I would have as output:

    timeStamp
0   2014-01-02 21:03:04
1   2014-02-02 21:03:05

Or, as a different example, I want to know the dates that fall in 2014 and on the 2nd of each month. This would also result in:

    timeStamp
0   2014-01-02 21:03:04
1   2014-02-02 21:03:05

But if I asked for dates that fall on the 2nd of January 2014, the result would be:

    timeStamp
0   2014-01-02 21:03:04

How can I achieve this at all the different levels?

Also, how do you compare dates at these different levels to create an array of boolean indices?

Federico Garza asked May 24 '17 15:05

2 Answers

You can filter your dataframe via boolean indexing like so:

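df.loc[df['timeStamp'].dt.year == 2014]    # all rows in 2014
df.loc[df['timeStamp'].dt.month == 5]      # all rows in May, any year
df.loc[df['timeStamp'].dt.second == 4]     # rows whose seconds field is 4
df.loc[df['timeStamp'] == '2014-01-02']    # only rows at exactly 2014-01-02 00:00:00
df.loc[df['timeStamp'].dt.normalize() == '2014-01-02']  # any time on 2 January 2014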

... and so on and so forth.
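
To match several levels at once (for example "in 2014 and on the 2nd of the month"), the individual comparisons can be combined, since each one is already a boolean Series. Here is a minimal sketch, assuming the sample data from the question; the variable names (`in_2014`, `on_2nd`, and so on) are only illustrative:

```python
import pandas as pd

# Sample frame mirroring the data in the question
df = pd.DataFrame({"timeStamp": pd.to_datetime([
    "2014-01-02 21:03:04",
    "2014-02-02 21:03:05",
    "2016-02-04 18:03:10",
])})
ts = df["timeStamp"]

# Each comparison yields a boolean Series, one True/False per row
in_2014 = ts.dt.year == 2014
on_2nd = ts.dt.day == 2

# Combine masks element-wise with & (and), | (or), ~ (not)
year_and_day = df.loc[in_2014 & on_2nd]

# Match a whole calendar day by zeroing the time component first
jan_2_2014 = df.loc[ts.dt.normalize() == pd.Timestamp("2014-01-02")]

# A half-open range is another way to select a month
jan_2014 = df.loc[(ts >= pd.Timestamp("2014-01-01")) & (ts < pd.Timestamp("2014-02-01"))]

# If a plain NumPy boolean array is needed instead of a Series
mask_array = in_2014.to_numpy()
```

When the comparisons are written inline rather than stored in variables, each one needs its own parentheses, because `&` binds more tightly than `==`.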

Andrew L answered Jan 03 '23 19:01


If you set timeStamp as the index, with datetime dtype so that you get a DatetimeIndex, you can use partial string indexing. Go through .loc for this, since selecting rows with df['2014'] was deprecated and then removed in pandas 2.0:

df.loc['2014'] # gets all 2014
df.loc['2014-01'] # gets all Jan 2014
df.loc['2014-01-02'] # gets all Jan 2, 2014
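
For completeness, here is a small end-to-end sketch of partial string indexing with `.loc`, assuming the sample data from the question. The `value` column is hypothetical and only there so the selected rows have something to show:

```python
import pandas as pd

# Sample frame mirroring the question, plus a hypothetical 'value' column
df = pd.DataFrame({
    "timeStamp": pd.to_datetime([
        "2014-01-02 21:03:04",
        "2014-02-02 21:03:05",
        "2016-02-04 18:03:10",
    ]),
    "value": [10, 20, 30],
})

# Move the datetime column into the index; sorting keeps slices well-defined
indexed = df.set_index("timeStamp").sort_index()

year_2014 = indexed.loc["2014"]            # every row in 2014
jan_2014 = indexed.loc["2014-01"]          # every row in January 2014
jan_2 = indexed.loc["2014-01-02"]          # every row on 2 January 2014
hour_21 = indexed.loc["2014-01-02 21"]     # rows within the 21:00 hour that day

# String slices include both endpoints, so this covers all of Jan and Feb 2014
jan_to_feb = indexed.loc["2014-01":"2014-02"]
```

Using ISO-style strings (year-month-day) avoids any ambiguity about whether a string like 01-02-2014 means 2 January or 1 February.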

Scott Boston answered Jan 03 '23 20:01