For a current project, I am planning to filter a JSON file by time ranges by running several loops, each time with a slightly shifted range. However, the code below raises TypeError: Invalid comparison between dtype=datetime64[ns] and date on the line after_start_date = df["Date"] >= start_date.
I have already tried changing the date format both in the Python code and in the JSON file itself. Is there a simple way to align the date types?
The JSON file has the following format:
[
{"No":"121","Stock Symbol":"A","Date":"05/11/2017","Text Main":"Sample text"}
]
And the corresponding code looks like this:
import string
import json
import pandas as pd
import datetime
from dateutil.relativedelta import *
# Loading and reading dataset
file = open("Glassdoor_A.json", "r")
data = json.load(file)
df = pd.json_normalize(data)
df['Date'] = pd.to_datetime(df['Date'])
# Create an empty dictionary
d = dict()
# Filtering by date
start_date = datetime.date.fromisoformat('2017-01-01')
end_date = datetime.date.fromisoformat('2017-01-31')
for i in df.iterrows():
    start_date += relativedelta(months=+3)
    end_date += relativedelta(months=+3)
    print(start_date)
    print(end_date)
    after_start_date = df["Date"] >= start_date
    before_end_date = df["Date"] <= end_date
    between_two_dates = after_start_date & before_end_date
    filtered_dates = df.loc[between_two_dates]
    print(filtered_dates)
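The error occurs because df["Date"] becomes a datetime64[ns] column after pd.to_datetime, while datetime.date.fromisoformat returns a plain datetime.date, and pandas refuses to compare the two. You can use pd.to_datetime('2017-01-01') and pd.to_datetime('2017-01-31') instead of datetime.date.fromisoformat(...), so that start_date and end_date are pandas Timestamps that compare cleanly against the column.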
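As a rough sketch of how the fix might look end to end: the rows below are made-up sample data in the same shape as the JSON above, the explicit month/day/year format is an assumption about how the dates are written, and pd.DateOffset is swapped in for relativedelta so that everything stays within pandas. The loop runs a fixed three times instead of once per row, purely for illustration.

```python
import pandas as pd

# Made-up sample rows mirroring the JSON structure from the question.
data = [
    {"No": "121", "Stock Symbol": "A", "Date": "05/11/2017", "Text Main": "Sample text"},
    {"No": "122", "Stock Symbol": "A", "Date": "04/20/2017", "Text Main": "Another text"},
    {"No": "123", "Stock Symbol": "A", "Date": "07/15/2017", "Text Main": "Third text"},
]
df = pd.json_normalize(data)

# Assumption: dates are month/day/year. Adjust the format if they are day/month/year.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Timestamps (not datetime.date) so they compare against a datetime64[ns] column.
start_date = pd.to_datetime("2017-01-01")
end_date = pd.to_datetime("2017-01-31")

step = pd.DateOffset(months=3)  # pandas-native stand-in for relativedelta(months=+3)
results = {}

for _ in range(3):  # illustrative number of windows
    start_date += step
    end_date += step
    # Boolean mask selecting rows inside the current window (inclusive).
    mask = (df["Date"] >= start_date) & (df["Date"] <= end_date)
    results[start_date] = df.loc[mask]

for window_start, rows in results.items():
    print(window_start.date(), rows["No"].tolist())
```

Note that adding three months to 2017-01-31 gives 2017-04-30, and the next step gives 2017-07-30, so the end of the window drifts away from the true month end. If each window should cover a full calendar month, computing the end from the start (for example with pd.offsets.MonthEnd) may be a better fit.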
I hope this helps!