
Plotting distribution of time data in Python using Pandas

I have a pandas dataframe with some time data that looks like this:

0    08:00 AM
1    08:15 AM
2    08:30 AM
3     7:45 AM
4     7:30 AM

There are 660 rows like these in total (dtype: string). I want to plot the distribution (histogram) of this column. How can I do that? Some of the rows are just empty strings (missing data), so I also have to handle those while plotting. What is the best way to do that?

I have tried pandas.to_datetime() to convert the strings to timestamps, but I am still stuck on how to plot the distribution of those timestamps and how to deal with the missing data.

asked Mar 22 '26 09:03 by visionEnthusiast


1 Answer

Let's assume you have the dataframe you describe and that you're able to convert the column to pandas datetime objects:

import pandas as pd

# Sample data; the empty string stands in for a missing value
df = pd.DataFrame(['8:00 AM', '8:15 AM', '08:30 AM', '', '7:45 AM', '7:45 AM'], columns=['time'])

# Empty strings are converted to NaT (not-a-time)
df['time'] = pd.to_datetime(df['time'])

df now looks like this (the date part is just a default added by the parser; only the time of day matters here):

time
0   2019-08-16 08:00:00
1   2019-08-16 08:15:00
2   2019-08-16 08:30:00
3   NaT
4   2019-08-16 07:45:00
5   2019-08-16 07:45:00
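If you want the missing rows handled explicitly rather than implicitly, something along these lines might work. It is only a sketch: the format string '%I:%M %p' is my assumption about how your 660 rows are written, and the names raw, parsed, n_missing and valid are made up for the example.

```python
import pandas as pd

# Sample data modeled on the question; the empty string stands for a missing value.
raw = pd.Series(['8:00 AM', '8:15 AM', '08:30 AM', '', '7:45 AM', '7:45 AM'], name='time')

# Assumed format: 12-hour clock, minutes, AM/PM marker.
# errors='coerce' turns anything that does not parse into NaT instead of raising.
parsed = pd.to_datetime(raw, format='%I:%M %p', errors='coerce')

n_missing = int(parsed.isna().sum())  # rows that are empty or could not be parsed
valid = parsed.dropna()               # rows that are safe to plot

print(f"{n_missing} missing, {len(valid)} valid")
```

Counting the NaT rows first lets you report how much data is missing next to the plot, instead of having it disappear silently.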

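I would group by both hour and minute, so that each distinct time of day gets its own bar. Rows where the time is NaT are left out of the groups automatically.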

df.groupby([df['time'].dt.hour, df['time'].dt.minute]).count().plot(kind="bar")

The result is a bar chart with one bar per (hour, minute) pair, showing how many rows fall on each time.
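If one bar per exact minute is too fine-grained, a variation is to floor the times into fixed-width bins before counting, which is closer to a classic histogram. This is a sketch under my own assumptions: the 30-minute width (BIN_MINUTES) and the format string are arbitrary choices for illustration, not something taken from your data.

```python
import pandas as pd

# Parse the sample data and drop the missing rows (NaT).
times = pd.to_datetime(
    pd.Series(['8:00 AM', '8:15 AM', '08:30 AM', '', '7:45 AM', '7:45 AM']),
    format='%I:%M %p',   # assumed input format
    errors='coerce',
).dropna()

BIN_MINUTES = 30  # hypothetical bin width; pick whatever suits the data

# Floor each timestamp to the start of its bin, label the bin as HH:MM,
# then count the rows per bin in chronological order.
bin_start = times.dt.floor(f'{BIN_MINUTES}min')
counts = bin_start.dt.strftime('%H:%M').value_counts().sort_index()

print(counts)
```

Calling counts.plot(kind="bar") on the result should draw the chart (matplotlib needs to be installed). Note that bins with no rows do not appear in counts, so gaps in the day are not shown as empty bars.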
answered Mar 24 '26 00:03 by rarepup


