Search Missing Timestamp and display in python?

Question

Here is my some Dataset which having Time,Temperature1,Temperature2

Timestamp.             Temperature1.        Temperature2
09/01/2016 00:00:08          53.4.                       45.5
09/01/2016 00:00:38.         53.5.                       45.2
09/01/2016 00:01:08.         54.6.                        43.2
09/01/2016 00:01:38.         55.2.                        46.3
09/01/2016 00:02:08.         54.5.                        45.5
09/01/2016 00:04:08.         54.2.                       35.5
09/01/2016 00:05:08.         52.4.                        45.7
09/01/2016 00:05:38.         53.4.                         45.2

My data is coming in every 30 second..

This is my dataset here..some Timestamp is missing..bcoz. every 30 second my data is coming..so some data points are missing.. How to find that data points..and insert there data as NAN ... Please help me..

MaxU - stop WAR against UA · Accepted Answer

you can use resample('30S', base=8) method:

In [20]: x.resample('30S', base=8).mean()
Out[20]:
                                 Temperature1    Temperature2
Timestamp
2016-09-01 00:00:08                      53.4            45.5
2016-09-01 00:00:38                      53.5            45.2
2016-09-01 00:01:08                      54.6            43.2
2016-09-01 00:01:38                      55.2            46.3
2016-09-01 00:02:08                      54.5            45.5
2016-09-01 00:02:38                       NaN             NaN
2016-09-01 00:03:08                       NaN             NaN
2016-09-01 00:03:38                       NaN             NaN
2016-09-01 00:04:08                      54.2            35.5
2016-09-01 00:04:38                       NaN             NaN
2016-09-01 00:05:08                      52.4            45.7
2016-09-01 00:05:38                      53.4            45.2

The solution above assumes that the Timestamp is of datetime dtype and that it has been set as index. If Timestamp is a regular column (not index), then starting from Pandas 0.19.0 we can resample on regular columns (it must be of datetime dtype), using on='column_name' parameter:

In [26]: x.resample('30S', on='Timestamp', base=8).mean()
Out[26]:
                                 Temperature1    Temperature2
Timestamp
2016-09-01 00:00:08                      53.4            45.5
2016-09-01 00:00:38                      53.5            45.2
2016-09-01 00:01:08                      54.6            43.2
2016-09-01 00:01:38                      55.2            46.3
2016-09-01 00:02:08                      54.5            45.5
2016-09-01 00:02:38                       NaN             NaN
2016-09-01 00:03:08                       NaN             NaN
2016-09-01 00:03:38                       NaN             NaN
2016-09-01 00:04:08                      54.2            35.5
2016-09-01 00:04:38                       NaN             NaN
2016-09-01 00:05:08                      52.4            45.7
2016-09-01 00:05:38                      53.4            45.2

if you need to find your base value dynamically you can do it this way:

In [21]: x.index[0].second
Out[21]: 8

from docs:

base : int, default 0

For frequencies that evenly subdivide 1 day, the “origin” of the aggregated intervals. For example, for 5min frequency, base could range from 0 through 4.

Defaults to 0

EdChum · Answer

Assuming that the timestamps have been converted to datetime, if you set the index to the timestamp column and then reindex with a date range then the missing values will show up:

In [94]:
df['Timestamp'] = pd.to_datetime(df['Timestamp'])
df = df.set_index('Timestamp')
df

Out[94]:
                     Temperature1  Temperature2
Timestamp                                      
2016-09-01 00:00:08          53.4          45.5
2016-09-01 00:00:38          53.5          45.2
2016-09-01 00:01:08          54.6          43.2
2016-09-01 00:01:38          55.2          46.3
2016-09-01 00:02:08          54.5          45.5
2016-09-01 00:04:08          54.2          35.5
2016-09-01 00:05:08          52.4          45.7
2016-09-01 00:05:38          53.4          45.2

In [96]:    
df.reindex(pd.date_range(start=df.index[0], end=df.index[-1], freq='30s'))

Out[96]:
                     Temperature1  Temperature2
2016-09-01 00:00:08          53.4          45.5
2016-09-01 00:00:38          53.5          45.2
2016-09-01 00:01:08          54.6          43.2
2016-09-01 00:01:38          55.2          46.3
2016-09-01 00:02:08          54.5          45.5
2016-09-01 00:02:38           NaN           NaN
2016-09-01 00:03:08           NaN           NaN
2016-09-01 00:03:38           NaN           NaN
2016-09-01 00:04:08          54.2          35.5
2016-09-01 00:04:38           NaN           NaN
2016-09-01 00:05:08          52.4          45.7
2016-09-01 00:05:38          53.4          45.2

This assumes that the timestamps are regular, here we construct a date range using the timestamp first and last values with a frequency of 30 seconds:

In [97]:
pd.date_range(start=df.index[0], end=df.index[-1], freq='30s')

Out[97]:
DatetimeIndex(['2016-09-01 00:00:08', '2016-09-01 00:00:38',
               '2016-09-01 00:01:08', '2016-09-01 00:01:38',
               '2016-09-01 00:02:08', '2016-09-01 00:02:38',
               '2016-09-01 00:03:08', '2016-09-01 00:03:38',
               '2016-09-01 00:04:08', '2016-09-01 00:04:38',
               '2016-09-01 00:05:08', '2016-09-01 00:05:38'],
              dtype='datetime64[ns]', freq='30S')

When you reindex with this, any missing index labels become NaN values

Search Missing Timestamp and display in python?

Tags:

python

pandas

TB.M

2 Answers

MaxU - stop WAR against UA

EdChum

Recent Activity

Donate For Us

Search Missing Timestamp and display in python?

Tags:

python

pandas

TB.M

2 Answers

MaxU - stop WAR against UA

EdChum

Related questions

Recent Activity

Donate For Us