Python pandas DataFrame: delete rows by Timedelta column value
I’m trying to remove rows from a data frame with a timedelta value less than a certain number of seconds.
My data frame looks like this:
                 Start     Elapsed time
0  2018-10-29 07:56:20  0 days 00:15:05
1  2018-10-29 07:56:20  0 days 00:15:05
2  2018-10-29 08:11:25  0 days 00:00:02
3  2018-10-29 08:11:27  0 days 00:00:08
4  2018-10-29 08:11:27  0 days 00:00:08
5  2018-10-29 08:11:35  0 days 00:00:02
6  2018-10-29 08:11:37  0 days 00:00:00
I want to delete all rows where the elapsed time is less than a certain number of seconds, say 3. So I want a data frame that looks like this (derived from the one above):
                 Start     Elapsed time
0  2018-10-29 07:56:20  0 days 00:15:05
1  2018-10-29 07:56:20  0 days 00:15:05
3  2018-10-29 08:11:27  0 days 00:00:08
4  2018-10-29 08:11:27  0 days 00:00:08
I’ve tried a lot of different things, resulting in a lot of different error messages – usually incompatible type comparison errors. For example:
df_new = df[df['Elapsed time'] > pd.to_timedelta('3 seconds')]
df_new = df[df['Elapsed time'] > datetime.timedelta(seconds=3)]
I want to avoid iterating over all the rows, but if that’s what I have to do, then I’ll do it.
Thank you very much for your help!
EDIT: My real problem is that the dtype of my Elapsed time column is object instead of timedelta. A quick fix is to convert the dtype using the code below, but a better fix is to make sure the column does not end up with the object dtype in the first place. Thank you all for your help and comments.
df_new = df[pd.to_timedelta(df['Elapsed time']) > pd.to_timedelta('3 seconds')]
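To illustrate the dtype issue described in the edit, here is a minimal sketch. It assumes the column holds plain strings, and the three sample rows are only a cut-down version of the data above; your real column may have ended up as object for a different reason.

```python
import pandas as pd

# Hypothetical sample data: elapsed times stored as strings, not timedeltas
df = pd.DataFrame({
    "Start": ["2018-10-29 07:56:20", "2018-10-29 08:11:25", "2018-10-29 08:11:27"],
    "Elapsed time": ["0 days 00:15:05", "0 days 00:00:02", "0 days 00:00:08"],
})

# False here: the column holds strings, so it is not a timedelta column yet
print(pd.api.types.is_timedelta64_dtype(df["Elapsed time"]))

# Convert once, up front, so later comparisons work directly
df["Elapsed time"] = pd.to_timedelta(df["Elapsed time"])
print(pd.api.types.is_timedelta64_dtype(df["Elapsed time"]))

# With a real timedelta column, the original comparison works as intended
df_new = df[df["Elapsed time"] > pd.Timedelta(seconds=3)]
print(df_new)
```

Converting the column once and storing the result avoids calling pd.to_timedelta inside every filter expression.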
Solution
Use pd.read_clipboard(sep=r'\s\s+') to get the data (columns separated by two or more spaces):
df = pd.read_clipboard(sep=r'\s\s+')
df['Elapsed time'] = pd.to_timedelta(df['Elapsed time'])
You can use:
df[df['Elapsed time'].dt.total_seconds() > 3]
Output:
                 Start Elapsed time
0  2018-10-29 07:56:20     00:15:05
1  2018-10-29 07:56:20     00:15:05
3  2018-10-29 08:11:27     00:00:08
4  2018-10-29 08:11:27     00:00:08
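If you would rather not depend on the clipboard, the same steps can be sketched as a self-contained script. This is only an illustration: the variable names raw and mask are made up here, and the text is parsed with pd.read_csv using the same two-or-more-spaces separator instead of pd.read_clipboard.

```python
import io

import pandas as pd

# The sample data from the question, with columns separated by two spaces
raw = """Start  Elapsed time
2018-10-29 07:56:20  0 days 00:15:05
2018-10-29 07:56:20  0 days 00:15:05
2018-10-29 08:11:25  0 days 00:00:02
2018-10-29 08:11:27  0 days 00:00:08
2018-10-29 08:11:27  0 days 00:00:08
2018-10-29 08:11:35  0 days 00:00:02
2018-10-29 08:11:37  0 days 00:00:00
"""

# A regex separator requires the Python parsing engine
df = pd.read_csv(io.StringIO(raw), sep=r"\s\s+", engine="python")

# Parse both columns into proper datetime / timedelta dtypes
df["Start"] = pd.to_datetime(df["Start"])
df["Elapsed time"] = pd.to_timedelta(df["Elapsed time"])

# Boolean mask: keep rows lasting at least 3 seconds (no explicit loop needed)
mask = df["Elapsed time"].dt.total_seconds() >= 3
df_new = df[mask]
print(df_new)
```

The question asks to drop rows shorter than 3 seconds, so this sketch keeps rows with >= 3. The > 3 used above would also drop a row of exactly 3 seconds; on this data both give the same result.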