How to Keep Only the First Record When Times Differ by 20 Seconds or Less in Python

This article walks through a Python automation challenge: records are grouped by several fields and sorted by end time, and when timestamps in a group fall within 20 seconds of each other, only the first record is kept. It also covers an edge case raised in a follow-up question and the resulting row counts.

Python Crawling & Data Mining

Jan 2, 2024

1. Introduction

Hello, I am PiPi. In a Python community I was asked to solve a practical automation problem. The input is a table with the columns 编号 (ID), 环节 (stage), 审核人 (reviewer), 金额 (amount), and 结束时间 (end time). The task: group the rows by the first four columns, sort each group by 结束时间 in ascending order, and, whenever records fall within 20 seconds of each other, keep only the first one.
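The most literal reading of this task compares each row with the row immediately before it in the same group. The following is a minimal sketch of that reading using only the standard library. It is not necessarily the script used in the original discussion; the sample rows, the timestamp format, and the function name `drop_close_consecutive` are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from itertools import groupby

KEY_COLUMNS = ("编号", "环节", "审核人", "金额")
TIME_COLUMN = "结束时间"
WINDOW = timedelta(seconds=20)


def parse_time(value):
    # Assumed timestamp format; adjust to match the real data.
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


def group_key(row):
    return tuple(row[col] for col in KEY_COLUMNS)


def drop_close_consecutive(rows):
    """Keep a row only if it is more than WINDOW after the previous row in its group."""
    # Sort by group key first, then by end time, so groupby sees contiguous groups.
    ordered = sorted(rows, key=lambda r: (group_key(r), parse_time(r[TIME_COLUMN])))
    kept = []
    for _, group in groupby(ordered, key=group_key):
        previous_time = None
        for row in group:
            current_time = parse_time(row[TIME_COLUMN])
            if previous_time is None or current_time - previous_time > WINDOW:
                kept.append(row)
            # The reference point always moves to the previous row, kept or not.
            previous_time = current_time
    return kept


# Made-up sample rows, not taken from the original dataset.
rows = [
    {"编号": "A001", "环节": "review", "审核人": "Zhang", "金额": 100, "结束时间": "2023-11-27 15:50:05"},
    {"编号": "A001", "环节": "review", "审核人": "Zhang", "金额": 100, "结束时间": "2023-11-27 15:50:00"},
    {"编号": "A001", "环节": "review", "审核人": "Zhang", "金额": 100, "结束时间": "2023-11-27 15:51:00"},
    {"编号": "A002", "环节": "review", "审核人": "Li", "金额": 250, "结束时间": "2023-11-27 15:50:10"},
]

result = drop_close_consecutive(rows)
for row in result:
    print(row["编号"], row[TIME_COLUMN])
```

In this sketch the 15:50:05 row is dropped because it follows 15:50:00 by only 5 seconds, while the 15:51:00 row and the single A002 row are kept.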

2. Implementation

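A user raised a follow‑up question about a group that contains three timestamps (2023‑11‑27 15:50:00, 15:50:05, and 15:50:25). The expected output keeps the first and third timestamps, so that any two kept timestamps in the same group differ by more than 20 seconds. The second timestamp is dropped because it is only 5 seconds after the first. The third is exactly 20 seconds after the second but 25 seconds after the first, so it survives only if each row is compared with the last kept record rather than with the row immediately before it.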
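The difference between the two comparison rules can be seen on these three timestamps. The sketch below is an illustration under stated assumptions, not the original script: `keep_by_previous_row` and `keep_by_last_kept` are hypothetical helper names, and both work on the timestamps of a single group.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=20)


def keep_by_previous_row(timestamps, window=WINDOW):
    """Compare each timestamp with the one immediately before it."""
    kept = []
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > window:
            kept.append(ts)
        previous = ts
    return kept


def keep_by_last_kept(timestamps, window=WINDOW):
    """Compare each timestamp with the most recently kept one."""
    kept = []
    for ts in sorted(timestamps):
        if not kept or ts - kept[-1] > window:
            kept.append(ts)
    return kept


group = [
    datetime(2023, 11, 27, 15, 50, 0),
    datetime(2023, 11, 27, 15, 50, 5),
    datetime(2023, 11, 27, 15, 50, 25),
]

by_previous = [t.strftime("%H:%M:%S") for t in keep_by_previous_row(group)]
by_last_kept = [t.strftime("%H:%M:%S") for t in keep_by_last_kept(group)]

print(by_previous)   # ['15:50:00']
print(by_last_kept)  # ['15:50:00', '15:50:25']
```

Because the input is sorted and every kept timestamp is more than 20 seconds after the previous kept one, the second rule guarantees that any two kept timestamps in a group are more than 20 seconds apart, which matches the expected output described above.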

To test this, three additional rows were added to the sample data and the processing script was rerun. The output contained 3,395 rows, one more than the original 3,394, which is consistent with the edge case being handled as expected.

[Screenshots: the original result with 3,394 rows, and the adjusted result handling the special case.]

3. Conclusion

This article presented a real‑world Python automation problem, explained the grouping and time‑difference logic, and provided a working implementation that successfully resolved the issue.

Thanks to the community members who contributed ideas and feedback. A reminder: when posting large datasets, anonymize sensitive data, include a minimal reproducible example, and attach error screenshots. For extensive code, share a .py file.


