Structuring and Managing Data Collection Requirements with JSON and Git
By defining data collection (event tracking) requirements in a structured JSON format and storing them in Git with a web interface that abstracts version control, teams can standardize identifiers, validate data formats automatically, track changes via commit logs, and streamline collaboration between product managers, developers, and testers.
Many companies still manage data‑collection ("埋点") requirements using Word, Google Docs, or Excel, which leads to three recurring problems:
Requirement descriptions are ambiguous.
Historical versions become hard to locate.
Data quality suffers from errors or omissions.
If we define data‑collection requirements in a structured way, can we let programs help manage them and even automate format validation?
1. Structured Data‑Collection Requirements
We abstract each requirement into four parts:
1. Common data, such as user_id for identifying a user, defined once and reused.
2. event_id, a unique name for the requirement, similar to a clear variable name in code.
3. A description of the scenario that triggers the event, which is often the most vague part.
4. Event parameters, e.g., the product ID when a user clicks a product detail page.
The description (point 3) is usually the hardest to write clearly. A practical solution is to capture a screenshot of the UI interaction and annotate it with a tool like Skitch.
Once the textual parts are structured, the whole requirement can be represented as JSON, for example:
{
  "event_id": "test_event",
  "current_app_version": "App version in which this requirement was defined",
  "params": [
    {
      "name": "product_id",
      "desc": "Product ID"
    },
    {
      "name": "type_id",
      "desc": "Type ID"
    }
  ],
  "desc": "Detailed description of the event"
}

These JSON files (plus the accompanying screenshots) are then stored in Git. To hide Git’s complexity from product managers, we build a web UI that abstracts the version‑control operations.
2. Managing Requirement Documents with Git and a Visual Interface
Both JSON files and images are committed to a Git repository.
Each app version gets its own branch; a new version clones the previous branch and pushes to a new remote branch, e.g.:
git checkout -b 1.1 1.0 && git push -u origin 1.1 && echo "Version 1.1 requirement branch initialized"

The new branch automatically inherits all previous requirements, and parallel development streams can merge requirements back after release.
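The web UI can drive this branch-per-version workflow programmatically. A sketch of what the backend might run, assuming the service has a local clone of the requirements repository; the function name and branch names are examples:

```python
import subprocess

def init_version_branch(repo_dir: str, base: str, new: str) -> None:
    """Create a requirements branch for a new app version from the previous
    version's branch, mirroring `git checkout -b <new> <base> && git push`."""
    subprocess.run(["git", "checkout", "-b", new, base], cwd=repo_dir, check=True)
    subprocess.run(["git", "push", "-u", "origin", new], cwd=repo_dir, check=True)
```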
The web UI hides these Git commands, allowing PMs to define requirements, run format validation, commit automatically, view commit logs, and share a unique URL for discussion.
This approach lets developers filter requirements by version, and testers retrieve all requirements for a new version to facilitate regression testing.
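Pulling every requirement for a given version then reduces to listing the JSON files recorded on that version's branch. A sketch using `git ls-tree`; the function name and the convention that every requirement is a `.json` file are assumptions:

```python
import subprocess

def requirements_for_version(repo_dir: str, version_branch: str) -> list[str]:
    """List the requirement JSON files recorded on a version branch, so
    testers can retrieve everything that needs regression testing."""
    out = subprocess.run(
        ["git", "ls-tree", "-r", "--name-only", version_branch],
        cwd=repo_dir, capture_output=True, text=True, check=True)
    return [p for p in out.stdout.splitlines() if p.endswith(".json")]
```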
3. Automatic Format Validation Based on Structured Requirements
Because the requirements are structured, a validation tool can automatically check incoming data for:
Incorrect data format.
Parameter mismatches.
Unknown data (e.g., deprecated or typo‑generated events).
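The three checks above fall out directly from the structured definitions. A minimal sketch: the shape of an incoming tracking record (a dict with event_id and a params mapping) is an assumption for illustration, and the spec dicts follow the JSON example earlier:

```python
def check_event(event: dict, specs: dict) -> list[str]:
    """Check one incoming tracking record against the structured requirements.
    `specs` maps event_id -> requirement dict loaded from the JSON files."""
    event_id = event.get("event_id")
    if event_id not in specs:
        # Catches deprecated events and typo-generated event names.
        return [f"unknown event: {event_id!r}"]
    expected = {p["name"] for p in specs[event_id].get("params", [])}
    actual = set(event.get("params", {}))
    problems = []
    if expected - actual:
        problems.append(f"missing params: {sorted(expected - actual)}")
    if actual - expected:
        problems.append(f"unexpected params: {sorted(actual - expected)}")
    return problems
```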
Each test datum receives a unique URL; when a bug is reported, the engineer can attach this URL, enabling developers to reproduce the exact scenario quickly.
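One simple way to mint such a URL is to hash the datum's canonical JSON form, so the same datum always maps to the same address. A sketch; the base URL and truncation length are made-up examples:

```python
import hashlib
import json

def datum_url(datum: dict, base: str = "https://tracker.example.com/datum/") -> str:
    """Derive a stable, shareable URL for one test datum by hashing its
    canonical (sorted-key, whitespace-free) JSON serialization."""
    canonical = json.dumps(datum, sort_keys=True, separators=(",", ":"))
    return base + hashlib.sha1(canonical.encode()).hexdigest()[:12]
```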
Summary
The final system consists of three layers: requirement definition (JSON + screenshots), data collection, and data validation, all managed through Git with a user‑friendly web front‑end.
If you are not thinking about how to keep your data clean from the very beginning, you will pay for it later. I guarantee it.