Overview of the Government Procurement Cloud Self-Service Data Extraction Platform
This article introduces the self‑service data extraction platform developed by the Government Procurement Cloud. It covers the platform's architecture, its four core modules (self‑service extraction, data push, resource management, and operation audit), permission controls, performance optimizations, and future development plans.
The platform was built over roughly a year to address four problems in data extraction: long extraction processes, high costs, low efficiency, and poorly controlled data security. It now covers most internal data‑retrieval scenarios.
It consists of four major functional modules: self‑service extraction, data push, resource management, and operation audit. The underlying data comes from Hive tables, and queries run on either Presto or StarRocks.
Self‑service extraction includes three sub‑features: (1) Dataset – users view Hive tables they have permission to access; (2) Visual extraction – non‑SQL users can drag‑and‑drop single or multiple tables to query, export, and download data; (3) SQL extraction – advanced users write SQL directly to query, export, and download results.
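As a rough illustration of how visual extraction could work, the sketch below turns a drag-and-drop selection into a SQL statement. The article does not describe the real implementation, so `VisualQuery`, `to_sql`, the table and column names, and the choice of ANSI double-quoted identifiers are all assumptions made for this example; the 500-row default mirrors the preview limit the article mentions.

```python
from dataclasses import dataclass, field

PREVIEW_LIMIT = 500  # preview row cap quoted in the article; constant name is hypothetical
ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


@dataclass
class VisualQuery:
    """Hypothetical description of what a user assembled in the UI."""
    table: str
    columns: list[str]
    # Each filter is (column, operator, literal value).
    filters: list[tuple[str, str, str]] = field(default_factory=list)


def quote_ident(name: str) -> str:
    # ANSI-style identifier quoting; a real engine may need a different style.
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    # Escape single quotes so user input cannot break out of the literal.
    return "'" + value.replace("'", "''") + "'"


def to_sql(query: VisualQuery, limit: int = PREVIEW_LIMIT) -> str:
    """Build a single-table SELECT from the visual selection."""
    columns = ", ".join(quote_ident(c) for c in query.columns)
    sql = f"SELECT {columns} FROM {quote_ident(query.table)}"
    clauses = []
    for column, operator, literal in query.filters:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        clauses.append(f"{quote_ident(column)} {operator} {quote_literal(literal)}")
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return f"{sql} LIMIT {limit}"


selection = VisualQuery("orders", ["id", "amount"], [("region", "=", "Zhejiang")])
print(to_sql(selection))
```

Generating SQL from a structured selection, rather than concatenating raw user text, keeps non-SQL users on the same query path as SQL extraction while restricting what they can express.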
Permission management provides functional permissions (users request access to a module) and data permissions at the table level and the field level. A user who attempts unauthorized access is prompted to request permission, and sensitive fields are returned encrypted unless field‑level permission has been granted.
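The two levels of data permission can be sketched as follows. This is a minimal illustration, not the platform's code: `Grants`, `SENSITIVE_FIELDS`, `PermissionRequired`, and `read_rows` are hypothetical names, and a truncated SHA-256 digest stands in for whatever encryption the platform actually applies to sensitive fields.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Grants:
    """Hypothetical record of one user's data permissions."""
    tables: set[str] = field(default_factory=set)
    fields: set[tuple[str, str]] = field(default_factory=set)  # (table, column)


# Assumed registry of sensitive columns; the real list is not given in the article.
SENSITIVE_FIELDS = {("suppliers", "phone")}


class PermissionRequired(Exception):
    """Raised so the caller can prompt the user to request access."""


def protect(value: str) -> str:
    # Stand-in for encryption: a one-way digest that hides the raw value.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def read_rows(table: str, rows: list[dict], grants: Grants) -> list[dict]:
    # Table-level check: no grant means no data at all.
    if table not in grants.tables:
        raise PermissionRequired(f"request access to table {table}")
    result = []
    for row in rows:
        visible = {}
        for column, value in row.items():
            key = (table, column)
            # Field-level check: sensitive columns are protected unless granted.
            if key in SENSITIVE_FIELDS and key not in grants.fields:
                visible[column] = protect(str(value))
            else:
                visible[column] = value
        result.append(visible)
    return result


rows = [{"name": "Acme", "phone": "13800000000"}]
print(read_rows("suppliers", rows, Grants(tables={"suppliers"})))
```

With only the table grant, the `phone` value comes back protected; adding `("suppliers", "phone")` to `fields` returns it in the clear.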
Operation audit records all queries, exports, downloads, and pushes, ensuring compliance, security, and traceability, and providing data for troubleshooting and performance tuning.
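One way to guarantee that every operation leaves an audit trail is to route it through a single wrapper. The sketch below assumes a hypothetical `AuditEvent` record and `run_audited` helper; the fields shown (user, action, target, duration, outcome) are guesses at what would serve the compliance and performance-tuning uses described above, not the platform's actual schema.

```python
import time
from dataclasses import dataclass

# The four audited operation types named in the article.
AUDITED_ACTIONS = {"query", "export", "download", "push"}


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record; field names are assumptions."""
    user: str
    action: str
    target: str
    duration_ms: float
    succeeded: bool


def run_audited(log: list, user: str, action: str, target: str, operation):
    """Run operation() and append exactly one AuditEvent, even if it fails."""
    if action not in AUDITED_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    started = time.perf_counter()
    succeeded = False
    try:
        result = operation()
        succeeded = True
        return result
    finally:
        # The finally block runs on success and on exception alike.
        elapsed_ms = (time.perf_counter() - started) * 1000.0
        log.append(AuditEvent(user, action, target, elapsed_ms, succeeded))


audit_log: list[AuditEvent] = []
run_audited(audit_log, "alice", "query", "orders", lambda: "42 rows")
print(audit_log[0].action, audit_log[0].succeeded)
```

Recording the duration alongside the outcome is what lets the same log serve both traceability and performance tuning.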
Resource management lists exported files for repeated download and also allows users to upload files for subsequent push operations.
Data push enables files from the resource list to be sent to third‑party platforms, currently integrated with a cloud platform and an intelligent outbound‑call system.
Performance and usability optimizations include running the total‑count query and the data query in parallel, caching query results for three hours, folder management for large numbers of extraction tasks, and quick date formatting in visual extraction. The platform also enforces limits: 500 rows for preview and 1,000,000 rows for export, a 6‑minute query timeout, and 25 GB of session memory.
Future plans are to use operation‑audit data to recommend hot tables, expose export and download as APIs, support scheduled extraction tasks, and integrate with visual dashboard builders, further extending the platform's capabilities.
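Two of these optimizations, the parallel count and data queries and the three-hour result cache, can be sketched together. This is an illustration under stated assumptions: `TTLCache`, `preview`, and the `run_sql` callable are hypothetical, the cache is a plain in-process dictionary, and the query timeout and session memory limits are left to the query engine rather than modeled here.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CACHE_TTL_SECONDS = 3 * 60 * 60  # three-hour cache lifetime from the article
PREVIEW_ROWS = 500               # preview row limit from the article


class TTLCache:
    """Minimal time-based cache; the clock is injectable for testing."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}

    def get(self, key):
        hit = self._items.get(key)
        if hit is None:
            return None
        stored_at, value = hit
        if self._clock() - stored_at >= self._ttl:
            del self._items[key]  # expired entry
            return None
        return value

    def put(self, key, value):
        self._items[key] = (self._clock(), value)


def preview(sql, run_sql, cache):
    """Return the total count and preview rows, using the cache when possible.

    run_sql is an assumed callable that sends SQL to the engine and returns rows.
    """
    cached = cache.get(sql)
    if cached is not None:
        return cached
    count_sql = f"SELECT COUNT(*) FROM ({sql}) AS t"
    data_sql = f"SELECT * FROM ({sql}) AS t LIMIT {PREVIEW_ROWS}"
    # Submit both statements at once so neither waits for the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        count_future = pool.submit(run_sql, count_sql)
        data_future = pool.submit(run_sql, data_sql)
        total = count_future.result()[0][0]
        rows = data_future.result()
    result = {"total": total, "rows": rows}
    cache.put(sql, result)
    return result
```

Because the count and the data page are independent, the user waits for the slower of the two rather than their sum, and a repeated query within three hours never reaches the engine.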
ZCY Technology (政采云技术)
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.