How Serverless Function Compute Transformed Log Processing for a FinTech Firm
Shuhe Technology replaced a cumbersome Kafka‑to‑ECS/K8s log‑processing pipeline with Alibaba Cloud Function Compute, achieving faster, more elastic, and cost‑effective handling of massive application logs while reducing operational overhead and simplifying maintenance.
Problem Statement
Application logs from multiple financial products are collected by SLS, compressed, and archived to OSS. When a specific app's logs are needed, engineers must download large batches from OSS, decompress them, and filter for the target app's records, which makes retrieval slow and processing expensive.
Initial Container‑Based Design
A self‑built solution was considered: a program pulls data from Kafka, processes it, and runs on a Kubernetes (K8s) cluster managed by an internal release platform. The approach required developers to implement Kafka ingestion, business logic, asynchronous compression, OSS upload, and K8s scaling. It suffered from:
High development effort: all components had to be coded and integrated.
Limited elasticity: K8s scaling took more than 10 seconds, and concurrency was capped by the Kafka partition count.
Poor resource utilization: at least one pod had to stay alive during off-peak periods.
Ongoing maintenance: any change to logic, release pipeline, or scaling rules required manual updates.
Serverless Solution with Alibaba Cloud Function Compute (FC)
FC provides an event‑driven, fully managed compute environment with built‑in Kafka and timer triggers. By moving the pipeline to FC, developers focus only on business logic while the platform handles provisioning, scaling, logging, monitoring, and alerts.
Architecture Overview
The pipeline consists of three functions linked by triggers:
Data Splitting Function – Triggered by Kafka. FC batches incoming messages (up to 16 MiB per synchronous batch), groups them by the application identifier, and writes each group to a separate file in a NAS directory. This step is I/O‑intensive.
Data Compression Function – Invoked after splitting. It compresses the newly written batch, appends the compressed data to the corresponding NAS file, and uses a Redis‑based distributed lock to serialize concurrent writes to the same file. This step is compute‑intensive.
Data Upload Function – Triggered by a timer (e.g., every hour). It uploads the compressed files from NAS to the appropriate OSS path and then deletes the local copies. This step is I/O‑intensive.
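The splitting step described above can be sketched in a few lines. This is a minimal illustration, not the team's code: it assumes each Kafka record is a JSON string carrying an app_id field, and the function name split_by_app and the nas_dir parameter are hypothetical.

```python
import json
import os
from collections import defaultdict


def split_by_app(records, nas_dir):
    """Group raw JSON log lines by app_id and append each group to its own file.

    `records` is assumed to be an iterable of JSON strings that each contain
    an "app_id" field; the real Kafka trigger payload may look different.
    """
    groups = defaultdict(list)
    for raw in records:
        app_id = json.loads(raw)["app_id"]
        groups[app_id].append(raw)

    os.makedirs(nas_dir, exist_ok=True)
    for app_id, lines in groups.items():
        path = os.path.join(nas_dir, f"{app_id}.log")
        # One append per app per batch keeps the number of NAS writes low.
        with open(path, "a", encoding="utf-8") as f:
            f.write("\n".join(lines) + "\n")

    # Return per-app record counts so the caller can log or monitor them.
    return {app_id: len(lines) for app_id, lines in groups.items()}
```

Grouping in memory before writing means each batch touches every target file once, which suits an I/O-bound step where the cost is dominated by file operations rather than parsing.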
Function Logic Details
Data Splitting Function: Receives a Kafka batch, parses each record, extracts the app_id field, and writes the record to /nas/app_id.log. The function runs with a modest memory profile (e.g., 256 MiB) because the workload is dominated by file I/O.
Data Compression Function: Reads the newly created app_id.log, compresses it with gzip, and appends the result to /nas/app_id.gz. Appending works because a gzip file may consist of several concatenated members, which decompress as one continuous stream. Before writing, the function acquires a Redis lock keyed by app_id (at most ten concurrent writers per app) to avoid race conditions.
Data Upload Function: Scans the NAS directory for *.gz files, uploads each to oss://bucket/logs/app_id/, and removes the local file. The function is scheduled with a cron expression (e.g., 0 */1 * * *) to run hourly.
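The compression and upload steps can be sketched as follows, using only the Python standard library. The function names are hypothetical, and upload is a placeholder callable standing in for whatever OSS client the real function uses; locking and cleanup of the .log file are deliberately left out.

```python
import glob
import gzip
import os


def append_compressed(log_path, gz_path):
    """Compress log_path as one gzip member and append it to gz_path."""
    with open(log_path, "rb") as src:
        member = gzip.compress(src.read())
    # Concatenated gzip members form a valid gzip file, so a plain
    # binary append is enough; no need to rewrite the existing archive.
    with open(gz_path, "ab") as dst:
        dst.write(member)


def upload_and_clean(nas_dir, key_prefix, upload):
    """Upload every *.gz file under nas_dir, then delete the local copy.

    `upload(local_path, object_key)` is a hypothetical callable; it is
    assumed to raise on failure so the local file is kept for a retry.
    """
    uploaded = []
    for path in sorted(glob.glob(os.path.join(nas_dir, "*.gz"))):
        name = os.path.basename(path)
        app_id = os.path.splitext(name)[0]
        key = f"{key_prefix}/{app_id}/{name}"
        upload(path, key)
        os.remove(path)  # only reached when the upload did not raise
        uploaded.append(key)
    return uploaded
```

Deleting the local file only after the upload call returns means a failed run leaves the data on NAS, where the next timer invocation can pick it up again.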
Implementation Challenges
Kafka batch size limits: Synchronous triggers allow a maximum batch of 16 MiB, while asynchronous triggers are limited to 128 KiB. To achieve efficient batching, multiple synchronous tasks were configured, which introduced some idle capacity.
Missing native file locks: NAS and OSS do not provide file-level locking. Without coordination, concurrent writes to the same app_id file could overwrite each other. The solution uses Redis to implement a distributed lock, limiting each app to ten concurrent writers. This added modest code complexity but ensured data integrity.
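A Redis lock of this kind is commonly built on SET with the NX and PX options. The sketch below shows that general pattern, not the team's implementation: it models a single lock per app_id rather than the ten-writer cap, the key format lock:<app_id> is an assumption, and InMemoryRedis is a stand-in that mimics only the redis-py calls used here (set with nx/px, get, delete) so the example runs without a server.

```python
import time
import uuid


class InMemoryRedis:
    """Stand-in for a redis-py client, for illustration and tests only."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, px=None):
        now = time.monotonic()
        current = self._data.get(name)
        if current is not None and current[1] <= now:
            del self._data[name]  # expired entry behaves as absent
            current = None
        if nx and current is not None:
            return None  # redis-py returns None when NX fails
        expires = now + px / 1000 if px else float("inf")
        self._data[name] = (value, expires)
        return True

    def get(self, name):
        entry = self._data.get(name)
        if entry is None or entry[1] <= time.monotonic():
            return None
        return entry[0]

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


def acquire(client, app_id, ttl_ms=30_000, wait_s=5.0, poll_s=0.05):
    """Try to take the lock for app_id; return a token, or None on timeout."""
    key = f"lock:{app_id}"  # hypothetical key format
    token = uuid.uuid4().hex
    deadline = time.monotonic() + wait_s
    while True:
        # NX: only set if absent. PX: expire so a crashed holder cannot block forever.
        if client.set(key, token, nx=True, px=ttl_ms):
            return token
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)


def release(client, app_id, token):
    """Release the lock only if this caller still holds it."""
    key = f"lock:{app_id}"
    held = client.get(key)
    # A real client may return bytes, so compare against both forms.
    # This get-then-delete pair is not atomic; a production version
    # would typically do the comparison and delete in one Lua script.
    if held is not None and held in (token, token.encode()):
        client.delete(key)
        return True
    return False
```

The random token prevents one function instance from releasing a lock that has already expired and been taken by another, which matters when the compression step can outlast the lock's time-to-live.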
Results
Since the October rollout, the serverless pipeline has:
Reduced operational overhead to near‑zero (no dedicated ECS or K8s instances).
Enabled rapid gray‑scale upgrades via FC’s multi‑version support.
Achieved lower cost thanks to Alibaba Cloud’s 2023 tiered pricing model for Function Compute.
Future Work
Planned improvements include:
Requesting larger asynchronous batch limits to reduce the number of synchronous tasks.
Advocating for native file‑lock support in NAS/OSS to simplify concurrency handling.
Migrating additional suitable workloads to FC to further consolidate serverless operations.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Alibaba Cloud Native
