How Serverless Architecture Supercharges Game Data Collection and Scaling

This article explains how to build a highly scalable, cost‑effective game data collection pipeline using Serverless function compute, Kafka, and cloud services, covering architecture design, function implementation, deployment with Fun, Kafka configuration, and performance testing to handle massive traffic spikes.

Alibaba Cloud Native

Feb 8, 2021

Game publishing in China generates billions of yuan in revenue, and the industry relies heavily on user acquisition ("buying traffic") to attract players. Massive, irregular traffic spikes during marketing campaigns put extreme pressure on data collection systems, requiring a solution that can automatically scale without manual provisioning.

Traditional Data Collection Architecture

The classic architecture exposes an HTTP endpoint that writes incoming data directly to a database. It suffers from two main challenges during traffic bursts:

Inability to quickly scale out when a traffic pulse arrives, leading to data loss.

Resource waste when the anticipated traffic does not materialize, because fixed nodes remain idle.

These problems force operations teams to guess capacity ahead of time, often resulting in either overload or under‑utilization.

Serverless Solution Overview

By replacing the HTTP backend with Alibaba Cloud Function Compute (FC), we can leverage its millisecond‑level elasticity. The architecture consists of two functions:

receiveData – an HTTP‑triggered function that only receives the raw payload.

dataToKafka – a processing function that forwards the payload to a Kafka topic and stores it in RDS.

FC automatically provisions instances on demand, eliminating the need for pre‑allocated servers and reducing cost to zero when no traffic is present.

Function 1: Receive Data

# -*- coding: utf-8 -*-
import logging
import json
import urllib.parse

HELLO_WORLD = b'Hello world!\n'


def handler(environ, start_response):
    """WSGI-style entry point for the HTTP trigger."""
    logger = logging.getLogger()
    # CONTENT_LENGTH may be missing or empty; treat both as a zero-length body.
    request_body_size = int(environ.get('CONTENT_LENGTH') or 0)
    request_body = environ['wsgi.input'].read(request_body_size)
    # The body arrives as URL-encoded JSON in GBK bytes.
    request_body_str = urllib.parse.unquote(request_body.decode('GBK'))
    request_body_obj = json.loads(request_body_str)
    # Log two fields so the request can be verified in the console logs.
    logger.info(request_body_obj["action"])
    logger.info(request_body_obj["articleAuthorId"])
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]

This function is configured with:

Function type: HTTP trigger

Name: receiveData

Runtime: Python 3

Memory: 512 MB, Timeout: 60 s, Concurrency: 1

Anonymous authentication, supporting GET/POST

It simply logs the incoming fields and returns a static response, allowing rapid verification via the console logs and request tracing.
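
Before deploying, the same handler logic can be exercised locally by calling it with a hand-built WSGI environ. The sketch below is only an illustration: make_environ and the fake start_response are hypothetical test helpers, the sample field values are invented, and the handler is a trimmed copy of receiveData rather than the deployed code.

```python
import io
import json
import logging
import urllib.parse


def handler(environ, start_response):
    """Trimmed copy of receiveData: parse the body, log a field, reply."""
    logger = logging.getLogger()
    size = int(environ.get('CONTENT_LENGTH') or 0)
    body = environ['wsgi.input'].read(size)
    obj = json.loads(urllib.parse.unquote(body.decode('GBK')))
    logger.info(obj["action"])
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello world!\n']


def make_environ(payload):
    """Build the minimal WSGI environ the handler reads (hypothetical helper)."""
    raw = urllib.parse.quote(json.dumps(payload)).encode('GBK')
    return {
        'REQUEST_METHOD': 'POST',
        'CONTENT_LENGTH': str(len(raw)),
        'wsgi.input': io.BytesIO(raw),
    }


captured = {}


def start_response(status, headers):
    """Stand-in for the server callback; records what the handler sent."""
    captured['status'] = status
    captured['headers'] = headers


# Sample payload with invented values for the two logged fields.
body = handler(
    make_environ({"action": "login", "articleAuthorId": "42"}),
    start_response,
)
print(captured['status'], body)
```

A local check like this only covers parsing and the response; trigger configuration and authentication still need to be verified in the console.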

Function 2: Process Data and Send to Kafka

# -*- coding: utf-8 -*-
import logging
import json
from kafka import KafkaProducer

# Created once per instance by the initializer and reused across invocations.
producer = None


def my_initializer(context):
    logger = logging.getLogger()
    logger.info("init kafka producer")
    global producer
    producer = KafkaProducer(bootstrap_servers='XX.XX.XX.XX:9092,XX.XX.XX.XX:9092,XX.XX.XX.XX:9092')


def handler(event, context):
    logger = logging.getLogger()
    # receiveData passes json.dumps() of a JSON string, so decode twice:
    # first to recover the JSON text, then to get the object.
    event_str = json.loads(event)
    event_obj = json.loads(event_str)
    logger.info(event_obj["action"])
    logger.info(event_obj["articleAuthorId"])
    producer.send('ikf-demo', json.dumps(event_str).encode('utf-8'))
    # Flush instead of close: closing here would break the shared producer
    # for every later invocation on the same instance.
    producer.flush()
    return 'hello world'

The initializer creates a persistent Kafka producer so that subsequent invocations reuse the connection, avoiding repeated setup overhead.
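
The reuse pattern can be illustrated without a Kafka cluster. In this sketch FakeProducer is a hypothetical stand-in (it is not part of kafka-python) that only counts how many times it is constructed; the point is that the initializer runs once while the handler runs many times, and that the handler flushes rather than closes the shared object.

```python
class FakeProducer:
    """Hypothetical stand-in for a Kafka producer; counts constructions."""
    instances = 0

    def __init__(self):
        FakeProducer.instances += 1
        self.pending = []
        self.sent = []

    def send(self, topic, value):
        # Buffer the message, as a real producer would before a flush.
        self.pending.append((topic, value))

    def flush(self):
        # Deliver everything buffered so far but keep the object usable.
        self.sent.extend(self.pending)
        self.pending.clear()


producer = None


def my_initializer(context):
    """Runs once per function instance."""
    global producer
    producer = FakeProducer()


def handler(event, context):
    """Runs on every invocation and reuses the shared producer."""
    producer.send('ikf-demo', event)
    producer.flush()
    return 'ok'


my_initializer(None)
for i in range(3):
    handler(b'{"n": %d}' % i, None)

print(FakeProducer.instances, len(producer.sent))
```

Closing the producer inside the handler would defeat this pattern, because the next invocation on the same instance would find a closed connection.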

Deploying with Fun (Funcraft)

Fun is a CLI tool that packages functions, uploads code, and creates the required resources from a template.yml file. The deployment steps are:

Install Fun via npm: sudo npm install @alicloud/fun -g

Run fun config to set Account ID, Access Key, and default region.

Create template.yml describing the service, VPC, LogConfig, and the two functions (including the initializer and handler entries).

Install third‑party dependencies (e.g., kafka‑python) with

fun install --runtime python3 --package-type pip kafka-python

Place the Python source files (index.py) alongside the generated .fun directory.

Deploy with fun deploy, which creates the service, functions, and uploads the code.

Connecting the Two Functions

# -*- coding: utf-8 -*-
import logging
import json
import urllib.parse
import fc2

HELLO_WORLD = b'Hello world!\n'

# Created once per instance by the initializer and reused across requests.
client = None


def my_initializer(context):
    logger = logging.getLogger()
    logger.info("init fc client")
    global client
    client = fc2.Client(
        endpoint="http://your_account_id.cn-hangzhou-internal.fc.aliyuncs.com",
        accessKeyID="your_ak",
        accessKeySecret="your_sk"
    )


def handler(environ, start_response):
    logger = logging.getLogger()
    # CONTENT_LENGTH may be missing or empty; treat both as a zero-length body.
    request_body_size = int(environ.get('CONTENT_LENGTH') or 0)
    request_body = environ['wsgi.input'].read(request_body_size)
    request_body_str = urllib.parse.unquote(request_body.decode('GBK'))
    request_body_obj = json.loads(request_body_str)
    logger.info(request_body_obj["action"])
    logger.info(request_body_obj["articleAuthorId"])
    # Hand the payload to dataToKafka asynchronously so this request
    # returns without waiting for the Kafka write.
    client.invoke_function(
        'FCBigDataDemo',
        'dataToKafka',
        payload=json.dumps(request_body_str),
        headers={'x-fc-invocation-type': 'Async'}
    )
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]

The first function now calls the second one asynchronously via the FC SDK, passing the original payload. This decouples ingestion from processing, allowing each stage to scale independently.
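
One detail of this hand-off is easy to miss: receiveData passes json.dumps() of a string that already contains JSON, which is why dataToKafka calls json.loads() twice. The round trip can be traced with the standard library alone. The sample payload is invented, and delivering the event as bytes is an assumption about the runtime.

```python
import json
import urllib.parse

# What the client sends: URL-encoded JSON in GBK bytes (sample values invented).
raw_body = urllib.parse.quote(
    '{"action": "login", "articleAuthorId": "42"}'
).encode('GBK')

# receiveData side: recover the JSON text, then wrap it again for the invoke call.
request_body_str = urllib.parse.unquote(raw_body.decode('GBK'))
payload = json.dumps(request_body_str)

# dataToKafka side: assumption is that the event arrives as bytes.
event = payload.encode('utf-8')
event_str = json.loads(event)      # first decode yields the JSON text
event_obj = json.loads(event_str)  # second decode yields the dict

print(type(event_str).__name__, event_obj["action"])
```

Passing request_body_str directly as the payload would make a single json.loads() sufficient on the receiving side; the double encoding is kept here only because both functions in the article agree on it.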

Kafka Configuration

Using Alibaba Cloud Managed Kafka removes the operational burden of maintaining a cluster. After creating a Kafka instance, obtain the VPC‑internal or SSL endpoint and configure it in the producer code. Then create a topic (e.g., ikf-demo) and optionally adjust partition count up to 360 for higher throughput.
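
As a rough sketch of how the endpoint choice could feed into the producer, the helper below builds keyword arguments for kafka-python's KafkaProducer. The function name producer_config, the acks and retries values, and the use of SASL_SSL with the PLAIN mechanism for the SSL endpoint are assumptions for illustration; check the instance details page for the settings your Kafka instance actually requires.

```python
def producer_config(bootstrap_servers, username=None, password=None, ca_file=None):
    """Build KafkaProducer keyword arguments (hypothetical helper).

    With no credentials this targets a VPC-internal endpoint; with
    credentials it adds the SASL/SSL settings assumed for the SSL endpoint.
    """
    config = {
        'bootstrap_servers': bootstrap_servers.split(','),
        'acks': 1,      # assumed: wait for the partition leader only
        'retries': 3,   # assumed: retry transient send failures
    }
    if username is not None:
        config.update(
            security_protocol='SASL_SSL',
            sasl_mechanism='PLAIN',
            sasl_plain_username=username,
            sasl_plain_password=password,
            ssl_cafile=ca_file,
        )
    return config


vpc_cfg = producer_config('XX.XX.XX.XX:9092,XX.XX.XX.XX:9092')
ssl_cfg = producer_config('XX.XX.XX.XX:9093', 'user', 'secret', 'ca-cert.pem')
print(sorted(ssl_cfg))

# In the function's initializer this would be used as:
#   producer = KafkaProducer(**vpc_cfg)
```

Keeping the settings in one place makes it easier to switch between the internal and SSL endpoints without touching the handler.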

Performance Testing with PTS

Alibaba Cloud PTS is used to simulate up to 2,500 concurrent users sending HTTP requests to the receiveData endpoint. The test configuration includes:

Concurrency mode (users) and RPS mode (requests per second)

Gradual ramp‑up percentages

Step duration and total test time

Results show a peak of 20,000 TPS and over 5.49 million requests, with a 99.99% success rate. The occasional failures were due to client‑side timeouts, not server overload.
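
To put these figures in perspective, a back-of-the-envelope calculation helps. It treats 5.49 million as the request total (the article says "over"), and it assumes the 20,000 TPS peak coincided with all 2,500 simulated users being busy with no think time, so that Little's law (concurrency = throughput × response time) applies. Both are assumptions, not reported measurements.

```python
total_requests = 5_490_000   # lower bound: the article says "over 5.49 million"
success_rate = 0.9999
peak_tps = 20_000
concurrent_users = 2_500     # assumed to be fully busy at the peak

# Requests that did not succeed at a 99.99% success rate.
failed_requests = round(total_requests * (1 - success_rate))

# Little's law: average response time = concurrency / throughput.
avg_response_s = concurrent_users / peak_tps

print(failed_requests, avg_response_s)
```

Under these assumptions, roughly 549 requests failed and the implied average response time at peak is 0.125 s.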

Conclusion

The Serverless‑based pipeline provides automatic elasticity, zero‑idle cost, and simple deployment for high‑volume game data collection. While the example focuses on the gaming industry, the same pattern applies to any big‑data ingestion scenario, and many customers have already adopted it in production.
