Build a Low‑Cost, High‑Performance Game Player Profiling Platform with Alibaba Cloud EMR StarRocks
This tutorial walks you through using Alibaba Cloud EMR Serverless StarRocks and Apache Paimon to create a cost‑effective, high‑performance game player profiling and behavior analysis platform, covering data import, materialized view creation, DWD/ADS layer construction, and lakehouse integration.
Introduction
This article presents a complete workflow for building a game player profiling and behavior analysis platform using Alibaba Cloud EMR Serverless StarRocks, a fully managed serverless StarRocks service, and Apache Paimon, a new lakehouse storage technology.
Prerequisites
Prepare a VPC and switch, access the EMR‑StarRocks console, and ensure you have a free trial instance if needed.
Step 1: Create StarRocks Tables
Create the ODS layer tables ods_user_profile and ods_user_event to receive data from OSS.
--- ********************************************************************** ---
--- 初始化:创建StarRocks表,用于接收导入的OSS数据。
--- ********************************************************************** ---
-- 创建用户画像(user_profile) 与 用户行为表(user_event)
CREATE DATABASE IF NOT EXISTS game_db;
use game_db;
-- 用户信息表
CREATE TABLE IF NOT EXISTS ods_user_profile (
user_id INT NOT NULL,
registration_date DATE NOT NULL,
last_login_date DATE,
age_group VARCHAR(20),
gender VARCHAR(10),
location VARCHAR(50),
game_hours INT,
favorite_game_mode VARCHAR(20),
play_frequency VARCHAR(20),
device_type VARCHAR(20),
os_version VARCHAR(20),
current_level INT,
total_deaths INT,
active_time VARCHAR(20),
language_preference VARCHAR(10)
) PRIMARY KEY (user_id) DISTRIBUTED BY HASH(user_id) PROPERTIES ("replication_num" = "1");
-- 用户事件表
CREATE TABLE IF NOT EXISTS ods_user_event (
user_id INT,
event_type STRING,
timestamp datetime,
location STRING,
level INT,
event_details STRING
) DISTRIBUTED BY HASH(user_id) PROPERTIES ("replication_num" = "1");
truncate table game_db.ods_user_profile;
truncate table game_db.ods_user_event;Step 2: Load Data from OSS
Use Broker Load to import Parquet files from OSS into the ODS tables. Replace ${REGION} with your region (e.g., cn-hangzhou).
--- ********************************************************************** ---
--- 使用Broker Load 将OSS数据导入数据到StarRocks表中
--- ********************************************************************** ---
use game_db;
LOAD LABEL game_db.user_profile_20240902_22 (
DATA INFILE("oss://emr-starrocks-benchmark-resource-${REGION}/sr_game_demo/user_profile/*")
INTO TABLE ods_user_profile
FORMAT AS "parquet"
) WITH BROKER ("fs.oss.endpoint" = "oss-${REGION}-internal.aliyuncs.com") PROPERTIES ("timeout" = "3600");
LOAD LABEL game_db.user_event_20240902_22 (
DATA INFILE("oss://emr-starrocks-benchmark-resource-${REGION}/sr_game_demo/user_event/*")
INTO TABLE ods_user_event
FORMAT AS "parquet"
) WITH BROKER ("fs.oss.endpoint" = "oss-${REGION}-internal.aliyuncs.com") PROPERTIES ("timeout" = "3600");Step 3: Verify Data Import
use game_db;
select * from (
select count(1), 'ods_user_event' as tb from ods_user_event
union all
select count(1), 'ods_user_profile' as tb from ods_user_profile
) t;Step 4: Build DWD Layer with Materialized Views
Create materialized views that refresh every hour to transform ODS data into the DWD layer.
--- ********************************************************************** ---
--- 使用StarRocks物化视图,自动化构建数据仓库DWD层
--- ********************************************************************** ---
use game_db;
DROP MATERIALIZED VIEW IF EXISTS dwd_mv_user_profile;
CREATE MATERIALIZED VIEW IF NOT EXISTS dwd_mv_user_profile
DISTRIBUTED BY RANDOM
REFRESH ASYNC EVERY (INTERVAL 1 HOUR)
AS SELECT * FROM ods_user_profile;
DROP MATERIALIZED VIEW IF EXISTS dwd_mv_user_event;
CREATE MATERIALIZED VIEW IF NOT EXISTS dwd_mv_user_event
DISTRIBUTED BY RANDOM
REFRESH ASYNC EVERY (INTERVAL 1 HOUR)
AS SELECT * FROM ods_user_event;Step 5: Build ADS Layer with Materialized Views
Define analytical views for user retention, geographic distribution, device preference, and purchase trends.
--- ********************************************************************** ---
--- 使用StarRocks物化视图,自动化构建数据仓库ADS层
--- ********************************************************************** ---
use game_db;
-- 1. 用户留存率
CREATE MATERIALIZED VIEW IF NOT EXISTS ADS_MV_USER_RETENTION
DISTRIBUTED BY RANDOM
REFRESH ASYNC EVERY (INTERVAL 1 HOUR)
AS SELECT DATE_TRUNC('day', registration_date) AS registration_day,
DATE_TRUNC('day', last_login_date) AS last_login_day,
COUNT(DISTINCT user_id) AS users_retained
FROM dwd_mv_user_profile
GROUP BY DATE_TRUNC('day', registration_date), DATE_TRUNC('day', last_login_date);
-- 2. 用户地理分布
CREATE MATERIALIZED VIEW IF NOT EXISTS ADS_MV_USER_GEOGRAPHIC_DISTRIBUTION
DISTRIBUTED BY RANDOM
REFRESH ASYNC EVERY (INTERVAL 1 HOUR)
AS SELECT location AS geographic_location,
COUNT(DISTINCT user_id) AS total_users
FROM dwd_mv_user_profile
GROUP BY location;
-- 3. 设备使用习惯
CREATE MATERIALIZED VIEW IF NOT EXISTS ADS_MV_USER_DEVICE_PREFERENCE
DISTRIBUTED BY RANDOM
REFRESH ASYNC EVERY (INTERVAL 1 HOUR)
AS SELECT device_type,
COUNT(DISTINCT user_id) AS total_users
FROM dwd_mv_user_profile
GROUP BY device_type;
-- 4. 购买趋势
CREATE MATERIALIZED VIEW IF NOT EXISTS ADS_MV_USER_PURCHASE_TRENDS
DISTRIBUTED BY RANDOM
REFRESH ASYNC EVERY (INTERVAL 1 HOUR)
AS SELECT DATE(timestamp) AS purchase_date,
COUNT(user_id) AS daily_purchase_events
FROM dwd_mv_user_event
WHERE event_type = '购买'
GROUP BY purchase_date
ORDER BY purchase_date;Step 6: Verify ADS Views
use game_db;
select * from (
select count(1), 'ADS_MV_USER_RETENTION' as tb from ADS_MV_USER_RETENTION
union all
select count(1), 'ADS_MV_USER_GEOGRAPHIC_DISTRIBUTION' as tb from ADS_MV_USER_GEOGRAPHIC_DISTRIBUTION
union all
select count(1), 'ADS_MV_USER_DEVICE_PREFERENCE' as tb from ADS_MV_USER_DEVICE_PREFERENCE
union all
select count(1), 'ADS_MV_USER_PURCHASE_TRENDS' as tb from ADS_MV_USER_PURCHASE_TRENDS
) t;Step 7: Create RAM User and Grant Permissions
Set up an Alibaba Cloud RAM account, add the EMR‑StarRocks user role, and configure access credentials.
Step 8: Write ADS Data to Paimon Lakehouse
Create an external catalog in StarRocks that points to a Paimon lakehouse, then insert the ADS purchase trend data.
--- ********************************************************************** ---
--- 写入数据湖中(Paimon格式)- 初始化Catalog信息
--- ********************************************************************** ---
CREATE EXTERNAL CATALOG `myfirstcatalog`
PROPERTIES (
"type" = "paimon",
"paimon.catalog.type" = "dlf-paimon",
"dlf.catalog.instance.id" = "${DLF_CATALOG_ID}"
);
create database if not exists myfirstcatalog.game_db;
CREATE TABLE IF NOT EXISTS myfirstcatalog.game_db.ADS_USER_PURCHASE_TRENDS(
purchase_date DATE COMMENT '购买日期',
daily_purchase_events INT COMMENT '每日购买事件数量'
);
INSERT INTO myfirstcatalog.game_db.ADS_USER_PURCHASE_TRENDS
SELECT * FROM default_catalog.game_db.ADS_MV_USER_PURCHASE_TRENDS;Step 9: Verify Data in Paimon
select count(1), 'myfirstcatalog.game_db.ADS_USER_PURCHASE_TRENDS' as tb
from myfirstcatalog.game_db.ADS_USER_PURCHASE_TRENDS;After completing the experiment, remember to delete the resources to avoid ongoing charges.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
