
Building a GPU Scoring System for Mobile Game Performance

This article explains how a performance optimization team designed a hardware scoring framework for mobile games under a tight project timeline. It defines the key terms, describes the background and methodology of hardware scoring, analyzes the major GPU families, and walks through creating and applying scoring rules step by step.

ByteDance SE Lab

Parameter Explanation

When the performance optimization team joined Project X, they had only about one month to analyze and tune performance. Under tight deadlines, they prioritized two focus areas: normalizing hardware capabilities to ensure overall performance and establishing online/offline monitoring to expose issues promptly.

Background of Hardware Scoring

Hardware scoring normalizes multi‑dimensional hardware capabilities (CPU, GPU, memory) to facilitate the creation of graphics quality tiers. The challenge lies in selecting which dimensions to normalize and how to do it. Game performance mainly concerns bandwidth, device memory size, CPU core count, CPU frequency, GPU frequency, and GPU Gflops.
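To make the normalization concrete, here is a minimal Python sketch that combines the dimensions listed above into a single score. The reference ceilings and weights are illustrative assumptions, not values from the article.

```python
REFERENCE_MAX = {          # assumed per-dimension reference ceilings
    "gpu_gflops": 1000.0,
    "cpu_freq_ghz": 3.2,
    "cpu_cores": 8,
    "mem_gb": 12,
    "bandwidth_gbps": 51.2,
}

WEIGHTS = {                # assumed weights; GPU dominates for graphics loads
    "gpu_gflops": 0.5,
    "cpu_freq_ghz": 0.2,
    "cpu_cores": 0.1,
    "mem_gb": 0.1,
    "bandwidth_gbps": 0.1,
}

def hardware_score(specs: dict) -> float:
    """Normalize each dimension to [0, 1] and combine by weight."""
    score = 0.0
    for dim, weight in WEIGHTS.items():
        normalized = min(specs.get(dim, 0) / REFERENCE_MAX[dim], 1.0)
        score += weight * normalized
    return round(100 * score, 1)   # scale to a 0-100 score
```

A device hitting half of every reference ceiling scores 50.0; one at or above every ceiling scores 100.0.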

Solution Research

Other teams had mature scoring systems similar to AnTuTu, which abstract hardware capabilities and assign scores based on demo measurements and weighted normalization. However, those systems focus on audio/video decoding, differing from game‑focused hardware metrics, so a dedicated solution was needed.

Two approaches were considered: (1) using game performance to create a ladder by testing and measuring FPS (mean, std, jank) for device tiering, which proved too dependent on test scenes and tester skill; (2) estimating performance based on device parameters (memory bandwidth, memory size, CPU cores, CPU frequency, GPU frequency, GPU Gflops). The latter was chosen despite manufacturers providing ideal lab specs that differ from real‑world performance due to DVFS and driver efficiency variations.
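Approach (1) relies on measuring per-device FPS statistics. As a sketch of what such a measurement might compute, the following assumes a jank heuristic of "frame time above twice the median" — an illustrative choice, not necessarily the team's actual definition:

```python
import statistics

def fps_metrics(frame_times_ms):
    """Compute mean FPS, FPS std, and a jank count from per-frame times.

    Jank here counts any frame taking more than twice the median frame
    time -- a common heuristic, assumed for illustration.
    """
    fps = [1000.0 / t for t in frame_times_ms]
    median_t = statistics.median(frame_times_ms)
    jank = sum(1 for t in frame_times_ms if t > 2 * median_t)
    return {
        "fps_mean": statistics.fmean(fps),
        "fps_std": statistics.stdev(fps),
        "jank": jank,
    }
```

As the article notes, these numbers vary heavily with test scene and tester skill, which is why the parameter-based approach (2) was chosen instead.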

Given the tight schedule of Project X and the high impact of GPU on graphics‑intensive games, the team directly adopted a GPU‑centric tiering strategy, referencing existing CPU ladder charts.

Solution Details

Mobile GPUs are mainly from Adreno, Mali, and PowerVR. Adreno GPUs (Qualcomm) dominate Android with >60% market share, so scoring rules focus on the Adreno series, using Gflops for ranking. Mali GPUs (ARM) have configurable cores (MPx), and PowerVR GPUs have complex naming; both are handled via white‑list mapping.
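A scoring entry point along these lines might first classify the renderer string reported by the device, then apply the per-family rule: a Gflops lookup for Adreno and white-list mappings for Mali and PowerVR. All model numbers, Gflops figures, and tiers below are illustrative assumptions:

```python
import re

# Illustrative tables -- model numbers, Gflops values, and tiers are
# assumptions, not the article's actual data.
ADRENO_GFLOPS = {"640": 898.5, "512": 230.4, "506": 124.8}
MALI_WHITELIST = {"Mali-G72 MP12": "high", "Mali-T860 MP2": "low"}
POWERVR_WHITELIST = {"PowerVR Rogue GE8320": "low"}

def classify(renderer: str):
    """Route a GL_RENDERER-style string to the per-family rule."""
    m = re.search(r"Adreno \(TM\) (\d+)", renderer)
    if m:
        return ("adreno", ADRENO_GFLOPS.get(m.group(1)))
    if renderer.startswith("Mali"):
        return ("mali", MALI_WHITELIST.get(renderer))
    if renderer.startswith("PowerVR"):
        return ("powervr", POWERVR_WHITELIST.get(renderer))
    return ("unknown", None)
```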

Adreno Series

Adreno naming follows a pattern (e.g., Adreno 512). The team divided devices into three tiers (140, 160, 400) and refined boundaries with real‑device testing, relying on previously reported GPM data.
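If the values above are read as Gflops tier boundaries (an assumption on this summary's part), Adreno tiering reduces to a threshold lookup; the tier names are illustrative:

```python
import bisect

# Assumed interpretation: the article's values define Gflops cut-offs.
BOUNDARIES = [140, 160, 400]
TIERS = ["low", "mid", "high", "ultra"]   # illustrative tier names

def adreno_tier(gflops: float) -> str:
    """Map a Gflops figure to a tier via its boundary position."""
    return TIERS[bisect.bisect_right(BOUNDARIES, gflops)]
```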

Mali Series

Mali naming evolved from early 2xx/3xx to T series (e.g., T860) and now G series (e.g., G72). Important points: the same core can have different core counts (MPx), core configurations affect performance significantly, and some MP values are missing and need white‑list handling.
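A white-list keyed on both model and core count captures the MPx issue described above. The scores and the default core count below are assumptions for illustration:

```python
import re

# Illustrative scores; real values would come from a curated white-list.
MALI_SCORES = {
    ("Mali-G72", 12): 330,
    ("Mali-G72", 3): 83,
    ("Mali-T860", 2): 47,
}
DEFAULT_MP = 1  # conservative fallback when the MPx suffix is missing

def mali_score(renderer: str):
    """Parse 'Mali-G72 MP12'-style strings and look up the white-list."""
    m = re.match(r"(Mali-\w+)(?:\s+MP(\d+))?", renderer)
    if not m:
        return None
    model = m.group(1)
    cores = int(m.group(2)) if m.group(2) else DEFAULT_MP
    return MALI_SCORES.get((model, cores))
```

Keying on the (model, cores) pair reflects the point that the same core with different MPx configurations performs very differently.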

PowerVR Series

PowerVR naming includes SGX, G, GX, GT, GE, GM series. Because PowerVR market share is low, the team used a simple white‑list approach for tiering.
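Given the low market share, the white-list can stay a flat lookup with a conservative default; the models and tiers here are assumptions:

```python
# Illustrative white-list; models and tiers are assumptions.
POWERVR_TIERS = {
    "PowerVR Rogue GE8320": "low",
    "PowerVR Rogue GM9446": "mid",
    "PowerVR SGX544": "low",
}

def powervr_tier(renderer: str, default: str = "low") -> str:
    """Flat lookup; unknown PowerVR models fall back to a safe default."""
    return POWERVR_TIERS.get(renderer, default)
```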

GPU Ladder Chart

The chart illustrates device tiering in Project X, highlighting large baseline differences between games and emphasizing relative GPU relationships. It notes that for GPU‑bound games, the GPU ladder applies, while CPU‑bound games should refer to the CPU ladder. Brackets show Gflops values; missing values indicate unavailable data. Multiple values separated by commas represent different frequencies or core configurations; a dash indicates a range.
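A small parser for the chart's cell notation makes the convention explicit; reading a dash as a numeric range returned by its endpoints is an assumption:

```python
def parse_gflops_cell(cell: str):
    """Parse a ladder-chart Gflops cell.

    ''      -> None (data unavailable)
    'a, b'  -> one value per frequency/core configuration
    'a-b'   -> a range, returned as a (low, high) pair
    """
    cell = cell.strip()
    if not cell:
        return None
    values = []
    for part in (p.strip() for p in cell.split(",")):
        if "-" in part:
            lo, hi = (float(x) for x in part.split("-"))
            values.append((lo, hi))
        else:
            values.append(float(part))
    return values
```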

How to Create Scoring Rules

For a new project, start with a mid‑range chip (e.g., Adreno 512) and test core game scenes. If performance meets targets, move to the next popular chip one step down in performance (e.g., Adreno 505) and repeat until the performance threshold is reached. Non‑popular chips can be tiered conservatively based on nearby popular chips, with later calibration using GPM data or binary search if time permits.

Identify a baseline mid‑range GPU based on market share and Gflops.

Conduct real‑device testing on core game scenarios.

Iteratively move to higher‑performance GPUs until the performance ceiling is reached.

Document the tier boundaries and create a white‑list for less common GPUs.

Refine the scoring model with telemetry data post‑release.
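The steps above can be sketched as a simple loop over candidate chips; the chip list and the `meets_target` test harness are placeholders for a project's own market data and real-device testing:

```python
def build_qualified_list(chips, meets_target):
    """Walk candidate chips ordered outward from the mid-range baseline,
    keeping each chip that passes real-device testing.

    `chips` is ordered by the project's iteration strategy (market share
    and Gflops); `meets_target` wraps an actual test run on core scenes.
    Both are placeholder assumptions for illustration.
    """
    qualified = []
    for chip in chips:
        if meets_target(chip):
            qualified.append(chip)
        else:
            break  # chips past the threshold fall into lower tiers
    return qualified
```

The resulting qualified list marks the tier boundary; everything beyond it is tiered conservatively and recalibrated later with telemetry.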

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: mobile performance, Adreno, hardware benchmarking, Mali, GPU scoring
Written by

ByteDance SE Lab

Official account of ByteDance SE Lab, sharing research and practical experience in software engineering. Our lab unites researchers and engineers from various domains to accelerate the fusion of software engineering and AI, driving technological progress in every phase of software development.