
Snowball Data Middle Platform (AIBO): Architecture, Capabilities, and Future Outlook

This article introduces Snowball's AIBO data middle platform: its storage-compute separation architecture; its core capabilities, including data integration, cataloging, tagging, analysis tools, and micro-service data APIs; and planned enhancements to security, data lineage, and continuous business-driven iteration.

Snowball Engineer Team

Background

Snowball is China’s largest stock‑investment community, aiming to "connect everything about investing" and to become the world’s biggest investment communication and trading platform. To achieve this, Snowball created a big‑data department responsible for building a data middle platform that can quickly integrate, govern, and deliver data for reporting, analysis, and decision‑making across all business lines.

Data Middle Platform AIBO – Connecting All Data

To integrate diverse data sources rapidly, AIBO separates storage from computation, provides convenient analysis tools, manages permissions and data catalogs, and offers data APIs for fast business response. The platform is designed specifically for Snowball, as generic services cannot meet all its unique business needs.

The essential capabilities of an enterprise data middle platform, as identified for Snowball, include:

Data Integration: Fast integration of structured and unstructured data.

Data Catalog: Unified inventory of important data assets.

Data Tagging & Linking: Tagging users, stocks, posts, ads, etc., and linking them across business lines.

Data Analysis: Flexible, unified analysis tools for business value exploration.

Data Permissions: Fine-grained access control to prevent leaks while enabling sharing.

Data Services: Micro-service-driven data APIs with governance, high availability, and iterative improvement.

Data Integration

AIBO separates storage from compute: data passes through ETL and lands in a Hive data warehouse. The ingestion paths and supporting tooling are:

Kafka topics synchronized to Hive via Flume.

MySQL business databases imported via Sqoop and custom interfaces.

Custom ETL jobs using SQL, Python, Shell, etc.

Dependency scheduling with DolphinScheduler.
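
The source does not show how these jobs are wired together, so the following is only a minimal Python sketch of dependency-ordered scheduling. The job names are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for the ordering logic; it is not DolphinScheduler's API.

```python
from graphlib import TopologicalSorter

# Hypothetical job names. Each key maps to the set of jobs it depends on.
JOB_DEPENDENCIES = {
    "flume_kafka_to_hive": set(),
    "sqoop_mysql_to_hive": set(),
    "build_user_event_table": {"flume_kafka_to_hive", "sqoop_mysql_to_hive"},
    "daily_retention_report": {"build_user_event_table"},
}


def execution_order(dependencies):
    """Return one valid run order in which every job follows its upstream jobs."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(JOB_DEPENDENCIES)
```

In this sketch both ingestion jobs always run before the table build, and the report runs last. A cycle in the dependency map would raise `graphlib.CycleError`.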

Data Catalog – User+Event

The catalog serves as a data‑management hub, focusing on user‑centric data models (User + Event) that capture who performed what action, when, and with which attributes, forming the basis for user‑behavior analysis.
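
As a rough illustration of the User + Event idea, the sketch below models "who, what, when, with which attributes" as two Python dataclasses. The field names and sample records are assumptions made for this example, not AIBO's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class User:
    user_id: int
    attributes: dict = field(default_factory=dict)  # assumed free-form attributes


@dataclass
class Event:
    user_id: int            # who performed the action
    name: str               # what action was performed
    occurred_at: datetime   # when it happened
    properties: dict = field(default_factory=dict)  # with which attributes


def events_for_user(events, user_id):
    """Return one user's events in chronological order."""
    return sorted(
        (e for e in events if e.user_id == user_id),
        key=lambda e: e.occurred_at,
    )


# Made-up sample records, deliberately out of order.
events = [
    Event(1, "post_view", datetime(2024, 1, 2, 9, 30), {"os": "iOS"}),
    Event(1, "app_launch", datetime(2024, 1, 2, 9, 0), {"os": "iOS"}),
    Event(2, "registration", datetime(2024, 1, 2, 10, 0), {"os": "Android"}),
]
timeline = [e.name for e in events_for_user(events, 1)]
```

Keeping the user key on every event is what lets the analysis tools described later group, filter, and sequence behavior per user.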

Data Tagging and Linking – User+Event+Item

Tags describe user attributes (age, gender, followers, etc.) and item attributes (stock industry, P/E ratio, ad type, etc.). By linking event foreign keys to user, stock, or ad items, the platform can traverse and enrich data across domains.
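
One way to picture the linking step is a lookup that follows an event's foreign keys into tag tables. This is a minimal sketch: the tag values, the `STOCK_A` identifier, and the `enrich` helper are all invented for illustration.

```python
# Made-up sample tags; real tag values would come from the platform.
USER_TAGS = {101: {"gender": "F", "followers": 250}}
STOCK_TAGS = {"STOCK_A": {"industry": "Banking", "pe_ratio": 6.5}}


def enrich(event, user_tags, stock_tags):
    """Attach user and stock tags to an event by following its foreign keys."""
    enriched = dict(event)  # copy so the original event is left untouched
    enriched["user"] = user_tags.get(event["user_id"], {})
    stock_id = event.get("stock_id")
    if stock_id is not None:
        enriched["stock"] = stock_tags.get(stock_id, {})
    return enriched


event = {"name": "post_view", "user_id": 101, "stock_id": "STOCK_A"}
row = enrich(event, USER_TAGS, STOCK_TAGS)
```

After enrichment, a single row can be filtered by user tags and item tags at the same time, for example post views of banking stocks by users with more than 100 followers.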

Data Analysis – General Model Capabilities

AIBO provides a suite of analysis tools:

Event analysis with custom dimensions and metrics.

Retention analysis for user activity over time.

Funnel analysis to track multi‑step conversion.

Custom dashboards for saved charts and sharing.

Group comparison across any model or dimension.

ABTest support for product experiments with real‑time metrics.

Event Analysis

Analyzes user actions (app launch, registration, post view, deposit, etc.) across custom dimensions (app version, device, OS) and metrics, supporting grouping, various chart types, and data download.
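
The core of event analysis can be sketched as a group-by over one event and one dimension. The example below assumes events are plain dictionaries and computes two illustrative metrics, total events and distinct users; the function name and sample data are hypothetical.

```python
from collections import defaultdict


def event_metrics(events, event_name, dimension):
    """Per dimension value: total event count and distinct-user count."""
    totals = defaultdict(int)
    users = defaultdict(set)
    for e in events:
        if e["name"] != event_name:
            continue
        key = e.get(dimension, "unknown")
        totals[key] += 1
        users[key].add(e["user_id"])
    return {k: {"events": totals[k], "users": len(users[k])} for k in totals}


# Made-up sample events.
events = [
    {"name": "app_launch", "user_id": 1, "os": "iOS"},
    {"name": "app_launch", "user_id": 1, "os": "iOS"},
    {"name": "app_launch", "user_id": 2, "os": "Android"},
    {"name": "post_view", "user_id": 2, "os": "Android"},
]
by_os = event_metrics(events, "app_launch", "os")
```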

Retention Analysis

Measures how many users continue to perform subsequent actions after an initial event, with customizable dimensions and grouping for cohort analysis.
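
A simplified version of this calculation, assuming events arrive as `(user_id, event_name, day)` tuples, might look like the sketch below. The cohort definition (everyone who performed the initial event on one day) and the function signature are assumptions for illustration.

```python
from datetime import date, timedelta


def retention(events, initial_event, return_event, cohort_day, day_offsets):
    """Share of the cohort that performs return_event N days after cohort_day."""
    cohort = {u for u, name, day in events
              if name == initial_event and day == cohort_day}
    rates = {}
    for n in day_offsets:
        target = cohort_day + timedelta(days=n)
        retained = {u for u, name, day in events
                    if name == return_event and day == target and u in cohort}
        rates[n] = len(retained) / len(cohort) if cohort else 0.0
    return rates


# Made-up sample data: two users register on day 0.
d0 = date(2024, 1, 1)
events = [
    (1, "registration", d0),
    (2, "registration", d0),
    (1, "app_launch", d0 + timedelta(days=1)),
    (2, "app_launch", d0 + timedelta(days=1)),
    (1, "app_launch", d0 + timedelta(days=7)),
    (3, "app_launch", d0 + timedelta(days=7)),  # user 3 is not in the cohort
]
rates = retention(events, "registration", "app_launch", d0, [1, 7])
```

Here both registered users return on day 1 and only one returns on day 7; user 3 is ignored because they never registered on the cohort day.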

Funnel Analysis

Tracks conversion and drop‑off at each step of a multi‑stage process, allowing custom steps, filters, group comparisons, and multiple visualizations.
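
The sketch below shows one simple way to count how many users reach each funnel step in order. It assumes `(user_id, event_name, timestamp)` tuples and ignores refinements such as conversion windows; the step names and data are made up.

```python
from collections import defaultdict


def funnel_counts(events, steps):
    """Count users who reach each step, requiring the steps to occur in order."""
    per_user = defaultdict(list)
    for user_id, name, _ts in sorted(events, key=lambda e: e[2]):
        per_user[user_id].append(name)

    counts = [0] * len(steps)
    for names in per_user.values():
        step = 0  # index of the next step this user still has to complete
        for name in names:
            if step < len(steps) and name == steps[step]:
                counts[step] += 1
                step += 1
    return counts


# Made-up sample data with integer timestamps.
events = [
    (1, "post_view", 1), (1, "registration", 2), (1, "deposit", 3),
    (2, "post_view", 1), (2, "registration", 5),
    (3, "registration", 4),  # never viewed a post, so never enters the funnel
]
counts = funnel_counts(events, ["post_view", "registration", "deposit"])
```

Dividing each count by the previous one gives the step-to-step conversion rate; the difference between adjacent counts is the drop-off.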

Custom Dashboards

Users can save charts from event, retention, funnel, and other analyses, assemble them into personalized dashboards, and share them with designated colleagues for ongoing monitoring.

Group Comparison

All of the analysis models above support comparing multiple user groups, which can be defined by events, filters, uploaded files, or dynamic conditions.
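
As a hedged illustration, a user group can be treated as a set of user IDs regardless of how it was defined, so the same metric can be computed for each group. The group names, the metric, and the data below are assumptions for this sketch.

```python
def compare_groups(events, groups, event_name):
    """Average number of event_name events per user, for each named group."""
    result = {}
    for label, members in groups.items():
        hits = sum(1 for e in events
                   if e["name"] == event_name and e["user_id"] in members)
        result[label] = hits / len(members) if members else 0.0
    return result


# Made-up sample events.
events = [
    {"name": "post_view", "user_id": 1},
    {"name": "post_view", "user_id": 1},
    {"name": "post_view", "user_id": 2},
    {"name": "deposit", "user_id": 2},
    {"name": "post_view", "user_id": 3},
]
uploaded_ids = {1, 3}  # stands in for a group defined by an uploaded file
depositors = {e["user_id"] for e in events if e["name"] == "deposit"}  # event-defined group
comparison = compare_groups(
    events, {"uploaded": uploaded_ids, "depositors": depositors}, "post_view"
)
```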

ABTest

Supports product experiments: teams define multiple page versions and target user groups, receive T+0 (same-day) and T+1 (next-day) statistical reports, and configure custom core and auxiliary metrics.
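
The source does not describe how users are assigned to page versions. A common technique, shown here purely as an assumption, is to hash the user ID together with an experiment name so that each user always sees the same version. The experiment and variant names are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to one variant of an experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


variants = ["page_v1", "page_v2"]
first = assign_variant(42, "new_home_page", variants)
second = assign_variant(42, "new_home_page", variants)  # same input, same variant
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user placed in one test's first variant is not systematically placed in the first variant of every other test.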

Data Services

Implemented as micro‑services, data services expose unified APIs and reusable data models, allowing business lines to compose and share data quickly while maintaining governance.

Summary and Outlook

Snowball’s data middle platform adopts a storage-compute separation architecture and micro-service-based data APIs, forming a closed loop of data integration, cataloging, analysis, and application. Future work will strengthen fine-grained security, improve data lineage visibility, and keep expanding AIBO’s capabilities to support data-driven business growth.

Readers interested in the technical details can follow the Snowball engineering team’s public account for upcoming articles on module-level design and technology.
