Snowball Data Middle Platform (AIBO): Architecture, Capabilities, and Future Outlook
This article introduces Snowball's AIBO data middle platform. It covers the storage‑compute separation architecture, the core capabilities (data integration, catalog, tagging, analysis tools, and micro‑service data APIs), and planned enhancements to security, lineage, and continuous business‑driven iteration.
Background
Snowball is China’s largest stock‑investment community, aiming to "connect everything about investing" and to become the world’s biggest investment communication and trading platform. To achieve this, Snowball created a big‑data department responsible for building a data middle platform that can quickly integrate, govern, and deliver data for reporting, analysis, and decision‑making across all business lines.
Data Middle Platform AIBO – Connecting All Data
To integrate diverse data sources rapidly, AIBO separates storage from computation, provides convenient analysis tools, manages permissions and data catalogs, and offers data APIs for fast business response. The platform is designed specifically for Snowball, as generic services cannot meet all its unique business needs.
The essential capabilities of an enterprise data middle platform, as identified for Snowball, include:
Data Integration: Fast integration of structured and unstructured data.
Data Catalog: Unified inventory of important data assets.
Data Tagging & Linking: Tagging users, stocks, posts, ads, etc., and linking them across business lines.
Data Analysis: Flexible, unified analysis tools for business value exploration.
Data Permissions: Fine‑grained access control to prevent leaks while enabling sharing.
Data Services: Micro‑service‑driven data APIs with governance, high availability, and iterative improvement.
Data Integration
AIBO adopts a storage‑compute separation design: data passes through ETL and is stored in a Hive data warehouse. Ingestion paths and supporting tooling include:
Kafka topics synchronized to Hive via Flume.
MySQL business databases imported via Sqoop and custom interfaces.
Custom ETL jobs using SQL, Python, Shell, etc.
Dependency scheduling with DolphinScheduler.
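Dependency scheduling means that a job runs only after the jobs it depends on have finished. The sketch below illustrates that ordering with Python's standard `graphlib` module. It is not DolphinScheduler's API, and the job names are hypothetical, chosen only to mirror the ingestion paths listed above.

```python
from graphlib import TopologicalSorter

# Hypothetical job names; each key maps to the set of jobs it depends on.
JOB_DEPENDENCIES = {
    "flume_kafka_to_hive": set(),
    "sqoop_mysql_to_hive": set(),
    "build_user_event_table": {"flume_kafka_to_hive", "sqoop_mysql_to_hive"},
    "build_retention_report": {"build_user_event_table"},
}


def execution_order(dependencies):
    """Return one valid run order in which every job follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(JOB_DEPENDENCIES)
for position, job in enumerate(order, start=1):
    print(position, job)
```

The two ingestion jobs have no dependencies, so either may run first; the report job always runs last because it depends, directly or indirectly, on everything else.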
Data Catalog – User + Event
The catalog serves as a data‑management hub, focusing on user‑centric data models (User + Event) that capture who performed what action, when, and with which attributes, forming the basis for user‑behavior analysis.
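A minimal sketch of what a User + Event model could look like, using Python dataclasses. The field names (`user_id`, `occurred_at`, `properties`) and the sample events are assumptions made for illustration; the article does not specify AIBO's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class User:
    user_id: int
    attributes: dict = field(default_factory=dict)


@dataclass
class Event:
    user_id: int            # who performed the action
    name: str               # what action was performed
    occurred_at: datetime   # when it happened
    properties: dict = field(default_factory=dict)  # with which attributes


def events_for_user(events, user_id):
    """Return one user's events in chronological order."""
    mine = [e for e in events if e.user_id == user_id]
    return sorted(mine, key=lambda e: e.occurred_at)


# Hypothetical sample data.
sample_events = [
    Event(1, "post_view", datetime(2024, 1, 1, 9, 30), {"os": "iOS"}),
    Event(2, "app_launch", datetime(2024, 1, 1, 9, 0), {"os": "Android"}),
    Event(1, "app_launch", datetime(2024, 1, 1, 9, 0), {"os": "iOS"}),
]
timeline = events_for_user(sample_events, 1)
print([e.name for e in timeline])
```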
Data Tagging and Linking – User + Event + Item
Tags describe user attributes (age, gender, followers, etc.) and item attributes (stock industry, P/E ratio, ad type, etc.). By linking event foreign keys to user, stock, or ad items, the platform can traverse and enrich data across domains.
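The linking step can be pictured as a lookup join: each event carries foreign keys, and those keys pull in tags from the user and item tables. The sketch below assumes dictionary-shaped tables and hypothetical keys (`user_id`, `stock_id`) and tag values; in AIBO this would more plausibly be a join in the warehouse.

```python
# Hypothetical tag tables keyed by ID.
USERS = {1: {"gender": "f", "followers": 120}}
STOCKS = {"STOCK_A": {"industry": "banking", "pe_ratio": 8.2}}


def enrich_event(event, users, stocks):
    """Copy the event and attach user and stock tags found via its foreign keys."""
    enriched = dict(event)
    user_tags = users.get(event.get("user_id"), {})
    stock_tags = stocks.get(event.get("stock_id"), {})
    enriched.update({f"user_{key}": value for key, value in user_tags.items()})
    enriched.update({f"stock_{key}": value for key, value in stock_tags.items()})
    return enriched


event = {"user_id": 1, "event": "post_view", "stock_id": "STOCK_A"}
row = enrich_event(event, USERS, STOCKS)
print(row)
```

Events whose keys match nothing are returned unchanged, so a missing tag never drops the underlying event.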
Data Analysis – General Model Capabilities
AIBO provides a suite of analysis tools:
Event analysis with custom dimensions and metrics.
Retention analysis for user activity over time.
Funnel analysis to track multi‑step conversion.
Custom dashboards for saved charts and sharing.
Group comparison across any model or dimension.
ABTest support for product experiments with real‑time metrics.
Event Analysis
Analyzes user actions (app launch, registration, post view, deposit, etc.) across custom dimensions (app version, device, OS) and metrics, supporting grouping, various chart types, and data download.
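As a rough illustration, event analysis amounts to filtering by event name, grouping by a chosen dimension, and computing metrics per group. The sketch below computes two common metrics, total count and unique users; the record layout and sample values are assumptions, not AIBO's actual format.

```python
from collections import defaultdict


def event_metrics(events, event_name, dimension):
    """Group one event type by a dimension; return count and unique users per value."""
    totals = defaultdict(int)
    users = defaultdict(set)
    for e in events:
        if e["name"] != event_name:
            continue
        value = e.get(dimension, "unknown")
        totals[value] += 1
        users[value].add(e["user_id"])
    return {v: {"count": totals[v], "unique_users": len(users[v])} for v in totals}


# Hypothetical sample data.
events = [
    {"user_id": 1, "name": "app_launch", "os": "iOS"},
    {"user_id": 1, "name": "app_launch", "os": "iOS"},
    {"user_id": 2, "name": "app_launch", "os": "Android"},
    {"user_id": 2, "name": "post_view", "os": "Android"},
]
result = event_metrics(events, "app_launch", "os")
print(result)
```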
Retention Analysis
Measures how many users continue to perform subsequent actions after an initial event, with customizable dimensions and grouping for cohort analysis.
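One common way to express this is day‑N retention: of the users who performed the initial event on a given day, the fraction who performed the return event exactly N days later. The sketch below uses that definition as an assumption; the article does not state which retention formula AIBO uses, and the tuple layout and sample data are hypothetical.

```python
from datetime import date, timedelta


def day_n_retention(events, initial_event, return_event, cohort_day, n):
    """Fraction of the cohort that performed return_event exactly n days later."""
    cohort = {
        user for user, name, day in events
        if name == initial_event and day == cohort_day
    }
    if not cohort:
        return 0.0
    target_day = cohort_day + timedelta(days=n)
    returned = {
        user for user, name, day in events
        if name == return_event and day == target_day and user in cohort
    }
    return len(returned) / len(cohort)


# Hypothetical sample data: (user_id, event_name, day).
events = [
    (1, "registration", date(2024, 1, 1)),
    (2, "registration", date(2024, 1, 1)),
    (3, "registration", date(2024, 1, 1)),
    (4, "registration", date(2024, 1, 1)),
    (1, "app_launch", date(2024, 1, 2)),
    (2, "app_launch", date(2024, 1, 2)),
    (3, "app_launch", date(2024, 1, 3)),
    (5, "app_launch", date(2024, 1, 2)),  # not in the cohort, so ignored
]
day1 = day_n_retention(events, "registration", "app_launch", date(2024, 1, 1), 1)
day2 = day_n_retention(events, "registration", "app_launch", date(2024, 1, 1), 2)
print(day1, day2)
```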
Funnel Analysis
Tracks conversion and drop‑off at each step of a multi‑stage process, allowing custom steps, filters, group comparisons, and multiple visualizations.
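A funnel can be sketched as a per-user state machine: a user advances to the next step only when the matching event occurs after the previous step. The version below is a simplified illustration with no conversion-window limit; the step names, the tuple layout, and the integer timestamps are assumptions.

```python
def funnel_counts(events, steps):
    """Count users who reached each step, requiring steps to occur in order."""
    progress = {}  # user_id -> number of steps completed so far
    for user_id, name, _timestamp in sorted(events, key=lambda e: e[2]):
        done = progress.get(user_id, 0)
        if done < len(steps) and name == steps[done]:
            progress[user_id] = done + 1
    return [sum(1 for d in progress.values() if d > i) for i in range(len(steps))]


def step_conversion(counts):
    """Conversion rate from each step to the next one."""
    return [cur / prev if prev else 0.0 for prev, cur in zip(counts, counts[1:])]


# Hypothetical sample data: (user_id, event_name, timestamp).
steps = ["app_launch", "registration", "deposit"]
events = [
    (1, "app_launch", 1), (1, "registration", 2), (1, "deposit", 3),
    (2, "app_launch", 1), (2, "registration", 5),
    (3, "app_launch", 2),
    (4, "deposit", 1), (4, "app_launch", 2),  # deposit before launch does not count
]
counts = funnel_counts(events, steps)
rates = step_conversion(counts)
print(counts, rates)
```

The drop‑off at each step is one minus the corresponding conversion rate.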
Custom Dashboards
Users can save charts from event, retention, funnel, and other analyses, assemble them into personalized dashboards, and share those dashboards with designated colleagues for ongoing monitoring.
Group Comparison
Every analysis model supports comparing multiple user groups, where a group can be defined by events, filters, uploaded files, or dynamic conditions.
ABTest
Supports product iteration through experiments: teams define multiple page versions and target user groups, choose custom core and auxiliary metrics, and receive same‑day (T+0) and next‑day (T+1) statistical reports.
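An experiment needs each user to see the same page version on every visit. A common way to achieve this, shown here as an assumption rather than as AIBO's documented method, is to hash the experiment name together with the user ID and map the hash to a variant. The experiment and variant names are hypothetical.

```python
import hashlib

# Hypothetical variant names for one experiment.
VARIANTS = ["control", "new_page"]


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to a variant for the given experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


assignments = {u: assign_variant(u, "new_home_page", VARIANTS) for u in range(1000)}
print(sum(1 for v in assignments.values() if v == "control"), "users in control")
```

Including the experiment name in the hash key keeps assignments independent across experiments, so a user placed in the control group of one test is not systematically placed in the control group of the next.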
Data Services
Implemented as micro‑services, data services expose unified APIs and reusable data models, allowing business lines to compose and share data quickly while maintaining governance.
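The idea of reusable data models behind a unified API can be sketched as a registry: each model is registered once under a name, and every caller goes through a single entry point. This is an in-process illustration only. The model names, return values, and the `query` function are hypothetical, and a real micro‑service would sit behind a network interface with authentication and governance checks.

```python
MODEL_REGISTRY = {}


def data_model(name):
    """Decorator that registers a function as a named, reusable data model."""
    def register(func):
        MODEL_REGISTRY[name] = func
        return func
    return register


@data_model("user_profile")
def user_profile(user_id):
    # Placeholder result; a real model would read from the warehouse.
    return {"user_id": user_id, "tags": ["hypothetical_tag"]}


@data_model("stock_summary")
def stock_summary(stock_id):
    # Placeholder result; a real model would read from the warehouse.
    return {"stock_id": stock_id, "industry": "unknown"}


def query(model_name, **params):
    """Single entry point that business lines call to fetch any registered model."""
    if model_name not in MODEL_REGISTRY:
        raise KeyError(f"unknown data model: {model_name}")
    return MODEL_REGISTRY[model_name](**params)


print(query("user_profile", user_id=7))
print(query("stock_summary", stock_id="STOCK_A"))
```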
Summary and Outlook
Snowball’s data middle platform adopts a storage‑compute separation architecture and micro‑service‑based data APIs, forming a closed loop of data integration, cataloging, analysis, and application. Future work will strengthen fine‑grained security, improve data lineage visibility, and continue to expand AIBO’s capabilities to further support data‑driven business growth.
Readers interested in the technical details can follow the Snowball engineering team’s public account for upcoming module‑level design write‑ups and technical sharing.
Snowball Engineer Team
Proactivity, efficiency, professionalism, and empathy are the core values of the Snowball Engineer Team; curiosity, passion, and sharing of technology drive their continuous progress.