
Building an Intelligent Data Warehouse at Yixin Group: A Big Data Platform Case Study

The article describes how Yixin Group’s product team created an in‑house intelligent data warehouse using Hadoop, Flink/Spark, and standardized data services to transform scattered automotive‑finance data into a secure, scalable platform that supports real‑time analytics and drives business growth.

DataFunTalk

At the start of the post‑holiday period, Yixin Group’s product manager Wang Yang needed to compile operational data from hundreds of products and channels, a task that previously required a full day of work by a data‑analysis team. With the newly built "Intelligent Data Warehouse" (referred to as "Smart DW"), the same work can be completed in just 30 minutes.

Yixin, a leading automotive-finance platform with over 2 million vehicle transactions, treats the Smart DW as its "core brain": it provides the technical foundation for stable, efficient operation across multiple business lines.

Xu Fei, the first employee of the Smart DW project, joined when Yixin had no unified data warehouse. Data were scattered across dozens of isolated systems, reports had to be compiled manually, and standards were inconsistent from one system to the next. Drawing on experience from large internet companies and his own startup background, Xu helped design a precise, forward-looking architecture.

Key requirements for the warehouse included high security, stability, accurate data, and ease of use. It also had to handle massive, heterogeneous data sources, support multi-project workloads, and scale out as data volumes grow.

Team members, such as Lao Qin, emphasized three capabilities: unified I/O standards for massive heterogeneous data, cross‑system data standardization, and a data‑service layer for analytics, mining, and reporting. To meet storage and scaling needs, they built a distributed compute‑storage platform on Hadoop, supplemented with real‑time engines Flink and Spark, deploying over 100 nodes that enable both batch and sub‑second processing.
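To make the batch-plus-real-time split concrete, here is a minimal pure-Python sketch, not Yixin's code. The same per-channel counting logic is run once over a full data set (the kind of job a Hadoop batch layer covers) and once over tumbling time windows (the kind of job a Flink or Spark streaming layer covers). The event fields, the channel names, and the one-second window are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical events: (timestamp in seconds, channel name).
EVENTS = [
    (0.2, "app"), (0.7, "web"), (1.1, "app"),
    (1.4, "app"), (2.9, "dealer"),
]

def batch_counts(events):
    """Batch mode: count events per channel over the whole data set."""
    return Counter(channel for _, channel in events)

def windowed_counts(events, window_seconds=1.0):
    """Streaming-style mode: count per channel inside tumbling windows.

    Each event falls into exactly one window, identified by the index
    floor(ts / window_seconds).
    """
    windows = defaultdict(Counter)
    for ts, channel in events:
        window_index = int(ts // window_seconds)
        windows[window_index][channel] += 1
    return dict(windows)

total = batch_counts(EVENTS)
per_window = windowed_counts(EVENTS)
```

Summing the per-window counters reproduces the batch totals, which is the property that lets both paths feed the same downstream reports.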

Data from nearly a hundred source systems (about 1.5 TB daily) are ingested, cleaned, and modeled by business objects, cycles, and user dimensions, then stored in a well‑organized manner. Challenges included reconciling disparate business fields and redefining metrics such as customer churn rates, which required iterative validation and custom indicator definitions.
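The two challenges mentioned here, reconciling business fields and pinning down a metric definition, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's implementation: the source-system names, the field names, and the churn definition (customers present in the previous period but absent in the current one, divided by the previous period's customers) are all hypothetical.

```python
# Hypothetical per-source field mappings onto one unified schema.
FIELD_MAPS = {
    "loan_system": {"cust_no": "customer_id", "amt": "amount"},
    "leasing_system": {"customerId": "customer_id", "total_amount": "amount"},
}

def standardize(source, record):
    """Rename source-specific fields to the unified names and drop the rest."""
    mapping = FIELD_MAPS[source]
    return {unified: record[raw] for raw, unified in mapping.items() if raw in record}

def churn_rate(previous_customers, current_customers):
    """Share of the previous period's customers who are absent in the current one."""
    previous = set(previous_customers)
    if not previous:
        return 0.0  # no baseline customers, so nothing can churn
    lost = previous - set(current_customers)
    return len(lost) / len(previous)

row = standardize(
    "leasing_system",
    {"customerId": "C7", "total_amount": 120000, "internal_flag": 1},
)
rate = churn_rate(["C1", "C2", "C3", "C4"], ["C1", "C3", "C9"])
```

Writing the definition down as a function makes the iterative validation described above easier: when business units disagree on what counts as churn, the disagreement shows up as a change to one explicit rule.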

The Smart DW now processes over 10,000 concurrent compute tasks, delivering data to mobile and PC interfaces in real time, and providing standardized data packages to various business units.

Business users like asset‑management manager Fu Zhen can now obtain daily asset‑quality reports with a single click via the BI module, thanks to the data‑service middle platform. The platform offers three data‑service forms: a "Tongtianxiao" service portal with hundreds of dimensions, a customizable BI platform for self‑service reporting, and standardized data interfaces that automatically dispatch tasks to downstream systems.

Security measures include strict authorization, encryption, and de-identification of personal information, and the platform has complied with applicable regulations since 2018.
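As a rough illustration of what de-identification can look like, the sketch below pseudonymizes an ID number with a keyed hash and masks the middle of a phone number. The article does not describe Yixin's actual scheme, so the field names, the masking rule, and the placeholder key are assumptions; only Python's standard `hmac` and `hashlib` modules are used.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a key-management system.
SECRET_KEY = b"replace-with-a-managed-secret"

def pseudonymize(value: str) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC), hex-encoded."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 characters and mask everything in between."""
    if len(phone) < 8:
        return "*" * len(phone)  # too short to reveal any part safely
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]

def deidentify(record: dict) -> dict:
    """Return a copy of the record with personal fields de-identified."""
    cleaned = dict(record)
    cleaned["id_number"] = pseudonymize(record["id_number"])
    cleaned["phone"] = mask_phone(record["phone"])
    return cleaned

safe = deidentify({"id_number": "ID-0001", "phone": "13812345678", "amount": 5000})
```

A keyed hash keeps the pseudonym stable, so records for the same customer can still be joined across tables without exposing the raw identifier.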

Overall, the Intelligent Data Warehouse has become a piece of strategic infrastructure. It enables Yixin to shift from manual, rule-based risk assessment to data-driven models, improve customer outcomes, and explore new fintech collaborations, positioning the company as a technology-driven leader in the automotive-finance industry.

Tags: Data Engineering, Big Data, Flink, Data Warehouse, Spark, Hadoop, Automotive Finance