Big Data Cloud‑Native Trends and Challenges Highlighted at the 2023 Yunqi Conference
The 2023 Yunqi Conference in Hangzhou showcased the latest advances in cloud computing and big‑data technologies, examined the evolution from big‑data 1.0 to 3.0, discussed the key difficulties of making big data cloud‑native, and presented a practical case study of MiHoYo’s cloud‑native transformation.
The 2023 Yunqi Conference, Alibaba Cloud's flagship event, focused on two major themes: large‑model AI and cloud computing. The conference highlighted how cloud computing underpins AI workloads. In the afternoon sessions, product teams presented progress across containers, storage, networking, databases, and big‑data services.
1. Cloud technology progress – Elastic Compute introduced a third‑generation platform built on CIPU plus the Feitian OS. It supports Intel, the self‑developed Yitian‑710, and AMD chips, with differentiated pricing across economy, HPC, and high‑stability instance types. The ECI container service can launch up to 2 million instances per day. OSS storage now offers three tiers (standard, infrequent access, archive) with direct‑read capability and bandwidth of up to 100 Gbps, which allows a 270 GB model to be read in roughly 20 seconds. Alibaba's Feitian network (Feitian Luoshen) improves high‑performance network access and routing, addressing the shift from VM‑centric to container‑centric networking. Managed K8s adoption was reported at 64% of production workloads, with a 127% growth rate and a 73% share relative to self‑managed clusters. Database updates include Yaochi RDS, PolarDB (with built‑in cache and read‑write consistency), and NoSQL offerings such as Lindorm.
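The OSS figure above can be sanity‑checked with simple arithmetic. The sketch below assumes decimal units (1 GB = 8 gigabits) and an ideal link with no protocol overhead; the function name is illustrative and not part of any Alibaba Cloud SDK.

```python
def transfer_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    """Ideal time in seconds to move size_gb gigabytes over a bandwidth_gbps link.

    Assumes decimal units and full link utilization (no protocol overhead).
    """
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth_gbps must be positive")
    size_gigabits = size_gb * 8  # 1 byte = 8 bits
    return size_gigabits / bandwidth_gbps


if __name__ == "__main__":
    # 270 GB model over a 100 Gbps link, the figures quoted in the talk
    print(f"{transfer_seconds(270, 100):.1f} s")
```

Under these assumptions the result is about 21.6 seconds, which is consistent with the "roughly 20 seconds" quoted; real reads would also depend on client concurrency and overhead.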
2. Big‑data technology development and current status – Big‑data 1.0 began with Google's papers on GFS (2003), MapReduce (2004), and Bigtable (2006), which led to the Hadoop ecosystem (HDFS, HBase, YARN, MapReduce). Big‑data 2.0 introduced Hive, Spark, Storm, and OLAP engines such as ClickHouse, Kylin, and Druid, moving toward SQL‑like interfaces. Real‑time processing marked a 2.5 stage, with Google's Dataflow model (2015) and Flink. Recent years have seen cloud migration, K8s adoption, and the rise of MPP databases (StarRocks, Doris) and lake‑house table formats (Hudi, Iceberg, Delta Lake, Paimon), which add a table layer on top of object storage.
3. Cloud‑native challenges for big data – Storage can be replaced by cloud services (e.g., OSS‑HDFS), but compute and scheduling remain difficult. The issues include scaling Spark containers quickly (thousands of pods), provisioning network resources fast enough for pod creation while sustaining bandwidth, and limited container disk space, which turns shuffle data into a bottleneck. Achieving cloud‑native big data therefore requires redesigning the storage, compute, and orchestration layers.
4. Feasibility and a practical case study – The conference demonstrated that the infrastructure cloud‑native big data depends on (Serverless containers, high‑performance networking, OSS bandwidth, OSS‑HDFS) has matured. Open‑source remote‑shuffle projects such as Celeborn further enable cloud‑native execution by moving shuffle data off the container's local disk. A detailed case study from MiHoYo showed how migrating Spark workloads to Alibaba Cloud Container Service for Kubernetes (ACK) achieved elastic scaling, a 50% cost reduction via spot instances, and compute‑storage decoupling using OSS‑HDFS and Celeborn.
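As a rough illustration of what pointing Spark at a remote shuffle service involves, the sketch below assembles the Spark properties that the Apache Celeborn documentation describes for its Spark client. It is not MiHoYo's configuration: the helper names and the master host names are hypothetical, and the property keys should be checked against the Celeborn version actually deployed.

```python
from typing import Dict, List


def celeborn_spark_conf(master_endpoints: List[str]) -> Dict[str, str]:
    """Build Spark properties that route shuffle data to a Celeborn cluster.

    master_endpoints: "host:port" strings for the Celeborn masters
    (hypothetical values in the example below).
    """
    if not master_endpoints:
        raise ValueError("at least one Celeborn master endpoint is required")
    return {
        # Replace Spark's built-in shuffle manager with the Celeborn client
        "spark.shuffle.manager": "org.apache.spark.shuffle.celeborn.SparkShuffleManager",
        # Where the Celeborn masters can be reached
        "spark.celeborn.master.endpoints": ",".join(master_endpoints),
        # The external shuffle service is not used with remote shuffle
        "spark.shuffle.service.enabled": "false",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Turn a property map into spark-submit '--conf key=value' arguments."""
    args: List[str] = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Hypothetical in-cluster service names for two Celeborn masters
    conf = celeborn_spark_conf(["celeborn-master-0:9097", "celeborn-master-1:9097"])
    print(" ".join(to_submit_args(conf)))
```

With shuffle blocks held by the remote service rather than on each executor's disk, executor pods can be added or reclaimed (for example, when spot instances are released) without losing shuffle output, which is the property the case study relies on.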
For more details, see the official technical keynote page (https://yunqi.aliyun.com/2023/techkeynotesession) and the referenced articles on big‑data 3.0 and MiHoYo’s cloud‑native practice.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
