From Contributor to Committer: Lessons from ByteDance’s Apache Flink Journey
ByteDance’s streaming computing team members Fang Yong and Hu Weihua share their path from early Flink adopters to Apache Flink Committers, detailing their contributions to Runtime Coordinator and Streaming Warehouse, the challenges of open‑source involvement, and practical advice for developers seeking to engage with the Flink community.
Interview Overview
This article is based on an interview with Fang Yong and Hu Weihua from ByteDance’s streaming computing team. Both have contributed major features such as Runtime Coordinator and Streaming Warehouse to the Apache Flink project and were officially invited to become Apache Flink Committers in July 2023.
Why Open Source Matters
In the software development world, open source has become a mainstream focus. Since 2017, ByteDance’s streaming computing team has been using Apache Flink as its stream processing engine and has gradually increased its investment in the open‑source community.
Becoming a Committer
In the past two months, Fang and Hu were each invited to become Apache Flink Committers. This interview explores their personal journeys and motivations for contributing to the open‑source community.
My Open‑source Journey
Apache Flink is a high‑performance distributed computing framework that has become the de‑facto standard for stream processing. The community responds quickly to user questions (often within a day) and maintains a high technical standard. Flink’s ecosystem has expanded to include streaming‑batch integration, OLAP, and Streaming Warehouse, all of which have been adopted within ByteDance.
As a Flink Runtime engineer, I have deepened my understanding of the project’s design principles and felt motivated to give back. My contributions fall into two areas:
Actively answering user questions and helping users apply Flink effectively.
Contributing code to scheduling and resource management to improve Flink’s scheduling performance and reduce maintenance costs.
In August 2023, I was honored to be invited to become an Apache Flink Committer.
My current effort centers on Runtime Coordinator development. We also maintain internal customizations, which we actively contribute back to the community, and we prioritize upstreaming new features.
Participating in Open‑source Is a “Recharging Journey”
Hu Weihua: I believe open‑source participation benefits individuals, teams, companies, and the community. Individuals improve their technical skills and broaden their solution space. Teams foster innovation and avoid isolated development, which is crucial for ByteDance’s extensive use of Flink. Companies enhance their brand and technical reputation, while broader community involvement accelerates problem solving.
During a performance‑optimization project, we initially changed the Flink job deployment workflow, but after extensive community discussion we shifted to adding a cache in the TaskManager, achieving the same goal with less disruption. This experience deepened my understanding of community processes and the power of collaborative development.
Open‑source work has also expanded my technical perspective. By answering user questions, I have encountered diverse business scenarios, which broadened my thinking. Code reviews with seasoned Committers and PMC members have accelerated my technical growth.
“Pioneer” Tips
Hu Weihua: Be bold and meticulous. Express your ideas confidently, and communicate clearly to reduce friction.
Fang Yong: My key suggestions are:
Gain a deeper understanding of open‑source community mechanisms and learn how to encourage teammates to contribute.
Build relationships with peers in related fields to stay informed about industry trends.
Stay updated on the community’s roadmap and core feature directions to align internal planning.
Leverage the community’s emphasis on rationality and scalability to drive internal technical growth.
ByteDance Streaming Computing Team
The team supports a wide range of core ByteDance services—including machine‑learning platforms, recommendation, data warehousing, search, advertising, streaming media, security, and risk control. We address challenges of massive single‑job workloads (tens of millions of QPS) and large‑scale clusters (tens of thousands of machines), optimizing Flink’s SQL, state & checkpoint, and runtime components.
In 2022, our Flink‑based streaming compute product was launched on Volcano Engine, providing cloud‑native compute capabilities to external users.
Volcano Engine Developer Services
The Volcano Engine Developer Community is Volcano Engine’s developer-facing (ToD) community. It connects the platform with developers by offering cutting-edge technical content and diverse events, nurturing a vibrant developer culture, and co-building an open-source ecosystem.
