How Ningbo Bank Boosted System Reliability with SRE: Lessons from a Level‑3 Assessment
Ningbo Bank’s personal mobile banking system passed the SRE Level‑3 assessment, showcasing how systematic SRE practices, metric‑driven reliability engineering, and cross‑team collaboration can dramatically improve system stability, reduce failures, and support digital transformation in the financial sector.
As digital technology advances rapidly, information systems have become increasingly critical, and keeping them stable poses new challenges. As the number of systems and the scale of business continue to grow, operations practice is evolving with them, and Site Reliability Engineering (SRE) is being adopted ever more widely across the industry.
On October 17, 2025, at the 27th GOPS Global Operations Conference in Shanghai, the China Academy of Information and Communications Technology announced the 35th batch of ITU DevOps International Standard / Domestic DevOps Standard dual‑certificate assessment results. Ningbo Bank’s personal mobile banking system successfully passed the China Academy’s SOMM Operations Maturity Level‑3 assessment for the “Information System Stability Assurance Capability Grading Requirements (SRE)”, marking a significant leap in its operational capabilities.
As one of the first small and medium‑sized banks to pass this SRE assessment, Ningbo Bank offers experience that other institutions can draw on.
Interview with Ong Yuzhe, Deputy General Manager of the Financial Technology Department, Ningbo Bank
Q: Please introduce yourself, your company, and the project you evaluated.
Ong Yuzhe explained that Ningbo Bank is one of China's 20 systemically important banks and ranks 72nd in the 2025 Global Bank 1000 list. The bank places great emphasis on fintech, offering a comprehensive, 24/7 mobile banking platform (iOS/Android/HarmonyOS) that integrates payment, investment, credit, and lifestyle services.
Q: How does achieving the SRE Level‑3 assessment feel?
He expressed great honor, noting that the achievement reflects strong internal support and teamwork, and marks a milestone for the bank’s system stability efforts.
Q: Why is building a stable and reliable IT system crucial for enterprise development?
System stability is not only a technical capability but also a core competitive advantage, especially in finance where it underpins business continuity and customer trust. With millions of mobile banking users, improving stability reduces risk, lowers operational costs, and enhances customer satisfaction and market competitiveness.
Q: Why did you choose to participate in this SRE assessment?
The bank aimed to benchmark against industry best practices, validate its own stability improvements (observability, fault‑tolerance, chaos engineering), and establish a long‑term improvement mechanism tailored to its needs.
Q: What changes did the SRE assessment bring to the enterprise and team?
The assessment reinforced fault prevention, monitoring, incident handling, and optimization capabilities, achieving deep integration of SLOs with technical operations and improving overall system reliability.
Q: What specific improvements were realized after the assessment?
Expanded SLO metrics to include traffic, saturation, latency, and error rates, introducing an error‑budget mechanism for quantitative stability management.
Enhanced observability through comprehensive monitoring, alerts, and visual system topology maps.
Established a mature incident‑response drill system, improving emergency handling and driving continuous improvement of alerts and response plans.
As a result, mean time between failures (MTBF) rose by 243% year‑over‑year, and the number of incidents fell by 63%.
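The error‑budget mechanism mentioned above can be illustrated with a minimal sketch. It assumes a request‑based availability SLO measured over a single window; the class name `ErrorBudget`, the 99.9% target, the request counts, and the "pause risky changes once the budget is spent" policy are all hypothetical illustrations, not details of Ningbo Bank's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Error budget for a request-based availability SLO over one window.

    All names and numbers here are illustrative assumptions.
    """

    slo_target: float      # hypothetical target, e.g. 0.999 = 99.9% of requests succeed
    total_requests: int    # requests served in the window
    failed_requests: int   # requests that violated the SLO in the window

    def __post_init__(self) -> None:
        if not 0.0 < self.slo_target < 1.0:
            raise ValueError("slo_target must be strictly between 0 and 1")
        if not 0 <= self.failed_requests <= self.total_requests:
            raise ValueError("failed_requests must be between 0 and total_requests")

    @property
    def allowed_failures(self) -> float:
        """Number of failed requests the SLO tolerates in this window."""
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def consumed_fraction(self) -> float:
        """Share of the budget already spent; exceeds 1.0 when the SLO is breached."""
        if self.total_requests == 0:
            return 0.0
        return self.failed_requests / self.allowed_failures

    @property
    def remaining_fraction(self) -> float:
        """Share of the budget still available; negative when overspent."""
        return 1.0 - self.consumed_fraction

    def risky_changes_allowed(self) -> bool:
        """Assumed policy: permit risky releases only while budget remains."""
        return self.remaining_fraction > 0.0


# Example window: 1,000,000 requests, 250 of them failed, against a 99.9% target.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
print(f"allowed failures: {budget.allowed_failures:.0f}")
print(f"budget remaining: {budget.remaining_fraction:.0%}")
print(f"risky changes allowed: {budget.risky_changes_allowed()}")
```

Expressing stability as a budget that is spent and replenished per window is what makes the management quantitative: a team can compare the remaining fraction against planned releases instead of debating stability in the abstract.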
Q: What challenges did you encounter during the assessment and how were they solved?
The main challenges were aligning the generic SRE standards with the bank’s specific architecture and fostering cross‑functional collaboration. The bank addressed these by conducting gap analyses, phased iterations, and forming a virtual stability team to promote SRE culture across development, testing, management, and business units.
Q: What successful experiences can you share about implementing SRE internally?
Key factors include shifting team mindset toward engineering excellence, establishing benchmark projects (such as the personal mobile banking system) to demonstrate value, and investing in automation tools while cultivating a culture of shared responsibility for stability.
Q: What are your future plans for SRE work?
The bank will (1) replicate the mobile banking success across other critical systems, (2) standardize and integrate operational tools to enhance observability, and (3) deepen SRE cultural practices—error budgeting, post‑mortems, and cross‑team collaboration—to make “stability first” a daily consensus.
Q: How do you see the future development and trends of SRE?
He believes SRE will evolve toward platformization, intelligence, and value‑orientation, becoming a bridge between technical investment and business outcomes, supporting capacity planning, risk pricing, and product innovation.
