How a Bank’s Veteran Engineer Achieved Seamless Mainframe Disaster Recovery
In this interview, senior China Bank systems engineer Lu Yang looks back on 34 years in mainframe operations. He covers the 2018 seamless disaster-recovery switchover, the importance of focus, continuous learning, and a sense of risk, and future trends such as AIOps, security, and the enduring value of mainframe technology.
Interview Introduction
Technology evolves rapidly; the interview explores how operations professionals can stay true to their purpose, work diligently, and excel. The High‑Efficiency Operations community interviewed Lu Yang, a senior engineer in China Bank's system management division, who has been with the bank since 1985 and has spent over three decades in system operations.
Background and Mainframe Experience
Lu Yang describes his early work on mainframe platforms, noting that thirty years ago mainframes were the "black technology" (cutting-edge technology) of financial IT, much as cloud computing and big data are today. He takes pride in having devoted his career to mainframe systems.
Key Projects and 2018 Seamless Disaster‑Recovery
He participated in many IT system construction and transformation projects, including logical centralization, the two-site three-center architecture, and global integration. His most notable achievement was the 2018 same-city disaster-recovery switchover of the bank's mainframe platform, which completed with no transaction interruption. It was the first truly seamless switchover in the bank's history and was later recognized by IBM as a global best practice.
Technical Insight: SYSPLEX Concept
Lu Yang proposed building a cross-site parallel-coupled environment (SYSPLEX) in which the production and backup sites would both remain members of the same SYSPLEX while systems at one site were gradually shut down and systems at the other were started, so that transaction processing never stopped. Eight months of testing across the network, system, and transaction layers proved the concept feasible.
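The idea of shutting down and starting members gradually while the cluster keeps serving transactions can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Member`, `Cluster`, and `rolling_switchover` names, the round-robin routing, and the system names are all hypothetical and do not represent the bank's procedure or any z/OS interface.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Member:
    """One system image in a hypothetical cross-site cluster."""
    name: str
    site: str
    active: bool = False


@dataclass
class Cluster:
    members: list[Member] = field(default_factory=list)

    def active_members(self) -> list[Member]:
        return [m for m in self.members if m.active]

    def route(self, txn_id: int) -> str:
        """Send a transaction to an active member; fail if none is up."""
        active = self.active_members()
        if not active:
            raise RuntimeError(f"transaction {txn_id} found no active member")
        return active[txn_id % len(active)].name


def rolling_switchover(cluster: Cluster, from_site: str, to_site: str) -> list[str]:
    """Move the workload between sites one member at a time.

    After every single start or stop, one probe transaction is routed to
    show that at least one member is still able to process work.
    """
    served_by: list[str] = []
    txn_id = 0
    # Phase 1: bring up the target site while the source site keeps running.
    for member in cluster.members:
        if member.site == to_site:
            member.active = True
            served_by.append(cluster.route(txn_id))
            txn_id += 1
    # Phase 2: only now stop the source-site members, one by one.
    for member in cluster.members:
        if member.site == from_site:
            member.active = False
            served_by.append(cluster.route(txn_id))
            txn_id += 1
    return served_by


# Hypothetical four-member cluster spanning two sites.
cluster = Cluster([
    Member("PRODA", "production", active=True),
    Member("PRODB", "production", active=True),
    Member("BKUPA", "backup"),
    Member("BKUPB", "backup"),
])
served_by = rolling_switchover(cluster, "production", "backup")
print(served_by)
```

The ordering is the point of the sketch: because the target site joins before any source member leaves, there is no moment at which the set of active members is empty.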
Career Philosophy and Advice for Young Professionals
He emphasizes the importance of focus, of digging deep into technology, and of meticulous note-taking. Younger engineers have access to abundant information, but they often lack depth and end up as a "jack-of-all-trades". Lu Yang advises them to cultivate disciplined learning habits, keep learning continuously, and balance Dev (development) and Ops (operations) skills.
Respect for Details and Risk Sense
Operations work demands attention to detail; even a small action such as formatting a disk can have critical consequences. Lu Yang shares an incident in which a system hang was resolved quickly because he recognized a subtle subsystem issue. It illustrates the value of a well-developed sense of risk, built up through experience and systematic fault analysis.
Communication, Continuous Learning, and Digital Transformation
Effective communication and ongoing learning are essential. He likens knowledge sharing to widening the mouth of a well: a wide enough opening is what makes deeper drilling possible. Digital transformation and the convergence of operations and development require a broader understanding of open-source technology, distributed architectures, and business applications.
Motivation and Curiosity
Lu Yang attributes his enduring motivation to a curiosity about life and a desire to understand even seemingly trivial system messages, which often reveal hidden issues. He stresses the need to avoid being swayed solely by trends and to maintain a genuine interest in underlying technologies.
Mainframe Advantages Over Open‑Source Platforms
He outlines three core advantages: (1) hardware‑level stability and security through strict layered architecture and firmware‑based virtualization; (2) proprietary communication protocols that reduce network dependency; (3) fine‑grained security management and strong transaction consistency guaranteed by system‑level mechanisms.
Lock Mechanism Comparison
In mainframe environments, lock structures reside in dedicated hardware storage (the Coupling Facility) and are accessed through operating-system services, whereas open-source solutions implement locking in software components. In his view, this gives mainframe locks higher performance and reliability.
Oracle and DB2 Perspectives
Oracle RAC also provides lock mechanisms, but they are implemented in the database's memory and communicated over generic network protocols, unlike the mainframe's hardware-assisted approach. DB2 disaster recovery is achieved through disk-level synchronous replication between sites 40 km apart, ensuring zero data loss and transaction integrity.
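Two points in this passage lend themselves to a short worked sketch: what 40 km costs in propagation delay, and why synchronous replication implies zero data loss. The figure of roughly 5 microseconds per kilometre is a common rule of thumb for light in optical fiber and is an assumption here, not a number from the interview. Real links add device latency and a route longer than the straight-line distance. The `SyncMirroredVolume` class is hypothetical and models no particular storage product.

```python
from __future__ import annotations

# Assumption: about 5 microseconds of one-way delay per km of fiber.
FIBER_DELAY_US_PER_KM = 5.0


def round_trip_us(distance_km: float) -> float:
    """Minimum propagation delay added to each synchronously mirrored write."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM


class SyncMirroredVolume:
    """Toy model of disk-level synchronous replication.

    A write is acknowledged only after both copies hold the data, so an
    acknowledged write can never exist at one site alone.
    """

    def __init__(self) -> None:
        self.primary: dict[int, bytes] = {}
        self.secondary: dict[int, bytes] = {}
        self.link_up = True

    def write(self, block: int, data: bytes) -> None:
        if not self.link_up:
            # Simplification: refuse the write rather than let the copies diverge.
            raise ConnectionError("remote copy unavailable; write not acknowledged")
        self.secondary[block] = data  # remote copy is updated first
        self.primary[block] = data    # returning normally is the acknowledgment


delay = round_trip_us(40)
print(f"added round-trip delay per write: {delay} microseconds")

volume = SyncMirroredVolume()
volume.write(7, b"ledger entry")
print(volume.primary == volume.secondary)
```

Under these assumptions each mirrored write pays a fraction of a millisecond in propagation delay, which is the trade-off behind keeping synchronous replication within a same-city distance.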
Future Trends in Operations
Lu Yang sees strong growth for AIOps and data‑driven automation, but stresses that solid data governance, reduction of false alarms, and human decision‑making remain essential foundations. He also highlights the increasing importance of network and data security for large enterprises and government institutions.
Coexistence of Centralized and Distributed Architectures
He believes centralized mainframe platforms will continue to host core banking and sensitive transaction data due to their stability and security, while distributed platforms will serve mobile apps, channel services, and analytics workloads.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.