Why Traditional Ops Platforms Fail and How to Build an Effective Ops Middle Platform
This article analyses the shortcomings of fragmented operations platforms, explains how organizational and business domain factors shape a successful ops middle platform, and presents a lifecycle‑driven, multi‑layer integration approach for building a scalable, cloud‑native operations ecosystem.
Introduction
A viral post claimed that a two‑year middle‑platform project had been cancelled and its CIO fired, calling the middle platform "the shortest joke" and a "metaphysical" concept. The article argues that a middle platform is not only a technical problem but also an organizational and business one.
Chapter 1 – Past Ops Platform Construction Approaches
Since the end of 2014, the rise of internet‑scale operations has led traditional industries to focus on building ops platforms. The evolution is typically described in five stages:
Manual Ops : Human‑performed tasks such as release, incident handling, and inspection.
Scripted Ops : Automation scripts replace manual steps, though many scripts still encapsulate human actions.
Automated Platform Ops : Visual platforms encapsulate scenario‑based jobs, including channels, atomic job libraries, and orchestration.
Data‑Driven Ops : Once automation has reduced manual labor, fine‑grained operations rely on data to drive, express, and optimize processes, an approach known as IT operations analytics (ITOA).
Intelligent Ops : AI models take over tasks such as root‑cause analysis, impact analysis, prediction, and anomaly detection, aiming for AIOps and eventually NoOps.
Past ops platform construction suffered from fragmentation and ad‑hoc projects because of:
Lack of overall planning : Few ops departments propose comprehensive designs.
Siloed organization : Functional silos prevent cross‑department knowledge sharing; a CMDB built under a siloed structure is a typical example.
Legacy accumulation : Historical baggage is inevitable, but rebuilding without continuity leads to repeated effort.
Over‑reliance on vendors : Multiple vendors create overlapping solutions, resulting in unclear responsibilities and stagnant evolution.
Fragmented platforms cannot meet the core ops goals of high availability, continuity, cost, efficiency, and quality.
Chapter 2 – Ops Organization Structure Discussion
The article revisits Conway's Law and its inverse, highlighting the mismatch between organizational and system architectures. A functional org (network, DB, security, NOC) often leads to fragmented platforms. The author advocates an "application‑oriented ops + ops development" structure, similar to TOGAF, and stresses that cloud‑native applications will deepen this shift.
Ops development can follow three models based on company size and resources:
Self‑built : Requires substantial R&D investment; suitable for large enterprises.
Co‑development : Core capabilities are outsourced but developed openly with partners; fits medium‑size firms.
External development : Platforms are fully purchased from third parties; appropriate for small‑to‑mid companies.
The distinction between modular platforms and fragmented ones is emphasized, noting that modularity does not equal fragmentation.
Chapter 3 – Ops Business Domain Division
Using a lifecycle view, the ops process is split into four phases: asset delivery, resource delivery, application delivery, and operational management. From this, nine business domains are defined:
Asset Management Domain (budget, procurement, delivery)
Resource Management Domain (unified IT resource management)
Resource Delivery Domain (cloud resource provisioning)
Application Delivery Domain (deployment state)
Application Runtime Domain (running state)
Service Delivery Domain (deployment state)
Service Runtime Domain (running state)
Operation Management Domain (process management)
Operation Scheduling Domain (operational management)
Clarifying these domains sets the boundaries for platform construction and avoids the "big and all‑encompassing" approach, in which a single platform tries to cover everything.
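One way to make these boundaries concrete is to record which domains each platform claims and flag any domain claimed more than once. The following is only a minimal sketch: the nine domain names come from the list above, while the platform names (cmdb, cloud-portal, release-tool) and the find_boundary_conflicts helper are hypothetical and not part of the original article.

```python
from collections import defaultdict
from enum import Enum


class Domain(Enum):
    """The nine business domains listed above."""
    ASSET_MANAGEMENT = "Asset Management"
    RESOURCE_MANAGEMENT = "Resource Management"
    RESOURCE_DELIVERY = "Resource Delivery"
    APPLICATION_DELIVERY = "Application Delivery"
    APPLICATION_RUNTIME = "Application Runtime"
    SERVICE_DELIVERY = "Service Delivery"
    SERVICE_RUNTIME = "Service Runtime"
    OPERATION_MANAGEMENT = "Operation Management"
    OPERATION_SCHEDULING = "Operation Scheduling"


def find_boundary_conflicts(platforms):
    """Return {domain: [platform names]} for domains claimed by more than one platform.

    `platforms` maps a platform name to the set of domains it claims to own.
    """
    owners = defaultdict(list)
    for name, domains in platforms.items():
        for domain in domains:
            owners[domain].append(name)
    return {d: sorted(names) for d, names in owners.items() if len(names) > 1}


# Hypothetical platform inventory, for illustration only.
platforms = {
    "cmdb": {Domain.RESOURCE_MANAGEMENT},
    "cloud-portal": {Domain.RESOURCE_DELIVERY, Domain.RESOURCE_MANAGEMENT},
    "release-tool": {Domain.APPLICATION_DELIVERY},
}
conflicts = find_boundary_conflicts(platforms)
for domain, names in conflicts.items():
    print(f"{domain.value} is claimed by: {', '.join(names)}")
```

In this made-up inventory two platforms both claim the Resource Management Domain, which is the kind of overlap a domain division is meant to expose before construction starts.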
Chapter 4 – How an Ops Middle Platform Forms
The article outlines four integration layers that together create a complete ops middle platform:
Portal‑level URI integration : Unified entry points for tasks, services, information, and products.
Micro‑app UI integration : Re‑package service‑platform APIs as micro‑apps for personalized delivery.
API‑gateway integration : Consolidate capabilities from multiple platforms behind a unified gateway.
CMDB data integration : Adopt a "one data model" to integrate static data across systems.
The original article illustrates these layers with diagrams.
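To illustrate the API‑gateway layer, the sketch below maps public path prefixes to the platforms that own them, so callers see one entry point while each capability stays with its platform. It is an illustration under assumptions rather than the article's design: the Route and Gateway names, the path prefixes, and the backend URLs are all hypothetical, and a real gateway would also handle authentication, rate limiting, and request forwarding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str   # public path prefix exposed by the gateway
    backend: str  # base URL of the platform that owns the capability


class Gateway:
    def __init__(self, routes):
        # Longest prefix first, so the most specific route wins.
        self._routes = sorted(routes, key=lambda r: len(r.prefix), reverse=True)

    def resolve(self, path):
        """Map a public path to the backend URL that should serve it."""
        for route in self._routes:
            # Match whole path segments only: "/api/monitor" must not
            # capture "/api/monitoring".
            if path == route.prefix or path.startswith(route.prefix + "/"):
                return route.backend + path[len(route.prefix):]
        raise LookupError(f"no route for {path}")


# Hypothetical routes and backends, for illustration only.
gateway = Gateway([
    Route("/api/cmdb", "http://cmdb.internal"),
    Route("/api/monitor", "http://monitor.internal"),
    Route("/api/monitor/alerts", "http://alerting.internal"),
])

print(gateway.resolve("/api/cmdb/hosts"))
print(gateway.resolve("/api/monitor/alerts/42"))
```

Sorting by prefix length lets a more specific capability (here, alerts) live on a separate platform without changing the public path that callers use.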
Four‑layer example:
General Capability Layer – shared technical capabilities.
Service Middle‑Platform Layer – reusable business capabilities organized by domain.
Micro‑App Layer – personalized capability packaging.
Portal Layer – aggregates multiple dimensions for end‑users.
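The relationship between the four layers can be sketched in a few lines of code: a shared technical capability at the bottom, domain services above it, a micro‑app that packages several service calls for one audience, and a portal that exposes micro‑apps by name. Every class, function, and data value below (send_notification, MonitoringService, CmdbService, AlertDigestApp, Portal, the billing application and its owner) is a hypothetical stand‑in chosen for illustration, not something defined in the original article.

```python
# General capability layer: a shared technical capability (hypothetical).
def send_notification(recipient, message):
    return f"to {recipient}: {message}"


# Service middle-platform layer: reusable capabilities organized by domain.
class MonitoringService:
    def __init__(self, alerts):
        self._alerts = list(alerts)

    def open_alerts(self, app):
        return [a for a in self._alerts if a["app"] == app and a["open"]]


class CmdbService:
    def __init__(self, owners):
        self._owners = dict(owners)

    def owner_of(self, app):
        return self._owners[app]


# Micro-app layer: packages several service calls for one audience.
class AlertDigestApp:
    def __init__(self, monitoring, cmdb, notify):
        self._monitoring = monitoring
        self._cmdb = cmdb
        self._notify = notify

    def run(self, app):
        alerts = self._monitoring.open_alerts(app)
        owner = self._cmdb.owner_of(app)
        return self._notify(owner, f"{len(alerts)} open alert(s) for {app}")


# Portal layer: a single entry point that exposes micro-apps by name.
class Portal:
    def __init__(self):
        self._apps = {}

    def register(self, name, micro_app):
        self._apps[name] = micro_app

    def open(self, name, *args):
        return self._apps[name].run(*args)


# Sample data, for illustration only.
monitoring = MonitoringService([
    {"app": "billing", "open": True},
    {"app": "billing", "open": False},
    {"app": "search", "open": True},
])
cmdb = CmdbService({"billing": "alice"})

portal = Portal()
portal.register("alert-digest", AlertDigestApp(monitoring, cmdb, send_notification))
print(portal.open("alert-digest", "billing"))
```

The micro‑app holds no data of its own; it only composes the services beneath it, which is what keeps the service layer reusable across different micro‑apps.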
Chapter 5 – Industry‑Specific Ops Middle Platform Deployment
The article warns against building a middle platform for its own sake, emphasizing business goals, API openness, and rapid application development (RAD). It stresses the need for:
Breaking silos in large ops organizations.
Providing an open API ecosystem.
Offering a RAD environment for custom extensions.
Defining clear roles for ops developers and experts.
Leadership must understand ops business objectives; otherwise, technology‑first approaches lead to failure. The middle platform should be built on a solid lifecycle and domain division, not as a brand‑new “magic” solution.
Conclusion
Do not assume a middle platform solves everything; focus on the influencing factors, service system, and capability construction first, then evolve toward a true middle‑platform architecture.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.