Improving Iteration Outcomes with Cumulative Flow Diagrams and Environment Readiness in a Complex Microservice System
This article shows how cumulative flow diagram insights, tighter iteration commitments, and standardized, automated SIT/UAT environments reduced risk, overtime, and untested deployments on a multi-team microservice platform, underscoring that testability and operational readiness are prerequisites for continuous delivery.
About two weeks ago, an earlier article discussed using cumulative flow diagrams to uncover team execution problems. In the following iteration the team applied three of its recommendations: reduce the amount of work committed per iteration and involve developers in preparing the next iteration's demands; schedule multiple quality acceptance points (e.g., Wednesdays and Fridays) with the product and delivery managers, and where useful the developers; and fix bugs as early as possible.
The focus here is not on why developers should join demand analysis, but on the results of the improvements. Two cumulative flow diagrams comparing the previous and current iterations show that the team kept to its agreements, exposed risks early, and reduced end-of-iteration overtime.
However, the new iteration also showed fewer UAT‑deployed demands and almost no "project‑wide acceptance". The root cause was identified through daily stand‑ups: the environment preparation workflow was broken.
Excerpts from those stand-ups:
Product acceptance person: Yesterday’s demand cannot be accepted because module M (a microservice) is not yet deployed to SIT.
Sub‑domain B: The bug you reported is actually caused by module N not having the latest code in UAT.
Sub‑domain C: My part is finished, but sub‑domain D’s module is not ready, so acceptance is impossible.
Sub‑domain D: Our feature is blocked by an external team’s dependency; they need two more days to release.
Sub‑domain A: Our service runs on the development network and cannot reach the production network, so we must devise a new solution.
Sub‑domain C: The bug you mentioned comes from a new version of sub‑domain P’s module.
Sub‑domain A: The bug is actually a mismatch in data‑interface format; the original was XXX, now it must be YYY.
The system is a complex microservice‑based integration platform with four collaborating sub‑domains, six subsystems, dozens of microservices, deployments across two networks (development and production), and each subsystem needing integration with other platforms.
Note: the diagram above does not include common standard dependency services such as service discovery, nor lower-level infrastructure services such as databases. All external dependencies shown are business dependencies, i.e., services that must provide business output to the platform. The testing environment is equally complex: because parts of the system are deployed on both the development and production networks, the test environments must span both networks as well.
Initially, the team followed old habits: some developers created feature branches and deployed directly to SIT for debugging, while others used UAT because SIT was unavailable. This, combined with the dual‑network complexity, made it hard to know the state of SIT/UAT or which versions were deployed.
There was no clear process for environment usage, no dedicated testing team, and no ownership of the environments, so many demands were marked as completed yet could not be accepted.
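A first step out of this fog is simply making deployed versions visible. The article names no tooling, so the sketch below assumes, hypothetically, that each microservice exposes a `/version` endpoint; the service names and URLs are illustrative only.

```python
# Hypothetical version-inventory sketch: assumes each microservice
# exposes a /version endpoint returning JSON like {"version": "1.2.3"}.
# Service names and URLs below are illustrative, not from the article.
import json
import urllib.request

SIT_SERVICES = {
    "module-m": "http://sit.example.internal/module-m/version",
    "module-n": "http://sit.example.internal/module-n/version",
}

def http_fetch(url, timeout=5):
    """Fetch raw bytes from a version endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()

def deployed_versions(services, fetch=http_fetch):
    """Return {service: version or 'UNREACHABLE'} for one environment."""
    report = {}
    for name, url in services.items():
        try:
            report[name] = json.loads(fetch(url)).get("version", "unknown")
        except (OSError, ValueError):
            # URLError subclasses OSError; bad JSON raises ValueError.
            report[name] = "UNREACHABLE"
    return report

if __name__ == "__main__":
    for svc, ver in deployed_versions(SIT_SERVICES).items():
        print(f"{svc}: {ver}")
```

Run against SIT and UAT on a schedule, a report like this at least answers "which versions are deployed where" without anyone logging into servers.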
The solution is conceptually straightforward: define standard SIT, UAT, and Production environments, automate their provisioning, and implement automated health checks. Although the environment cleanup still took each sub-team about two weeks because of various odd issues, the automation now enforces three quality standards that an environment must meet to be considered "Ready":
Ensure test data isolation so manual verification and automated tests do not interfere with each other.
Provide an automated mechanism that guarantees non‑interference among teams and allows quick resolution of conflicts without manual coordination.
Enable rapid automated reconstruction of the environment after it is broken by test data or code changes.
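The three standards above amount to a gate: an environment is "Ready" only when every standard's automated probe passes. A minimal sketch of such a gate follows; the check names and stubbed probes are illustrative assumptions, not the team's actual automation.

```python
# Hypothetical "Ready" gate over the three environment standards.
# Probe bodies are stubs; real probes would query the environment
# (data namespaces, deployment locks, rebuild-pipeline status).
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Check:
    name: str
    probe: Callable[[], bool]

def environment_ready(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Return (ready, failed-check-names); ready only if all probes pass."""
    failures = [c.name for c in checks if not c.probe()]
    return (not failures, failures)

# Stub probes standing in for real isolation/conflict/rebuild checks.
checks = [
    Check("test data isolated per run", lambda: True),
    Check("no cross-team deployment conflicts", lambda: True),
    Check("one-click rebuild pipeline green", lambda: False),
]
ok, failed = environment_ready(checks)
print("READY" if ok else f"NOT READY: {failed}")
# → NOT READY: ['one-click rebuild pipeline green']
```

Wiring such a gate into the deployment pipeline turns "is SIT usable today?" from a stand-up debate into a machine-checked answer.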
Only with these standards in place can dozens of engineers deliver a high-quality integrated platform. One-click creation or repair of SIT/UAT will still take time to achieve, but a clearly identified problem is a solvable one.
Conclusion: to achieve continuous delivery, testability must be considered from the first line of code. This covers architectural support for easy testing in every environment, automated preparation of development and test environments, and isolation between the environments used for automated and manual testing.
This aligns with Chapter 5 of "Continuous Delivery 2.0", which emphasizes "testability" as a prerequisite for rapid, safe releases.