Improving Iteration Outcomes with Cumulative Flow Diagrams and Environment Readiness in a Complex Microservice System
This article shows how cumulative flow diagram insights, tighter iteration commitments, and standardized, automated SIT/UAT environments reduced risk, overtime, and untested deployments on a multi-team microservice platform, underscoring that testability and operational readiness are prerequisites for continuous delivery.
About two weeks ago, an earlier article discussed using cumulative flow diagrams to uncover team execution problems. In the following iteration the team applied three of its recommendations: reduce the amount of work committed per iteration and involve developers in preparing the next iteration's demands; schedule multiple quality acceptance points (e.g., Wednesdays and Fridays) with the product and delivery managers, and where useful the developers; and fix bugs as early as possible.
The focus here is not on why developers should join demand analysis, but on the results of the improvements. Two cumulative flow diagrams comparing the previous and current iterations show that the team kept to its agreements, exposed risks early, and reduced end-of-iteration overtime.
However, the new iteration also showed fewer UAT‑deployed demands and almost no "project‑wide acceptance". The root cause was identified through daily stand‑ups: the environment preparation workflow was broken.
Excerpts from those stand-ups:
Product acceptance person: Yesterday’s demand cannot be accepted because module M (a microservice) is not yet deployed to SIT.
Sub‑domain B: The bug you reported is actually caused by module N not having the latest code in UAT.
Sub‑domain C: My part is finished, but sub‑domain D’s module is not ready, so acceptance is impossible.
Sub‑domain D: Our feature is blocked by an external team’s dependency; they need two more days to release.
Sub‑domain A: Our service runs on the development network and cannot reach the production network, so we must devise a new solution.
Sub‑domain C: The bug you mentioned comes from a new version of sub‑domain P’s module.
Sub‑domain A: The bug is actually a mismatch in data‑interface format; the original was XXX, now it must be YYY.
The system is a complex microservice‑based integration platform with four collaborating sub‑domains, six subsystems, dozens of microservices, deployments across two networks (development and production), and each subsystem needing integration with other platforms.
Note: the diagram above does not include common standard dependency services such as service discovery, nor lower-level infrastructure services such as databases. All external dependencies shown are business dependencies, i.e., services that must provide business output to the platform. The testing environment is equally complex: because parts of the system are deployed on both the development and production networks, the test environments must span both networks as well.
Initially, the team followed old habits: some developers created feature branches and deployed directly to SIT for debugging, while others used UAT because SIT was unavailable. This, combined with the dual‑network complexity, made it hard to know the state of SIT/UAT or which versions were deployed.
There was no clear process for environment usage, no dedicated testing team, and no ownership of the environments, so many demands were marked as completed yet could not be accepted.
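A first step out of this fog is simply making deployed versions visible. The article names no tooling, so the sketch below assumes, hypothetically, that each microservice exposes a `/version` endpoint; the service names and URLs are illustrative only.

```python
# Hypothetical version-inventory sketch: assumes each microservice
# exposes a /version endpoint returning JSON like {"version": "1.2.3"}.
# Service names and URLs below are illustrative, not from the article.
import json
import urllib.request

SIT_SERVICES = {
    "module-m": "http://sit.example.internal/module-m/version",
    "module-n": "http://sit.example.internal/module-n/version",
}

def http_fetch(url, timeout=5):
    """Fetch raw bytes from a version endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()

def deployed_versions(services, fetch=http_fetch):
    """Return {service: version or 'UNREACHABLE'} for one environment."""
    report = {}
    for name, url in services.items():
        try:
            report[name] = json.loads(fetch(url)).get("version", "unknown")
        except (OSError, ValueError):
            # URLError subclasses OSError; bad JSON raises ValueError.
            report[name] = "UNREACHABLE"
    return report

if __name__ == "__main__":
    for svc, ver in deployed_versions(SIT_SERVICES).items():
        print(f"{svc}: {ver}")
```

Run against SIT and UAT on a schedule, a report like this at least answers "which versions are deployed where" without anyone logging into servers.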
The solution is conceptually straightforward: define standard SIT, UAT, and Production environments, automate their provisioning, and implement automated health checks. Although the environment cleanup still took each sub-team about two weeks because of various odd issues, the automation now enforces three quality standards that an environment must meet to be considered "Ready":
Ensure test data isolation so manual verification and automated tests do not interfere with each other.
Provide an automated mechanism that guarantees non‑interference among teams and allows quick resolution of conflicts without manual coordination.
Enable rapid automated reconstruction of the environment after it is broken by test data or code changes.
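The three standards above amount to a gate: an environment is "Ready" only when every standard's automated probe passes. A minimal sketch of such a gate follows; the check names and stubbed probes are illustrative assumptions, not the team's actual automation.

```python
# Hypothetical "Ready" gate over the three environment standards.
# Probe bodies are stubs; real probes would query the environment
# (data namespaces, deployment locks, rebuild-pipeline status).
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Check:
    name: str
    probe: Callable[[], bool]

def environment_ready(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Return (ready, failed-check-names); ready only if all probes pass."""
    failures = [c.name for c in checks if not c.probe()]
    return (not failures, failures)

# Stub probes standing in for real isolation/conflict/rebuild checks.
checks = [
    Check("test data isolated per run", lambda: True),
    Check("no cross-team deployment conflicts", lambda: True),
    Check("one-click rebuild pipeline green", lambda: False),
]
ok, failed = environment_ready(checks)
print("READY" if ok else f"NOT READY: {failed}")
# → NOT READY: ['one-click rebuild pipeline green']
```

Wiring such a gate into the deployment pipeline turns "is SIT usable today?" from a stand-up debate into a machine-checked answer.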
Only with these standards in place can dozens of engineers deliver a high-quality integrated platform. One-click creation or repair of SIT/UAT will still take time to achieve, but a clearly identified problem is a solvable one.
Conclusion: to achieve continuous delivery, testability must be considered from the first line of code. This covers architectural support for easy testing in every environment, automated preparation of development and test environments, and isolation between the environments used for automated and manual testing.
This aligns with Chapter 5 of "Continuous Delivery 2.0", which emphasizes "testability" as a prerequisite for rapid, safe releases.