A Generic State Machine Solution for Managing Business Entity Lifecycles
This article presents a state-machine-based approach to managing the lifecycle of business entities such as orders and work orders. It covers the core pain points, the questions a state machine must answer, a comparison of four implementation options, and a recommended solution that combines a database transition table, a domain service, and optimistic-lock concurrency control, along with architecture diagrams, code snippets, and operational guidelines.
Problem Definition
Business entities such as orders, work orders, and approval documents suffer from illegal state transitions, scattered state logic, inconsistent states, difficult extensions, lack of audit, and concurrency conflicts when state changes are not centrally controlled.
Core insight: state fields are treated as ordinary database columns, so nothing manages state as a first-class concept.
What a State Machine Must Answer
Legal transition definition
Guard conditions
Actions and side effects
Atomicity of state change + side effects
Concurrency safety
Observability
Solution Options
Four approaches were compared:
Spring Statemachine – high implementation complexity, medium distributed friendliness.
Database + domain service – low complexity, low transformation cost, optimistic‑lock concurrency safety, good audit support.
Event Sourcing – very high complexity, natural concurrency support, built‑in audit.
Camunda workflow engine – high complexity, good distributed support.
The recommended solution is the second option: a transition table in the database, a domain service, and optimistic‑lock CAS updates.
Overall Architecture
Three layers:
Business call layer: services invoke StateMachineService.fire() to trigger state changes.
State machine engine layer: validates transitions, executes guards, performs CAS updates, and dispatches side effects.
Data storage layer: stores state machine definitions, transition rules, and audit logs.
Key tables:
state_machine_definition
state_transition
state_change_log
Business entity table with added columns state, state_version, state_updated_at
Core Engine Flow (fire())
Load the entity by ID and check it against the caller's expected version (optimistic lock).
Load transition rules for the current state and event.
Evaluate guard conditions in priority order.
Perform CAS update of state and version.
Insert an audit log record.
Dispatch side effects (MQ, local message table, notifications).
Return a StateChangeResult containing old state, new state, event, and new version.
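The seven steps above can be sketched as follows. This is a minimal in-memory illustration, not the article's implementation: the Order and Transition types, the exception constructors, and the convention that a lower priority value is evaluated first are all assumptions, and plain collections stand in for the database tables and the message queue.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// All names below are hypothetical; in-memory collections stand in for the tables.
class Order {
    final long id;
    String state;
    long stateVersion = 0;
    Order(long id, String state) { this.id = id; this.state = state; }
}

// One row of state_transition; a lower priority value is evaluated first (an assumption).
record Transition(String from, String event, String to, int priority, Predicate<Order> guard) {}

record StateChangeResult(String oldState, String newState, String event, long newVersion) {}

class IllegalTransitionException extends RuntimeException {
    IllegalTransitionException(String msg) { super(msg); }
}

class ConcurrentConflictException extends RuntimeException {
    ConcurrentConflictException(String msg) { super(msg); }
}

class StateMachineService {
    private final Map<Long, Order> orders = new HashMap<>();   // entity table
    private final List<Transition> rules = new ArrayList<>();  // state_transition
    final List<String> changeLog = new ArrayList<>();          // state_change_log
    final List<String> dispatched = new ArrayList<>();         // MQ / outbox stand-in

    void save(Order o) { orders.put(o.id, o); }
    void addRule(Transition t) { rules.add(t); }

    // Emulates: UPDATE ... SET state = ?, state_version = state_version + 1
    //           WHERE id = ? AND state_version = ?   (returns affected rows)
    private synchronized int casUpdate(long id, long expectedVersion, String newState) {
        Order o = orders.get(id);
        if (o == null || o.stateVersion != expectedVersion) return 0;
        o.state = newState;
        o.stateVersion = expectedVersion + 1;
        return 1;
    }

    StateChangeResult fire(long id, String event, long expectedVersion) {
        // 1. Load the entity and check the caller's expected version.
        Order o = orders.get(id);
        if (o == null) throw new IllegalArgumentException("unknown entity " + id);
        if (o.stateVersion != expectedVersion)
            throw new ConcurrentConflictException("stale version for entity " + id);
        String oldState = o.state;

        // 2. Load the rules for (current state, event), sorted by priority.
        List<Transition> candidates = rules.stream()
            .filter(t -> t.from().equals(oldState) && t.event().equals(event))
            .sorted(Comparator.comparingInt(Transition::priority))
            .toList();
        if (candidates.isEmpty())
            throw new IllegalTransitionException(oldState + " does not accept " + event);

        // 3. The first rule whose guard passes wins.
        Transition chosen = candidates.stream()
            .filter(t -> t.guard().test(o))
            .findFirst()
            .orElseThrow(() -> new IllegalTransitionException("all guards rejected " + event));

        // 4. CAS update; zero affected rows means another writer got there first.
        if (casUpdate(id, expectedVersion, chosen.to()) == 0)
            throw new ConcurrentConflictException("lost CAS race for entity " + id);

        // 5. Audit log and 6. side effects (here simply recorded in lists).
        changeLog.add(id + ":" + oldState + "->" + chosen.to() + " on " + event);
        dispatched.add(event + ":" + id);

        // 7. Result for the caller.
        return new StateChangeResult(oldState, chosen.to(), event, expectedVersion + 1);
    }
}
```

In a real deployment steps 4 to 6 would run inside one database transaction, and the version check in step 1 is only an early exit: the WHERE clause of the CAS update is what actually protects against concurrent writers.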
Guard Mechanism
Three guard types are supported:
BEAN – Spring bean reference, debuggable and unit‑testable.
SPEL – SpEL expression evaluated at runtime, flexible but harder to debug.
NONE – no guard; the transition proceeds whenever its rule matches the current state and event.
Recommended hybrid strategy: use BEAN for complex checks, SPEL for simple conditions, and whitelist functions to avoid arbitrary code execution.
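As a rough illustration of how guard types might be resolved, the sketch below handles NONE and BEAN and wraps every bean guard so that an exception counts as a rejection. GuardConfig, GuardResolver, and the bean registry map are hypothetical names; the map merely stands in for a Spring application context. SPEL evaluation is deliberately left out because it needs Spring's expression module and a restricted evaluation context.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

enum GuardType { BEAN, SPEL, NONE }

// Hypothetical guard configuration as it might be stored on a transition rule.
record GuardConfig(GuardType type, String ref) {}

class GuardResolver<T> {
    // Stands in for the Spring application context: bean name -> guard.
    private final Map<String, Predicate<T>> beans = new HashMap<>();

    void register(String name, Predicate<T> guard) { beans.put(name, guard); }

    Predicate<T> resolve(GuardConfig cfg) {
        switch (cfg.type()) {
            case NONE:
                return entity -> true;   // no extra check beyond the rule itself
            case BEAN:
                Predicate<T> g = beans.get(cfg.ref());
                if (g == null) throw new IllegalStateException("no guard bean " + cfg.ref());
                return safe(g);
            case SPEL:
                // A real engine would pass cfg.ref() to an expression evaluator with a
                // whitelisted context; that part is out of scope for this sketch.
                throw new UnsupportedOperationException("SPEL guards are not sketched here");
            default:
                throw new IllegalArgumentException("unknown guard type");
        }
    }

    // A guard that throws is treated as a rejection, with the reason logged.
    private Predicate<T> safe(Predicate<T> g) {
        return entity -> {
            try {
                return g.test(entity);
            } catch (RuntimeException e) {
                System.err.println("guard failed: " + e);
                return false;
            }
        };
    }
}
```

Keeping complex checks as named beans means each one can be unit-tested in isolation, which is the main argument the article gives for preferring BEAN over SPEL.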
Concurrency Safety
The state_version field provides optimistic‑lock protection. If the CAS update affects zero rows, a ConcurrentConflictException is thrown. Callers may retry up to three times before surfacing an error.
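A caller-side retry loop could look like the sketch below. OptimisticRetry and withRetry are hypothetical helpers, and "up to three times" is interpreted here as three attempts in total, which is an assumption. Each attempt is expected to reload the entity so that it picks up the latest state_version before calling fire() again.

```java
import java.util.function.Supplier;

class ConcurrentConflictException extends RuntimeException {
    ConcurrentConflictException(String msg) { super(msg); }
}

final class OptimisticRetry {
    private OptimisticRetry() {}

    // Runs the attempt up to maxAttempts times. Only optimistic-lock conflicts are
    // retried; any other exception propagates immediately.
    static <T> T withRetry(int maxAttempts, Supplier<T> attempt) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        ConcurrentConflictException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.get();
            } catch (ConcurrentConflictException e) {
                last = e;   // another writer won; try again with fresh data
            }
        }
        throw last;         // all attempts conflicted, surface the error
    }
}
```

Retrying only makes sense when the transition is still legal from the new state; if the reloaded entity has already moved on, the next attempt fails as an illegal transition rather than as a conflict.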
Side‑Effect Strategies
Three consistency models are compared:
Strong (synchronous) – state change and side effect occur in the same transaction; highest reliability but long transactions.
Eventual (asynchronous MQ) – side effects are sent after transaction commit; low latency, MQ retry handles failures.
Reliable event (local outbox) – state and message are persisted in the same transaction; a scheduler scans the outbox and delivers pending messages with retry. Delivery is at-least-once, so consumers should be idempotent.
Key principle: side‑effect failure does not roll back the state unless strong consistency is required.
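The local outbox model can be illustrated with the in-memory sketch below. Outbox, OutboxMessage, and IdempotentConsumer are hypothetical names, and a list stands in for the outbox table. A relay of this kind can deliver a message more than once (for example, if it crashes after sending but before marking the row as sent), so the consumer deduplicates by message ID.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical outbox row, written in the same transaction as the state change.
class OutboxMessage {
    final String id;
    final String payload;
    boolean sent = false;
    OutboxMessage(String id, String payload) { this.id = id; this.payload = payload; }
}

class Outbox {
    private final List<OutboxMessage> rows = new ArrayList<>();

    // Called inside the state-change transaction (here: an in-memory append).
    void append(String id, String payload) { rows.add(new OutboxMessage(id, payload)); }

    // Scheduler entry point: try every pending row, mark it sent only on success.
    // The sender returns true when the broker acknowledged the message.
    int relay(Predicate<OutboxMessage> sender) {
        int delivered = 0;
        for (OutboxMessage m : rows) {
            if (m.sent) continue;
            if (sender.test(m)) {
                m.sent = true;
                delivered++;
            }
        }
        return delivered;
    }

    // Comparable to a pending-message gauge for monitoring.
    long pendingCount() { return rows.stream().filter(m -> !m.sent).count(); }
}

// Consumer side: remembers processed IDs so a redelivered message is ignored.
class IdempotentConsumer {
    private final Set<String> seen = new HashSet<>();
    final List<String> handled = new ArrayList<>();

    void onMessage(String id, String payload) {
        if (!seen.add(id)) return;   // duplicate delivery, already handled
        handled.add(payload);
    }
}
```

A failed send leaves the row pending, so the state change itself is never rolled back; the next scheduler pass simply tries again.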
Timeout Mechanism
Automatic state timeout (e.g., cancel after 30 minutes) can be implemented by:
RocketMQ delayed messages – second‑level precision, no scanning overhead.
XXL‑Job scheduled tasks – minute‑level tolerance, simple but incurs scan delay.
In‑process time wheel (HashedWheelTimer) – low latency for single‑node scenarios, not suitable for distributed persistence.
Recommendation: use RocketMQ delayed messages as the primary mechanism, with XXL‑Job scheduled scans as a fallback for missed deliveries.
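The scheduled fallback scan might be built around a selection step like the one below. EntityState and TimeoutScanner are hypothetical names, a list stands in for a query on the entity table, and the "TIMEOUT" event mentioned in the comment is an assumed event name.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical projection of the entity table's three state columns.
record EntityState(long id, String state, long stateVersion, Instant stateUpdatedAt) {}

final class TimeoutScanner {
    private TimeoutScanner() {}

    // Returns entities that have stayed in `state` longer than `limit` as of `now`.
    // A scheduled job could then call fire(id, "TIMEOUT", stateVersion) for each one.
    static List<EntityState> findExpired(List<EntityState> rows, String state,
                                         Duration limit, Instant now) {
        Instant cutoff = now.minus(limit);
        return rows.stream()
            .filter(r -> r.state().equals(state))
            .filter(r -> r.stateUpdatedAt().isBefore(cutoff))
            .collect(Collectors.toList());
    }
}
```

Because every timeout still goes through fire(), a delayed message and the fallback scan racing on the same entity is harmless: whichever arrives second fails the version check or finds no legal transition.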
Rollback and Compensation
Rollback is treated as another state transition using the same fire() flow. When side effects have already executed (e.g., payment deducted), a compensation transaction is performed – for example, creating a refund record instead of trying to undo the original payment.
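To make the compensation idea concrete, the sketch below keeps the original payment record untouched and adds an offsetting refund record. PaymentLedger and its methods are hypothetical; in the article's design this logic would run as a side effect of a rollback transition such as PAID to CANCELLED.

```java
import java.util.ArrayList;
import java.util.List;

record Payment(String orderId, long amountCents) {}
record Refund(String orderId, long amountCents, String reason) {}

class PaymentLedger {
    final List<Payment> payments = new ArrayList<>();
    final List<Refund> refunds = new ArrayList<>();

    // Side effect of the forward transition (for example PENDING -> PAID).
    void recordPayment(String orderId, long amountCents) {
        payments.add(new Payment(orderId, amountCents));
    }

    // Compensation: payments are never deleted or edited; a refund offsets each one.
    // Returns the number of refund records created.
    int compensate(String orderId, String reason) {
        boolean alreadyRefunded = refunds.stream().anyMatch(r -> r.orderId().equals(orderId));
        if (alreadyRefunded) return 0;   // keeps a retried compensation from refunding twice
        int created = 0;
        for (Payment p : payments) {
            if (p.orderId().equals(orderId)) {
                refunds.add(new Refund(orderId, p.amountCents(), reason));
                created++;
            }
        }
        return created;
    }

    // Net amount still held for the order after refunds.
    long netCents(String orderId) {
        long paid = payments.stream().filter(p -> p.orderId().equals(orderId))
            .mapToLong(Payment::amountCents).sum();
        long refunded = refunds.stream().filter(r -> r.orderId().equals(orderId))
            .mapToLong(Refund::amountCents).sum();
        return paid - refunded;
    }
}
```

Both records remain in the ledger, so the audit trail shows that money moved out and back rather than pretending the payment never happened.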
Integration Guide
Steps to onboard a new entity:
Define a state enum for the entity.
Insert a record into state_machine_definition.
Configure transition rules in state_transition.
Make the entity implement the StatefulEntity interface.
Replace direct state updates with calls to StateMachineService.fire().
For existing entities, add the three state columns, migrate current status values, configure rules, replace direct updates, and optionally add a DAO interceptor to forbid raw UPDATE state statements.
Observability & Operations
Audit queries retrieve the full change history of an entity. Example SQL snippets are provided for full logs and state‑stay‑time analysis.
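The state-stay-time analysis can also be computed in application code from an entity's change history. The sketch below assumes a hypothetical StateChange row shape (from state, to state, timestamp) and a log already sorted by time; the article's own SQL is not reproduced here.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape of a state_change_log row.
record StateChange(String fromState, String toState, Instant changedAt) {}

final class StayTime {
    private StayTime() {}

    // Sums how long the entity stayed in each state, given its log ordered by time.
    // Time in a state is the gap between the change that entered it and the change
    // that left it; the current (last) state has no exit yet and is not included.
    static Map<String, Duration> perState(List<StateChange> log) {
        Map<String, Duration> out = new LinkedHashMap<>();
        for (int i = 1; i < log.size(); i++) {
            StateChange entered = log.get(i - 1);
            StateChange left = log.get(i);
            Duration d = Duration.between(entered.changedAt(), left.changedAt());
            out.merge(entered.toState(), d, Duration::plus);   // handles revisited states
        }
        return out;
    }
}
```

Aggregating these durations across many entities shows where work items tend to stall, for example orders that wait unusually long in a pending state.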
Key monitoring metrics include:
state.fire.success.count – successful transitions.
state.fire.fail.count – failed transitions (alert > 100/min).
state.fire.illegal.count – illegal transition attempts (alert > 10/min).
state.fire.concurrent.count – optimistic‑lock conflicts (alert > 50/min).
state.fire.duration.ms – P99 latency (alert if > 500 ms).
state.outbox.pending.count – pending local messages (alert > 1000).
Risks & Mitigations
Incorrect transition configuration → add admin rule pre‑check feature.
Guard exceptions causing transaction rollback → guard implementations catch all exceptions and return false with a logged reason.
Side‑effect consumption failure → use local outbox with retry, alert, and manual compensation.
Frequent optimistic‑lock conflicts → automatic retry (max 3) before reporting error.
Business code bypassing fire() → DAO layer interceptor, coding standards, and code review enforcement.
Architect-Kip
Daily architecture work and learning summaries. Not seeking lengthy articles—only real practical experience.