Recovering OCP Access After NLB Failure: Step-by-Step Commands
This guide explains how to recover a multi-node OCP cluster after a failed NLB upgrade by diagnosing METADB connection errors, extracting VIP/PORT information, and re-creating NLB load-balancing rules through Docker and nlbcli commands, ensuring the cluster becomes reachable again.
Background
Upgrading glibc on the host running NLB failed, making the NLB host unreachable and causing a multi‑node OCP cluster that uses NLB for load balancing to report errors. A new NLB node must be installed and the load balancer reconfigured.
OCP Architecture
Multi‑node OCP with NLB load balancing.
Procedure
When opening the OCP login page, an exception is thrown before credentials are entered.
Error Message
Unhandled exception, type=CannotCreateTransactionException, message=Could not open JPA EntityManager for transactionError Screenshot
Suspected Cause
The OCP backend METADB cannot be reached, leading to the UI error.
Confirming the Cause
# Verify OCP backend METADB login
docker exec -it ocp bash
obclient -u${OCP_METADB_USER} -p${OCP_METADB_PASSWORD} -h${OCP_METADB_HOST} -P${OCP_METADB_PORT} -AcThe METADB login fails.
Solution
Install a new NLB instance (e.g., using OAT). Detailed steps are omitted.
Obtain the VIP and PORT that the NLB provides for METADB access. On any OCP node:
# Enter OCP container
docker exec -it ocp bash
# Show environment variables for METADB
env | grep -E 'OCP_METADB_HOST|OCP_METADB_PORT'Known METADB backend nodes:
10.186.65.4:2883
10.186.65.5:2883
10.186.65.6:2883The NLB should expose a virtual IP and port, e.g., 10.186.65.250:3307.
# Register METADB backend rule in NLB
docker exec -it nlb bash
nlbcli rule create tcp 3307 '10.186.65.4:2883,10.186.65.5:2883,10.186.65.6:2883' roundrobinKnown OCP node ports:
10.186.65.4:8080
10.186.65.5:8080
10.186.65.6:8080The NLB should expose a virtual IP and port for OCP, e.g., 10.186.65.250:12345.
# Register OCP load‑balancing rule in NLB
nlbcli rule create http 12345 '10.186.65.4:8080,10.186.65.5:8080,10.186.65.6:8080' chashList configured NLB rules to verify:
# List NLB rules
nlbcli rule listValidate that OCP can access METADB and the UI loads successfully.
Conclusion
If the NLB container is lost, the load‑balancing configuration can be rebuilt using the command‑line steps above.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
