Essential TDH Community Edition Ops Guide: Common Issues & Solutions
This guide details practical solutions for TDH Community Edition operational problems, covering component configuration paths, log locations, disk‑space alerts, service start/stop via kubectl, time synchronization, network/firewall checks, hostname resolution, and license service management.
Common Operational Issues
Component configuration files are located at /etc/<component>/conf and component JAR packages at /usr/lib/<component>.
Log Locations
Component logs can be found under /var/log/<component_instance>, where <component_instance> is the instance name.
Adjust Manager Log Size and Count
Edit /etc/transwarp-manager/master/log4j.properties and modify the *.MaxFileSize (log file size) and *.MaxBackupIndex (number of retained logs) parameters.
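Before changing these values it can help to see what the file currently sets. The following is a minimal sketch, assuming the properties use the usual log4j 1.x form (for example `log4j.appender.<name>.MaxFileSize=...`); the function name `show_log_limits` is hypothetical and not part of TDH.

```shell
# Print every MaxFileSize / MaxBackupIndex line, prefixed with its line number,
# from the log4j properties file passed as the first argument.
show_log_limits() {
  grep -nE '\.(MaxFileSize|MaxBackupIndex)[[:space:]]*=' "$1"
}

# Usage (path taken from the text above):
#   show_log_limits /etc/transwarp-manager/master/log4j.properties
```

The line numbers in the output make it easier to jump to the right place in an editor.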
Disk Space Warning
When the Manager UI shows a low‑disk‑space alert, run du -h --max-depth=1 | grep G from the root directory to list the first‑level directories whose size is in the gigabyte range. Then clean large directories such as /var/log, /hadoop/data, and /ngmr, or move files to other disks.
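As a variation on the du command above, sorting by size avoids relying on grep G, which also matches any path that happens to contain a capital G. This is a sketch assuming GNU du and sort; the function name `largest_dirs` is hypothetical.

```shell
# List the largest first-level directories under a path, biggest first.
# -x keeps du on a single filesystem; errors from unreadable paths are hidden.
# The total for the path itself is included in the listing.
largest_dirs() {
  du -xh --max-depth=1 "${1:-/}" 2>/dev/null | sort -rh | head -n "${2:-10}"
}

# Usage:
#   largest_dirs /          # top 10 entries under the root filesystem
#   largest_dirs /var/log 5 # top 5 entries under /var/log
```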
Start/Stop Services via kubectl (Example: Inceptor)
# Check Inceptor service status
kubectl get po -owide | grep inceptor
# Stop Inceptor Metastore role
kubectl delete -f /var/lib/transwarp-manager/master/content/resources/services/inceptor1/inceptor-metastore.yaml
# Start Inceptor Metastore role
kubectl create -f /var/lib/transwarp-manager/master/content/resources/services/inceptor1/inceptor-metastore.yaml
# Restart Metastore role
kubectl delete pod <inceptor_metastore_pod>
Note: start and stop must be executed on the Manager node; restart can be run on any node by using the -s flag to specify the TOS Server IP.
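After starting or restarting a role, it is worth confirming that its pod came back. The filter below is a minimal sketch that assumes the default `kubectl get po -owide` column layout, where STATUS is the third column; `pods_not_running` is a hypothetical helper name.

```shell
# Read `kubectl get po -owide` output on stdin and print the name and status
# of every pod whose name contains the given string and whose STATUS column
# (third field) is not "Running".
pods_not_running() {
  awk -v svc="$1" 'index($1, svc) && $3 != "Running" { print $1, $3 }'
}

# Usage:
#   kubectl get po -owide | pods_not_running inceptor
```

Empty output means every matching pod reports Running.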
Health‑Check Time Offset Alert
The alert indicates unsynchronized node clocks; synchronize the time across nodes.
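To see how far a node has drifted before changing anything, `ntpdate -q` queries a server without setting the clock. The sketch below parses that output; it assumes the common line format `server <ip>, stratum <n>, offset <seconds>, delay <seconds>`, and both the function name `offset_exceeds` and the threshold are illustrative, not values used by the Manager health check.

```shell
# Read `ntpdate -q <server>` output on stdin and compare the absolute offset
# from the first "server" line against a threshold in seconds.
# Exit status: 0 = within threshold, 1 = exceeds threshold, 2 = no offset found.
offset_exceeds() {
  awk -v max="$1" '
    /^server / {
      for (i = 1; i < NF; i++) if ($i == "offset") { off = $(i + 1) + 0; found = 1 }
      if (found) exit
    }
    END {
      if (!found) { print "no offset found"; exit 2 }
      if (off < 0) off = -off
      if (off > max) { print "offset " off "s exceeds " max "s"; exit 1 }
      print "offset " off "s within " max "s"
    }'
}

# Usage (threshold of 1 second chosen only as an example):
#   ntpdate -q <NTP Server hostname> | offset_exceeds 1
```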
Manually Sync Node Time
systemctl stop ntpd
ntpdate <NTP Server hostname>
systemctl start ntpd
Disk Write Failure
# Check disk space
df -h
# If full, migrate data
# Check disk health
smartctl -H <target_disk>
# Check read speed
hdparm -Tt <target_disk>
# Check inode usage
df -i
If inode usage is saturated, delete unnecessary temporary files to free inodes.
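When df -i shows a saturated filesystem, the next question is which directory holds all the small files. The following sketch counts entries per first-level subdirectory as an approximation of inode use (hard links are counted more than once); `inode_hogs` is a hypothetical helper name.

```shell
# Count filesystem entries under each first-level subdirectory of a path and
# list the directories holding the most entries, largest count first.
# -xdev keeps find on a single filesystem.
inode_hogs() {
  find "${1:-/}" -xdev -mindepth 1 -maxdepth 1 -type d 2>/dev/null |
  while IFS= read -r dir; do
    count=$(find "$dir" -xdev 2>/dev/null | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$count" "$dir"
  done | sort -rn | head -n "${2:-10}"
}

# Usage:
#   inode_hogs /        # top 10 directories under the root filesystem
#   inode_hogs /tmp 5   # top 5 directories under /tmp
```

Running it again on the top result narrows the search one level at a time.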
Connection Refused in Logs
Verify network connectivity between nodes, ensure /etc/hosts entries are correct, and check that the firewall is disabled. Use systemctl status firewalld, then systemctl stop firewalld and systemctl disable firewalld if needed.
Component Link Unreachable
Make sure the URL uses the node’s hostname and that the hostname‑IP mapping is present in the local hosts file (Windows: C:\Windows\system32\drivers\etc\hosts; Linux: /etc/hosts).
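On a Linux client, the mapping can be checked directly against the hosts file. This is a minimal sketch; `hosts_lookup` is a hypothetical helper, and it matches hostnames exactly (case-sensitive), which is stricter than real name resolution.

```shell
# Print the IP address(es) that a hosts file maps to the given hostname.
# Text after "#" is treated as a comment; the file defaults to /etc/hosts.
hosts_lookup() {
  awk -v name="$1" '
    { sub(/#.*/, "") }
    { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
  ' "${2:-/etc/hosts}"
}

# Usage:
#   hosts_lookup <node_hostname>
```

No output means the hostname has no entry; more than one line means conflicting entries that should be cleaned up.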
Manager Cannot Connect to Agent (Ask Timeout)
# Test connectivity to agent hostname
nc -v <agent_hostname> 10208
# Test connectivity to agent IP
nc -v <agent_ip> 10208
# Ping the agent
ping <agent_ip>
# Check firewall status
systemctl status firewalld
# Stop and disable firewall if active
systemctl stop firewalld
systemctl disable firewalld
If the firewall is off but the timeout persists, increase remote-start-timeout in /etc/transwarp-manager/master/application.conf to 60 seconds and restart the Manager.
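With several agents, the nc check above can be looped over a list of hosts. This is a sketch assuming an nc build that supports -z (connect without sending data) and -w (timeout in seconds), which is the case for common OpenBSD netcat and recent ncat versions; `check_agents` is a hypothetical helper name.

```shell
# Read agent hostnames or IPs (one per line) on stdin and probe the agent
# port 10208 on each, printing "<host> OK" or "<host> FAIL".
check_agents() {
  while IFS= read -r host; do
    [ -n "$host" ] || continue
    # </dev/null stops nc from consuming the rest of the host list.
    if nc -z -w 3 "$host" 10208 </dev/null >/dev/null 2>&1; then
      echo "$host OK"
    else
      echo "$host FAIL"
    fi
  done
}

# Usage, with a hypothetical file listing one agent per line:
#   check_agents < agents.txt
```

A host that fails by name but succeeds by IP points to hostname resolution rather than the firewall.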
Can Hostname Be Modified After Adding a Node?
No. Hadoop identifies nodes by hostname/FQDN; changing it is equivalent to removing the node and adding a new one.
License Service Issues
In the Manager UI, go to Management → License to view service status; if the service is down (e.g., more than two License Server nodes are offline), click “Start”. The license expiration date is also shown on this page.
Summary
This article covers common TDH Community Edition operational scenarios, including configuration file locations, log management, disk‑space alerts, service start/stop via kubectl, time synchronization, disk health checks, network and firewall troubleshooting, hostname resolution, and license service handling.
StarRing Big Data Open Lab
Focused on big data technology research, exploring the Big Data era | [email protected]
