Achieve Zero‑Downtime Deployments with Jenkins and Docker: A Complete CI/CD Blueprint
This article presents a fully automated, zero‑downtime deployment solution built from a Jenkins pipeline and a set of shell scripts. The scripts handle code checkout, image building, container switching, health checks, rollback, disk monitoring, and resource cleanup, giving a reliable end‑to‑end workflow for fast, stable releases.
Background Problem Analysis
Original issues:
1. /dev/vda1 usage 99% (50G/50G) – disk space exhausted
2. /dev/vdb only 4% used (3.2G/99G) – severe waste
3. Historical images and containers not cleaned after deployment
4. Deployment causes service interruption
Root causes:
- Project deployed under /home/user/ai, occupying the root partition
- No automatic cleanup mechanism
- Deployment stops old containers before starting new ones
- Docker data may also reside on the root partition
Overall Architecture Design
Core Components
ai_cicd.sh: Main deployment script handling code fetch, image build and container launch
start.sh: Smart start script supporting incremental builds and zero‑downtime switching
cleanup_docker.sh: Periodic resource‑cleanup script for redundant images and containers
check_disk_space.sh: Disk‑monitoring script to prevent storage exhaustion
Deployment Process Overview
Jenkins Integration
Jenkins Pipeline Configuration
pipeline {
    agent any
    parameters {
        choice(name: 'PROJECT_NAME', choices: ['ai_xxx_analysis', 'qa_xxx_server', 'ai_xxx'], description: 'Select project')
        choice(name: 'BRANCH', choices: ['main', 'develop', 'test'], description: 'Select branch')
        booleanParam(name: 'CHECK_DISK', defaultValue: true, description: 'Check disk space before deployment')
    }
    stages {
        stage('Disk Space Check') {
            when { expression { params.CHECK_DISK } }
            steps {
                script {
                    sh '''cd /opt/ai
bash check_disk_space.sh'''
                }
            }
        }
        stage('Execute Deployment') {
            steps {
                script {
                    sh """cd /opt/ai
bash ai_cicd.sh ${params.PROJECT_NAME} ${params.BRANCH}"""
                }
            }
        }
        stage('Validate Deployment') {
            steps {
                script {
                    sleep 10
                    def containerName = params.PROJECT_NAME.replace('-', '_')
                    sh """docker ps -f name=${containerName} --format '{{.Status}}' | grep 'Up'"""
                }
            }
        }
    }
    post {
        success { echo "✅ Deployment succeeded! Project: ${params.PROJECT_NAME}, Branch: ${params.BRANCH}" }
        failure { echo "❌ Deployment failed, check logs" }
    }
}
Jenkins Credential Configuration
Docker permissions: Add the Jenkins user to the Docker group
Directory permissions: Ensure /opt/ai is readable/writable by Jenkins
Git credentials: Configure access tokens for the Git repository
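The first two requirements can be verified from the Jenkins account before the first deployment. The following is a minimal sketch, not part of the original scripts: the helper names dir_is_usable and preflight are hypothetical, and the account name jenkins in the setup comments is an assumption about the installation.

```shell
#!/usr/bin/env bash
# One-time setup an administrator would typically run (not executed here):
#   sudo usermod -aG docker jenkins
#   sudo chown -R jenkins:jenkins /opt/ai
# The account name "jenkins" is an assumption. Jenkins has to be restarted
# before a new group membership takes effect.

# Succeeds only if the directory exists and is readable, writable and searchable.
dir_is_usable() {
    [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

# Prints one line per check and returns non-zero if any check fails.
preflight() {
    local deploy_dir="$1"
    local status=0
    if dir_is_usable "$deploy_dir"; then
        echo "ok: $deploy_dir is readable and writable"
    else
        echo "fail: $deploy_dir is missing or not writable"
        status=1
    fi
    # docker info exits non-zero when the daemon cannot be reached by this user.
    if docker info >/dev/null 2>&1; then
        echo "ok: current user can reach the Docker daemon"
    else
        echo "fail: current user cannot reach the Docker daemon"
        status=1
    fi
    return "$status"
}
```

Running preflight /opt/ai as the Jenkins user (for example from a throwaway pipeline stage) shows which of the two permissions is still missing.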
Core Script Details
ai_cicd.sh – Main Deployment Flow
Key Logic
# 1. Code fetch (reset --hard discards any local changes on the build host)
git fetch --all
git reset --hard "origin/$BRANCH"
git pull origin "$BRANCH"
# 2. Build new image with timestamp tag
VERSION_TAG=$(date +%Y%m%d_%H%M%S)
NEW_IMAGE_NAME="${PROJECT_NAME}:${VERSION_TAG}"
docker build -t "$NEW_IMAGE_NAME" -t "${PROJECT_NAME}:latest" .
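# 3. Container switch. With --network host the old container must release the port
#    before the new one starts, so this path has a brief interruption; the
#    temporary-port strategy in start.sh avoids it.
OLD_CONTAINER_ID=$(docker ps -q -f "name=^${CONTAINER_NAME}$")
if [ -n "$OLD_CONTAINER_ID" ]; then
    docker rename "$CONTAINER_NAME" "${CONTAINER_NAME}_old" # free the name for the new container
    docker stop "${CONTAINER_NAME}_old" # free port
fi
docker run -d --name "$CONTAINER_NAME" --network host "$NEW_IMAGE_NAME"
# 4. Health check (requires a HEALTHCHECK in the image)
MAX_WAIT=60
WAITED=0
HEALTH_STATUS=unknown
while [ "$WAITED" -lt "$MAX_WAIT" ]; do
    HEALTH_STATUS=$(docker inspect --format='{{.State.Health.Status}}' "$CONTAINER_NAME")
    if [ "$HEALTH_STATUS" = "healthy" ]; then
        [ -n "$OLD_CONTAINER_ID" ] && docker rm "$OLD_CONTAINER_ID" # remove old container
        break
    fi
    sleep 2
    WAITED=$((WAITED+2))
done
# Rollback (sketch): if the new container never became healthy, restore the old one
if [ "$HEALTH_STATUS" != "healthy" ]; then
    docker rm -f "$CONTAINER_NAME"
    if [ -n "$OLD_CONTAINER_ID" ]; then
        docker rename "${CONTAINER_NAME}_old" "$CONTAINER_NAME"
        docker start "$CONTAINER_NAME"
    fi
    exit 1
fi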
# 5. Schedule cleanup task
(sleep 60 && bash cleanup_docker.sh "$PROJECT_NAME") &
start.sh – Incremental Build Check
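The check_build_needed function in this section reports its decision as one colon-separated string, for example true:false:Code changed. The sketch below shows one way a caller might split that string. The helper name plan_build is hypothetical, and reading the second field as "full rebuild required" is an assumption, since the article does not define it.

```shell
#!/usr/bin/env bash
# Turn a "need_build:full_rebuild:reason" string into a one-line plan.
# The reason is the last field, so it may itself contain colons.
plan_build() {
    local need_build full_rebuild reason
    IFS=: read -r need_build full_rebuild reason <<EOF
$1
EOF
    if [ "$need_build" != "true" ]; then
        echo "skip ($reason)"
    elif [ "$full_rebuild" = "true" ]; then
        # Assumption: the second flag marks changes that invalidate dependency layers.
        echo "full ($reason)"
    else
        echo "incremental ($reason)"
    fi
}
```

A caller could then branch on the first word of plan_build "$(check_build_needed)" to decide whether to run docker build at all.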
check_build_needed() {
    # 1. Image existence
    if [ -z "$(docker images -q "$IMAGE_NAME")" ]; then
        echo "true:true:Image missing"
        return
    fi
    # 2. requirements.txt changes (a missing hash file, as on the first run, counts as changed)
    current_req_hash=$(md5sum requirements.txt | cut -d' ' -f1)
    if [ "$(cat "$LAST_REQ_HASH_FILE" 2>/dev/null)" != "$current_req_hash" ]; then
        echo "true:true:Dependencies changed"
        return
    fi
    # 3. Code changes
    current_commit=$(git rev-parse HEAD)
    if [ "$(cat "$LAST_COMMIT_FILE" 2>/dev/null)" != "$current_commit" ]; then
        echo "true:false:Code changed"
        return
    fi
    echo "false:false:No build needed"
}
Zero‑Downtime Switching Strategy
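The switch in this section only proceeds once the new container reports healthy, but the excerpt does not show how the health_status value is obtained. A minimal sketch of a polling helper that could produce it follows. The function name wait_for_healthy and its defaults are hypothetical, and it assumes the image defines a HEALTHCHECK.

```shell
#!/usr/bin/env bash
# Poll a container's Docker health status until it is healthy or a timeout expires.
# Usage: wait_for_healthy <container> [max_wait_seconds] [interval_seconds]
# Prints the last observed status; returns 0 only if that status is "healthy".
wait_for_healthy() {
    local container="$1"
    local max_wait="${2:-60}"
    local interval="${3:-2}"
    local waited=0
    local status=unknown
    while [ "$waited" -lt "$max_wait" ]; do
        # docker inspect fails if the container does not exist or has no
        # health check configured; treat both cases as "unknown".
        status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status=unknown
        if [ "$status" = "healthy" ]; then
            break
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "$status"
    [ "$status" = "healthy" ]
}
```

In the switching script this could be called as health_status=$(wait_for_healthy "${CONTAINER_NAME}_new" 60 2) before the final replace step. The function returns non-zero on timeout, so a script running under set -e would need to handle that explicitly.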
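# 1. Record old container (anchored pattern so ${CONTAINER_NAME}_new does not match)
OLD_CONTAINER_ID=$(docker ps -q -f "name=^${CONTAINER_NAME}$")
# 2. Start new container on temporary port
if [ -n "$OLD_CONTAINER_ID" ]; then
    TEMP_PORT=$((HOST_PORT+1))
    docker run -d --name "${CONTAINER_NAME}_new" -p "$TEMP_PORT:$PORT" "$NEW_IMAGE"
fi
# 3. After health check passes, replace old container
if [ "$health_status" = "healthy" ]; then
    # docker stop alone keeps the container name reserved, so remove the old container as well
    docker rm -f "$CONTAINER_NAME"
    # the temporary container is still running, so -f is required
    docker rm -f "${CONTAINER_NAME}_new"
    # the image has already passed its health check, so this restart is short
    docker run -d --name "$CONTAINER_NAME" -p "$HOST_PORT:$PORT" "$NEW_IMAGE"
fi
Disk Space Management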
Monitoring & Alerts
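The two thresholds used in this section map naturally onto three states: ok, warning and critical. The sketch below is one way to express that, under stated assumptions. The helper names disk_usage_percent and classify_usage are hypothetical, and it queries df -P for a path instead of grepping for a device name, so it also works when the device is not /dev/vda1.

```shell
#!/usr/bin/env bash
# Percentage used (digits only) for the filesystem that contains the given path.
# df -P forces the one-line-per-filesystem POSIX format, so column 5 is the capacity.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Map a usage percentage to ok / warning / critical.
# Usage: classify_usage <percent> [warning_threshold] [critical_threshold]
classify_usage() {
    local usage="$1"
    local warn="${2:-70}"
    local crit="${3:-85}"
    if [ "$usage" -ge "$crit" ]; then
        echo critical
    elif [ "$usage" -ge "$warn" ]; then
        echo warning
    else
        echo ok
    fi
}
```

For example, classify_usage "$(disk_usage_percent /)" 70 85 prints a single word that a pipeline stage can branch on, failing the build only for critical.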
# check_disk_space.sh core logic
WARNING_THRESHOLD=70
CRITICAL_THRESHOLD=85
VDA1_USAGE=$(df -h | grep '/dev/vda1' | awk '{print $5}' | sed 's/%//')
if [ "$VDA1_USAGE" -ge "$CRITICAL_THRESHOLD" ]; then
    echo "❌ Critical: Disk usage ${VDA1_USAGE}%"
    # Integrate alert channels here (e.g., DingTalk, email)
    exit 1
fi
Cleanup Policies
Stopped containers – status=exited – delete all
Old image versions – non‑latest tags – keep the newest one
Dangling images – dangling=true – delete all
Log files – pattern *.log.* – retain for 7 days
Build cache – normal mode – retain for 24 h
Python __pycache__ – pycache – delete all
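The "keep the newest" policy for old image versions can be sketched as a small filter. This is an illustration rather than the contents of cleanup_docker.sh: the function name tags_to_delete is hypothetical, and it assumes the timestamp tag format (YYYYmmdd_HHMMSS) used by the build step, which sorts chronologically as plain text. Tags in any other format, such as latest or a branch name, are left alone.

```shell
#!/usr/bin/env bash
# Read image tags on stdin (one per line) and print the timestamp tags that
# should be deleted, keeping the newest N (default 1).
# Usage: tags_to_delete [keep_count]
tags_to_delete() {
    local keep="${1:-1}"
    grep -E '^[0-9]{8}_[0-9]{6}$' | sort -r | awk -v keep="$keep" 'NR > keep'
}

# Possible use against a real repository (not executed here):
#   docker images --format '{{.Tag}}' "$PROJECT_NAME" \
#       | tags_to_delete 2 \
#       | while read -r tag; do docker rmi "${PROJECT_NAME}:${tag}"; done
```

Keeping at least two versions leaves the previous image available for a rollback.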
Practical Recommendations
Dockerfile Health‑Check Configuration
HEALTHCHECK --interval=5s --timeout=3s --start-period=30s --retries=10 \
    CMD curl -f http://localhost:${PORT}/health || exit 1
Jenkins Scheduled Cleanup
pipeline {
    agent any // a declarative pipeline requires an agent section
    triggers { cron('0 2 * * *') }
    stages {
        stage('Docker Resource Cleanup') { steps { sh 'bash /opt/ai/cleanup_docker.sh --keep-images 2' } }
        stage('Disk Space Check') { steps { sh 'bash /opt/ai/check_disk_space.sh' } }
    }
}
Image Version Management
# Timestamp tag
VERSION_TAG=$(date +%Y%m%d_%H%M%S)
# Or Git short commit (use one of the two)
# VERSION_TAG=$(git rev-parse --short HEAD)
docker build \
    -t "${PROJECT_NAME}:${VERSION_TAG}" \
    -t "${PROJECT_NAME}:latest" \
    -t "${PROJECT_NAME}:${BRANCH}" \
    .
Log Management
# Limit container log size
docker run \
--log-opt max-size=100m \
--log-opt max-file=3 \
$IMAGE_NAME
# Periodic log cleanup (keep 7 days)
find /opt/ai/*/logs -name "*.log.*" -mtime +7 -delete
Common Troubleshooting
Typical Issues
Port conflict
# Check port usage
lsof -i :5000
# Force cleanup
docker stop $(docker ps -q -f name=ai_xxx)
Disk space shortage
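# Emergency cleanup (removes ALL stopped containers and unused images,
# including older versions that would otherwise be available for rollback)
docker system prune -a -f
bash cleanup_docker.sh --aggressive
Health‑check failure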
# View recent logs
docker logs --tail 100 $CONTAINER_NAME
# Enter container for deeper inspection
docker exec -it $CONTAINER_NAME bash
Monitoring Metrics
Deployment success rate
Deployment duration
Disk usage trends
Container health status
Image count changes
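The first two metrics can be collected with very little tooling. The sketch below appends one line per deployment to a CSV file and summarizes it. The function names record_deploy and deploy_stats and the CSV layout are hypothetical, not something the original scripts define.

```shell
#!/usr/bin/env bash
# Append one deployment result to a CSV log: epoch,project,status,duration_seconds
# Usage: record_deploy <log_file> <project> <success|failure> <duration_seconds>
record_deploy() {
    local log="$1" project="$2" status="$3" duration="$4"
    printf '%s,%s,%s,%s\n' "$(date +%s)" "$project" "$status" "$duration" >> "$log"
}

# Print the success rate (integer percent) and mean duration over all recorded deployments.
deploy_stats() {
    awk -F, '
        { total++; sum += $4; if ($3 == "success") ok++ }
        END {
            if (total == 0) { print "no deployments recorded"; exit }
            printf "success_rate=%d%% mean_duration=%ds\n", ok * 100 / total, sum / total
        }' "$1"
}
```

Calling record_deploy from the post section of the Jenkins pipeline, once for success and once for failure, would be enough to build up the history that deploy_stats reads.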
Conclusion
The four core scripts (ai_cicd.sh, start.sh, cleanup_docker.sh and check_disk_space.sh) provide a fully automated CI/CD pipeline with zero‑downtime deployment, automatic rollback, incremental builds, resource cleanup and disk monitoring. This approach is suitable for personal projects or isolated environments where fast, reliable releases are required.
