How We Scaled Litmus CI: From 100 to 2,000 Daily Tasks
This article details how the Litmus code‑quality platform was integrated into a CI pipeline, the performance bottlenecks that surfaced in Jenkins and Sonar, and the optimizations applied in response: a server redesign, build‑script refactoring, parallel testing, and JVM tuning. Together, these changes dramatically reduced task duration and increased throughput.
What Is Litmus?
Litmus is a platform developed by Testing Efficiency to assess code quality, offering metrics such as code smells, duplicate code, complexity, unit‑test success rate, and coverage. These metrics are collected using the open‑source tools Sonar and JaCoCo.
Growth After CI Integration
After Litmus was integrated into the CI pipeline as a quality checkpoint, daily tasks rose from just over 100 to more than 1,000 and eventually approached 2,000, exposing performance limitations of the original Litmus deployment.
Initial Problems
Jenkins tasks remained pending for an excessively long time.
Sonar frequently crashed.
Sonar report generation took too long.
Sonar ran out of disk space.
Optimization Process
Reducing Jenkins Pending Time
We rebuilt the Jenkins infrastructure on a master‑slave architecture and installed the Kubernetes plugin. The master dynamically provisions slave pods, and all jobs run on these slaves, each created from a lightweight 2C4G (2 CPU, 4 GB RAM) pod template.
Performance testing showed that most backend jobs complete comfortably within this configuration.
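To make the slave sizing concrete, here is a minimal sketch of such a pod spec written with the official kubernetes Python client. The image name and labels are hypothetical; in practice the Jenkins Kubernetes plugin builds the equivalent spec from its pod‑template configuration.

```python
# Sketch of a 2C4G slave pod spec (hypothetical image and labels).
from kubernetes import client

def build_slave_pod(name: str) -> client.V1Pod:
    resources = client.V1ResourceRequirements(
        requests={"cpu": "2", "memory": "4Gi"},  # 2C4G: 2 cores, 4 GB RAM
        limits={"cpu": "2", "memory": "4Gi"},
    )
    agent = client.V1Container(
        name="jnlp",  # the Jenkins agent container
        image="registry.example.com/jenkins-agent:latest",  # hypothetical
        resources=resources,
    )
    return client.V1Pod(
        metadata=client.V1ObjectMeta(name=name, labels={"role": "jenkins-slave"}),
        spec=client.V1PodSpec(containers=[agent], restart_policy="Never"),
    )
```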
Optimizing Build Scripts
Pre‑build steps such as SSH‑key generation and the download and extraction of Maven were baked into the Docker image used for builds, so jobs no longer spend a long setup wait before checking out code.
Parallelizing Unit Tests and Sonar Scans
Analysis showed that the bulk of task time was spent on unit tests and Sonar analysis. Unit‑test duration matched local runs, while Sonar time grew linearly with code size. We re‑engineered the pipeline to run unit tests and Sonar scans in parallel.
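The parallel step looks roughly like the following sketch, assuming a Maven project and the sonar-scanner CLI; the commands are illustrative, not the exact pipeline code.

```python
# Run unit tests and the Sonar scan concurrently instead of serially.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run(cmd: list[str]) -> None:
    subprocess.run(cmd, check=True)  # raise if either step fails

with ThreadPoolExecutor(max_workers=2) as pool:
    tests = pool.submit(run, ["mvn", "-B", "test"])
    scan = pool.submit(run, ["sonar-scanner"])  # reads sonar-project.properties
    tests.result()
    scan.result()
```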
Coverage reports are now generated by invoking JaCoCo directly from Litmus, which requires only the source code, compiled classes, execution data, and the JaCoCo CLI.
We also added a cleanup step to delete unnecessary JAR files from the build artifact, keeping only jacoco.exec, class files, and Java sources.
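Roughly, the cleanup plus report step looks like this sketch, using the JaCoCo command‑line jar; all paths and the jacococli.jar location are hypothetical.

```python
# Prune the artifact to the inputs JaCoCo needs, then generate the
# coverage report with the JaCoCo CLI.
import pathlib
import subprocess

artifact = pathlib.Path("build-artifact")

# Drop dependency JARs; only jacoco.exec, class files, and sources remain.
for jar in artifact.rglob("*.jar"):
    jar.unlink()

subprocess.run([
    "java", "-jar", "jacococli.jar", "report",
    str(artifact / "jacoco.exec"),
    "--classfiles", str(artifact / "classes"),
    "--sourcefiles", str(artifact / "src/main/java"),
    "--xml", "coverage.xml",
], check=True)
```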
Sonar Optimizations
Our Sonar instance (Community Edition 7.9) runs in Docker. Initially it used a 32‑core/32 GB server, taking about 2 minutes per report and often causing long pending times.
Upgrading to a 56‑core/128 GB server reduced average report generation to under 10 seconds.
Disk usage grew to roughly 200 GB of Docker files and PostgreSQL data; when the disk filled, Sonar tasks hung in a pending state.
Automatic database cleanup in Sonar proved ineffective, so we now delete old data manually via the Sonar UI or directly in PostgreSQL.
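For the manual route, one option is Sonar's web API rather than raw SQL. A sketch follows; the URL and token are placeholders, and we assume the analyzedBefore parameter is available on the 7.9 API.

```python
# Bulk-delete projects whose last analysis predates a cutoff, via the
# SonarQube web API. URL and token are placeholders.
import requests

SONAR_URL = "http://sonar.example.com:9000"
AUTH = ("admin-user-token", "")  # API token passed as the basic-auth username

resp = requests.post(
    f"{SONAR_URL}/api/projects/bulk_delete",
    params={"analyzedBefore": "2021-01-01"},
    auth=AUTH,
)
resp.raise_for_status()
```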
JVM and Elasticsearch Tuning
Sonar crashes were traced to Elasticsearch out‑of‑memory (OOM) errors. Raising the Elasticsearch heap from 512 MB to 10 GB (via sonar.search.javaOpts in sonar.properties) eliminated the OOM‑related crashes.
Multi‑Machine Sonar Mode
Because the Community Edition processes analysis reports one at a time (a single Compute Engine worker), we designed a multi‑machine mode that maps each task ID to a specific Sonar server, allowing analyses to run concurrently across instances when load demands it.
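The mapping itself can be as simple as hashing the task ID over the server list, which keeps routing deterministic without shared state. A sketch, with hypothetical server URLs:

```python
# Deterministically route each task to one Sonar server so repeated runs of
# the same task always land on the same instance. Server list is hypothetical.
import hashlib

SONAR_SERVERS = [
    "http://sonar-1.example.com:9000",
    "http://sonar-2.example.com:9000",
    "http://sonar-3.example.com:9000",
]

def sonar_for(task_id: str) -> str:
    digest = int(hashlib.md5(task_id.encode("utf-8")).hexdigest(), 16)
    return SONAR_SERVERS[digest % len(SONAR_SERVERS)]
```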
Results
After optimization, average task duration dropped to under 10 minutes, compared with roughly 30 minutes before the changes.
Future Work
Frontend Coverage Integration
We plan to collect Jest coverage by having Jenkins generate Clover XML reports, standardizing Node.js versions and resource allocation (8C16G, i.e., 8 CPU / 16 GB RAM) across frontend projects.
Custom Sonar Rules
Developing custom Sonar rules will allow us to enforce company‑specific code‑smell checks and improve deployment safety.