Optimizing Java String Usage: Avoid +=, Use StringBuilder, and Leverage intern()
This article examines the immutability of Java's String class, explains why repeated concatenation with "+=" is inefficient, and presents three optimization techniques (building strings with StringBuilder, applying String.intern(), and using String.split() with care) that can substantially improve performance and reduce memory consumption.
String Characteristics
The source code of java.lang.String (JDK 1.8) shows that the backing array is declared as private final char value[] and that no public method modifies it, so a String object is immutable: once created, its content cannot be changed.
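A small sketch can make the immutability concrete. The class and method names here are illustrative and do not come from the article; the point is only that methods such as toUpperCase() and concat() return new objects and leave the receiver untouched.

```java
final class ImmutableStringDemo {
    // Calls "modifying" methods and discards their results;
    // the string passed in is returned exactly as it was.
    static String unchangedAfterOps(String original) {
        original.toUpperCase();      // returns a new String, result ignored
        original.concat(" world");   // likewise returns a new String
        return original;
    }

    public static void main(String[] args) {
        String original = "hello";
        String joined = original.concat(" world");

        System.out.println(unchangedAfterOps(original)); // hello
        System.out.println(joined);                      // hello world
        System.out.println(original == joined);          // false: two distinct objects
    }
}
```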
1. Do Not Use "+=" for Concatenation
Because String is immutable, each "+=" operation creates a new String object and copies every character accumulated so far, so concatenating inside a loop does quadratic work and produces many short-lived objects. The article provides a benchmark comparing a naive concatenation method with a StringBuilder implementation.
public static String doAdd() {
    String result = "";
    for (int i = 0; i < 10000; i++) {
        // Each pass allocates a new String and copies the old contents.
        result += (" i:" + i);
    }
    return result;
}
public static String doAppend() {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 10000; i++) {
        // Chained append avoids building a temporary String per pass.
        sb.append(" i:").append(i);
    }
    return sb.toString();
}
In the article's benchmark, the StringBuilder version runs in about 1 ms while the "+=" version takes several hundred milliseconds.
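The article does not show its timing harness, so the following is only a rough sketch of how such a comparison could be run. The class name, the timeMillis helper, and the loop-count parameter n are assumptions made for illustration. A single System.nanoTime() measurement without JIT warm-up is crude, and absolute numbers will vary by machine and JVM; a harness such as JMH would be the more rigorous choice.

```java
final class ConcatTiming {
    static String doAdd(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += " i:" + i;          // copies the accumulated string each pass
        }
        return result;
    }

    static String doAppend(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(" i:").append(i);   // appends into one growing buffer
        }
        return sb.toString();
    }

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println("+= took " + timeMillis(() -> doAdd(n)) + " ms");
        System.out.println("StringBuilder took " + timeMillis(() -> doAppend(n)) + " ms");
    }
}
```

Both methods produce the same string, so any difference in the printed times comes from how the result is built rather than from what is built.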
2. Make Good Use of String.intern()
The intern() method stores a single canonical copy of a string in the JVM's string pool, allowing repeated values to share the same reference and thus saving memory. The article cites a Twitter case where using intern() reduced address‑related memory from 20 GB to a few hundred megabytes.
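The reference sharing described above can be checked in isolation. This is a minimal sketch with illustrative names (InternDemo and its helpers are not from the article): it creates two separate String objects with equal contents and shows that their interned forms are the same object.

```java
final class InternDemo {
    // Two copies made with new String(...) are always distinct heap objects.
    static boolean distinctBeforeIntern(String value) {
        String a = new String(value);
        String b = new String(value);
        return a != b;
    }

    // After intern(), equal strings resolve to one canonical instance.
    static boolean sharedAfterIntern(String value) {
        String a = new String(value);
        String b = new String(value);
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        System.out.println(distinctBeforeIntern("Shanghai")); // true
        System.out.println(sharedAfterIntern("Shanghai"));    // true
    }
}
```

The memory saving only materializes when the non-interned copies become unreachable, which is why the article's snippet stores the interned reference in the long-lived object: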
SharedLocation sharedLocation = new SharedLocation();
sharedLocation.setCity(messageInfo.getCity().intern());
sharedLocation.setCountryCode(messageInfo.getCountryCode().intern());
sharedLocation.setRegion(messageInfo.getRegion().intern());
3. Use split() with Caution
String.split() interprets its argument as a regular expression, and Java's back‑tracking regex engine can cause severe CPU spikes on complex patterns, especially those with nested quantifiers. The article demonstrates a pathological pattern (shown with String.matches(), which uses the same engine) that leads to high CPU usage, and recommends using indexOf() or other simpler parsing techniques when possible.
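The article recommends indexOf() but does not show it in use, so here is one possible sketch of splitting on a single literal delimiter without any regex. The class name SimpleSplit and the method splitOn are hypothetical. Note one behavioral difference: unlike String.split(), this version keeps trailing empty fields.

```java
import java.util.ArrayList;
import java.util.List;

final class SimpleSplit {
    // Splits input on a literal delimiter character using indexOf only,
    // so no regular expression is compiled or evaluated.
    static List<String> splitOn(String input, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = input.indexOf(delimiter, start)) >= 0) {
            parts.add(input.substring(start, end));
            start = end + 1;               // continue after the delimiter
        }
        parts.add(input.substring(start)); // remainder after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitOn("city,region,countryCode", ','));
        // prints [city, region, countryCode]
    }
}
```

The pathological pattern the article refers to, together with its test input, is the following: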
String badRegex = "^([hH][tT]{2}[pP]://|[hH][tT]{2}[pP][sS]://)(([A-Za-z0-9-~]+).)+([A-Za-z0-9-~\\/])+$";
String bugUrl = "http://www.apigo.com/dddp-web/pdf/download?request=...";
if (bugUrl.matches(badRegex)) {
    System.out.println("match!!");
} else {
    System.out.println("no match!!");
}
Conclusion
The article summarizes three practical ways to optimize Java strings: replace "+=" concatenation in loops with StringBuilder, apply String.intern() to highly repeated values, and be cautious with split() because of regex back‑tracking. Together, these techniques can significantly improve runtime performance and reduce memory footprint.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java