Optimizing Recommendation Service Response Time via Log Level Configuration and Code Refactoring
This article describes how to reduce recommendation service response time by refactoring code, configuring log levels per deployment group, using environment variables, and diagnosing performance issues through GC analysis, thread monitoring, and memory leak detection, achieving an average latency reduction of over 30 ms.
Background – In recommendation scenarios, sub‑200 ms response times are critical for user experience, but backend algorithm growth and service complexity make this challenging. Adjusting log levels per group offers a compromise for performance tuning.
Scenario – While refactoring the 58 Tongzhen recommendation service, the team explored log‑level based optimizations using SCF (a Service Communication Framework similar to Dubbo), Log4j, and JDK 8.
Key Optimizations
Remove unnecessary lambda expressions from collection‑handling code.
Prefer direct getter/setter over generic copy utilities (CGLib, reflection) for better performance.
Shift heavy real‑time algorithm calculations to offline preprocessing.
Cache static whitelist/blacklist data in Guava memory cache with expiration, falling back to Redis.
Leverage multithreading to parallelize candidate recall, ranking, and fusion models.
Avoid giving expensive, non‑accessor methods names that start with get: serialization frameworks such as FastJSON treat them as property getters and may invoke them reflectively.
Reduce payload size and number of service calls (e.g., transmit only IDs instead of full content).
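As an illustration of the multithreading item above, here is a minimal sketch of fanning out several recall sources on a thread pool and merging whatever returns before a shared deadline. It uses only JDK 8 classes. The class name ParallelRecall, the method recallAll, and the sample source names are hypothetical and do not come from the original service, whose actual implementation is not shown in this article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class ParallelRecall {

    /**
     * Runs every recall source on the given pool and merges the results
     * in source order. A source that fails or misses the shared deadline
     * contributes nothing to the merged list.
     */
    public static List<String> recallAll(List<Supplier<List<String>>> sources,
                                         ExecutorService pool,
                                         long timeoutMs) {
        // Submit all sources first so they run concurrently.
        List<CompletableFuture<List<String>>> futures = new ArrayList<>();
        for (Supplier<List<String>> source : sources) {
            futures.add(CompletableFuture.supplyAsync(source, pool));
        }

        // One deadline for the whole fan-out, not one timeout per source.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        List<String> merged = new ArrayList<>();
        for (CompletableFuture<List<String>> future : futures) {
            long remaining = Math.max(0L, deadline - System.nanoTime());
            try {
                merged.addAll(future.get(remaining, TimeUnit.NANOSECONDS));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            } catch (Exception e) {
                // Timeout or failure in one source: skip it and keep the rest.
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Supplier<List<String>>> sources = new ArrayList<>();
            sources.add(() -> Arrays.asList("hot-1", "hot-2")); // hypothetical "hot items" recall
            sources.add(() -> Arrays.asList("cf-7"));           // hypothetical collaborative-filtering recall
            System.out.println(recallAll(sources, pool, 150));
        } finally {
            pool.shutdown();
        }
    }
}
```

Using a single deadline keeps the total wait bounded by timeoutMs regardless of how many sources are configured, which matters when the overall budget is around 200 ms.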
Log‑level configuration was introduced to avoid repetitive code changes and to enable rapid troubleshooting.
Configuration Steps
Group deployment machines in the cloud management platform.
Assign distinct log levels to each group.
Set an environment variable (e.g., logLevel ).
Read the variable in the application.
Set the global log level programmatically.
Code Samples
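// Read the log level assigned to this deployment group
String logLevel = System.getenv("logLevel");
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
// Convert the string to a Log4j level; falling back to INFO when the variable is unset or unrecognized is an assumption
Level realLevel = Level.toLevel(logLevel, Level.INFO);
LogManager.getRootLogger().setLevel(realLevel);
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
private static final Logger log = LoggerFactory.getLogger(MyClass.class);
When issues arose, the team used standard Linux/JVM tools for diagnosis (pid is the ID of the Java process):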
jstack -l pid >> /tmp/1.log
jstat -gc pid
jmap -heap pid
jmap -dump:live,format=b,file=/opt/heap2.bin pid
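The collection counts and times that jstat -gc prints, and the deadlock check that jstack -l performs, can also be sampled from inside the JVM through the standard java.lang.management API. The following is a minimal sketch under that assumption; the class name JvmSnapshot and its method names are hypothetical, and the article does not say the team used this approach.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class JvmSnapshot {

    /** Sum of collection counts over all collectors; undefined (-1) values are skipped. */
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sum of approximate accumulated collection time in milliseconds. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Number of threads currently deadlocked, or 0 when there are none. */
    public static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks, but only
        // where the JVM supports monitoring of ownable synchronizers.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        // Per-collector view: the old-generation collector's count is the one
        // to watch when Full GC is suspected.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

Sampling these values periodically and comparing successive readings gives the same trend information as repeated jstat runs, without shell access to the machine.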
GC analysis revealed frequent Full GC events causing latency spikes. Memory analysis with Eclipse MAT identified a large Guava LoadingCache that leaked references, leading to OOM risk.
// Assumption: each embedding is a List<Float>; substitute the type returned by getContentEmbedding.
public LoadingCache<String, Map<String, List<Float>>> contentVecCache = CacheBuilder.newBuilder()
        .maximumSize(40000)
        .expireAfterWrite(2, TimeUnit.HOURS)
        .build(new CacheLoader<String, Map<String, List<Float>>>() {
            @Override
            public Map<String, List<Float>> load(String key) {
                // Key format: itemId_dataSource
                String[] parts = key.split("_");
                String itemId = parts[0];
                int dataSource = Integer.parseInt(parts[1]);
                Map<String, List<Float>> map = new HashMap<>();
                map.put("title", wordEmbeddingService.getContentEmbedding(itemId, dataSource, EmbedContentType.TITLE));
                map.put("keyword", wordEmbeddingService.getContentEmbedding(itemId, dataSource, EmbedContentType.KEYWORD));
                return map;
            }
        });
Fixes included switching to Log4j2/Slf4j, declaring loggers and logger‑related methods static, and configuring buffered I/O for appenders:
<appender name="MyLog" class="org.apache.log4j.DailyRollingFileAppender">
<param name="File" value="/data/logs/feeds/log_info.log" />
<param name="encoding" value="UTF-8" />
<param name="DatePattern" value="'.'yyyy.MM.dd" />
<param name="Append" value="true" />
<param name="BufferSize" value="8192" />
<param name="ImmediateFlush" value="false" />
<param name="BufferedIO" value="true" />
</appender>
Experience Summary
Average response time of the recommendation SCF interface dropped by >30 ms and timeout counts decreased.
The log‑level configuration concept applies to any logging framework, not just Log4j.
The log‑level setting does not have to live in an environment variable; it can be stored anywhere the application can read it (files, Redis, etc.).
When encountering production incidents, remain calm, prioritize service recovery, then perform systematic root‑cause analysis.
Prefer Slf4j or Log4j2 to avoid known dead‑lock issues in Log4j.
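To illustrate the point that the approach is not tied to Log4j, here is a minimal sketch that applies the same idea with the JDK's built‑in java.util.logging. The class name JulLevelConfig, the mapping from Log4j‑style names to JUL levels, and the INFO fallback are assumptions made for this example. The lookup function is passed in so that the value can come from an environment variable, a file, or any other store.

```java
import java.util.Locale;
import java.util.function.Function;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class JulLevelConfig {

    /** Maps a Log4j-style level name to a java.util.logging level; unknown or missing values become INFO. */
    public static Level resolve(String name) {
        if (name == null) {
            return Level.INFO;
        }
        switch (name.trim().toUpperCase(Locale.ROOT)) {
            case "DEBUG": return Level.FINE;
            case "INFO":  return Level.INFO;
            case "WARN":  return Level.WARNING;
            case "ERROR": return Level.SEVERE;
            case "OFF":   return Level.OFF;
            default:      return Level.INFO;
        }
    }

    /**
     * Looks up the "logLevel" key through the given function and applies the
     * resulting level to the root logger and its handlers.
     */
    public static Level apply(Function<String, String> lookup) {
        Level level = resolve(lookup.apply("logLevel"));
        Logger root = Logger.getLogger("");
        root.setLevel(level);
        // Handlers filter independently of the logger, so they need the same level.
        for (Handler handler : root.getHandlers()) {
            handler.setLevel(level);
        }
        return level;
    }

    public static void main(String[] args) {
        // Read the value from the process environment, as in the Log4j version.
        Level applied = apply(System::getenv);
        Logger.getLogger(JulLevelConfig.class.getName()).info("root level is " + applied);
    }
}
```

Replacing System::getenv with a function that reads a properties file or a Redis key changes where the setting is stored without touching the rest of the code.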
Author – Yang Chunjian, Senior Java Engineer at 58 SLG Traffic Intelligence Department.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.