Optimizing Recommendation Service Response Time via Log Level Configuration and Code Refactoring

This article describes how the team reduced recommendation service response time by refactoring code and configuring log levels per deployment group through environment variables, and how it diagnosed performance issues through GC analysis, thread monitoring, and memory leak detection. Average latency dropped by more than 30 ms.

58 Tech

Background – In recommendation scenarios, sub‑200 ms response times are critical for user experience, but the growing number of backend algorithms and rising service complexity make that target hard to meet. Adjusting log levels per deployment group offers a compromise: logging overhead can be tuned for performance without giving up the ability to troubleshoot.

Scenario – While refactoring the 58 Tongzhen recommendation service, the team explored log‑level based optimizations using SCF (a Service Communication Framework similar to Dubbo), Log4j, and JDK 8.

Key Optimizations

Remove unnecessary lambda expressions from collection‑handling code.

Copy bean properties with direct getter/setter calls rather than generic copy utilities (CGLib‑ or reflection‑based), which perform worse.

Shift heavy real‑time algorithm calculations to offline preprocessing.

Cache static whitelist/blacklist data in Guava memory cache with expiration, falling back to Redis.

Leverage multithreading to parallelize candidate recall, ranking, and fusion models.

Avoid giving non‑accessor methods names that start with get: serialization frameworks such as FastJSON treat getXxx methods as bean properties and can invoke them unintentionally through reflection.

Reduce payload size and number of service calls (e.g., transmit only IDs instead of full content).
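As an illustration of the multithreading item above, the following is a minimal sketch of running independent recall sources in parallel and merging their results. The class name ParallelRecall, the method recallAll, and the three stand-in sources are hypothetical; the article does not show the service's actual recall code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch: run independent recall sources in parallel and merge the results.
class ParallelRecall {

    // Submits every source to the pool, then concatenates the results in source order.
    static List<String> recallAll(List<Supplier<List<String>>> sources, ExecutorService pool) {
        List<CompletableFuture<List<String>>> futures = new ArrayList<>();
        for (Supplier<List<String>> source : sources) {
            futures.add(CompletableFuture.supplyAsync(source, pool));
        }
        List<String> merged = new ArrayList<>();
        for (CompletableFuture<List<String>> future : futures) {
            merged.addAll(future.join()); // join() blocks until that source has finished
        }
        return merged;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Supplier<List<String>>> sources = new ArrayList<>();
            sources.add(() -> Arrays.asList("hot-1", "hot-2")); // stand-in for a popularity recall
            sources.add(() -> Arrays.asList("cf-7"));           // stand-in for collaborative filtering
            sources.add(() -> Arrays.asList("geo-3", "geo-9")); // stand-in for a location-based recall
            System.out.println(recallAll(sources, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape the total wait is bounded by the slowest source rather than the sum of all sources. A production version would also need per-source timeouts and error handling, which are omitted here.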

Log‑level configuration was introduced to avoid repetitive code changes and to enable rapid troubleshooting.

Configuration Steps

Group deployment machines in the cloud management platform.

Assign distinct log levels to each group.

Set an environment variable (e.g., logLevel).

Read the variable in the application.

Set the global log level programmatically.

Code Samples

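// Log4j 1.x: read the level name from the environment and apply it to the root logger
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;

String logLevel = System.getenv("logLevel");
// Level.toLevel returns the given default when the name is null or unrecognized
// (INFO as the default is an assumption; the original does not show how realLevel is derived)
Level realLevel = Level.toLevel(logLevel, Level.INFO);
LogManager.getRootLogger().setLevel(realLevel);

// Slf4j: declare the logger as a static final field
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
private static final Logger log = LoggerFactory.getLogger(MyClass.class);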
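The fragments above can be combined into one self-contained sketch. To keep it runnable without external dependencies, this version uses java.util.logging instead of Log4j, so the level names differ (for example WARNING and FINE rather than WARN and DEBUG). The class name LogLevelConfig, its methods, and the INFO fallback are assumptions made for illustration.

```java
import java.util.Locale;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch using java.util.logging so that it runs without Log4j on the classpath.
class LogLevelConfig {

    // Maps a level name such as "WARNING" to a Level; blank or unknown names yield the fallback.
    static Level parseLevel(String name, Level fallback) {
        if (name == null || name.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Level.parse(name.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    // Sets the level on the root logger; loggers without their own level inherit it.
    // Handlers keep their own levels and may need adjusting separately.
    static void applyRootLevel(Level level) {
        Logger.getLogger("").setLevel(level);
    }

    public static void main(String[] args) {
        // "logLevel" is the example variable name from the configuration steps.
        Level level = parseLevel(System.getenv("logLevel"), Level.INFO);
        applyRootLevel(level);
        System.out.println("root log level = " + Logger.getLogger("").getLevel());
    }
}
```

Falling back to a default instead of failing means a missing or mistyped variable cannot stop the service from starting.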

When issues arose, the team used standard Linux/JVM tools for diagnosis:

jstack -l pid >> /tmp/1.log
jstat -gc pid
jmap -heap pid
jmap -dump:live,format=b,file=/opt/heap2.bin pid
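As a complement to jstat -gc, cumulative GC counts and times can also be read from inside the JVM through the standard java.lang.management API. The sketch below is an illustration, not something the article says the team used. The class name GcSnapshot is hypothetical, and the collector names it prints depend on the JVM and the garbage collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read cumulative GC statistics from inside the running JVM.
class GcSnapshot {

    // Sums collection counts across all collectors; undefined values (-1) are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Sums approximate accumulated collection time in milliseconds; undefined values are skipped.
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling these values periodically and comparing consecutive snapshots shows how often collections occur and how much time they take over an interval.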

GC analysis revealed frequent Full GC events causing latency spikes. Memory analysis with Eclipse MAT identified a large Guava LoadingCache that leaked references, leading to OOM risk.

// Requires Guava: com.google.common.cache.CacheBuilder, CacheLoader, LoadingCache
// Key format: "<itemId>_<dataSource>"; the value maps a content type to its embedding vector.
public LoadingCache<String, Map<String, List<Double>>> contentVecCache = CacheBuilder.newBuilder()
    .maximumSize(40000)                   // cap on the number of cached entries
    .expireAfterWrite(2, TimeUnit.HOURS)  // entries expire two hours after being written
    .build(new CacheLoader<String, Map<String, List<Double>>>() {
        @Override
        public Map<String, List<Double>> load(String key) {
            String[] parts = key.split("_");
            String itemId = parts[0];
            int dataSource = Integer.parseInt(parts[1]);
            Map<String, List<Double>> map = new HashMap<>();
            map.put("title", wordEmbeddingService.getContentEmbedding(itemId, dataSource, EmbedContentType.TITLE));
            map.put("keyword", wordEmbeddingService.getContentEmbedding(itemId, dataSource, EmbedContentType.KEYWORD));
            return map;
        }
    });
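To get a feel for how much heap a cache of this shape can hold, the following back-of-the-envelope sketch estimates the vector payload alone. Every size in it is an assumption: 16 bytes per boxed Double and 4 bytes per reference are typical for a 64-bit HotSpot JVM with compressed references, and the embedding dimension of 128 is hypothetical because the article does not state it. Map, list, and key overhead are ignored, and the estimate does not explain the leaked references themselves.

```java
// Back-of-the-envelope heap estimate for the embedding cache; all sizes are assumptions.
class CacheFootprint {

    static final long DOUBLE_OBJECT_BYTES = 16;   // assumed size of one boxed java.lang.Double
    static final long REFERENCE_BYTES = 4;        // assumed size of one compressed reference
    static final long PRIMITIVE_DOUBLE_BYTES = 8; // size of one primitive double

    // Payload when vectors are stored as List<Double>: one object plus one reference per element.
    static long boxedBytes(long entries, long vectorsPerEntry, long dimension) {
        return entries * vectorsPerEntry * dimension * (DOUBLE_OBJECT_BYTES + REFERENCE_BYTES);
    }

    // Payload if the same vectors were stored as double[] instead.
    static long primitiveBytes(long entries, long vectorsPerEntry, long dimension) {
        return entries * vectorsPerEntry * dimension * PRIMITIVE_DOUBLE_BYTES;
    }

    public static void main(String[] args) {
        long entries = 40000;  // maximumSize from the cache definition
        long vectors = 2;      // "title" and "keyword"
        long dimension = 128;  // hypothetical embedding dimension
        System.out.println("boxed bytes:     " + boxedBytes(entries, vectors, dimension));
        System.out.println("primitive bytes: " + primitiveBytes(entries, vectors, dimension));
    }
}
```

Under these assumptions a full cache holds about 195 MiB of vector payload as List<Double>, against about 78 MiB for primitive arrays, which is the kind of retained size a heap dump analysis in MAT would surface.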

Fixes included switching to Log4j2/Slf4j, adding static to logger‑related methods, and configuring buffered I/O for appenders:

<appender name="MyLog" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="File" value="/data/logs/feeds/log_info.log" />
    <param name="encoding" value="UTF-8" />
    <param name="DatePattern" value="'.'yyyy.MM.dd" />
    <param name="Append" value="true" />
    <param name="BufferSize" value="8192" />
    <param name="ImmediateFlush" value="false" />
    <param name="BufferedIO" value="true" />
</appender>
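The effect of BufferedIO can be illustrated with plain JDK streams. The sketch below does not use Log4j: it counts how many write calls reach a stand-in sink with and without an 8192-byte buffer, matching the BufferSize value in the appender configuration. The class names BufferedWriteDemo and CountingSink and the sample log line are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Sketch: count how many writes reach the underlying stream with and without buffering.
class BufferedWriteDemo {

    // Stand-in for a log file: discards the data but counts the write calls it receives.
    static class CountingSink extends OutputStream {
        long writeCalls;
        long bytes;

        @Override
        public void write(int b) {
            writeCalls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            bytes += len;
        }
    }

    // Writes the same line `messages` times to `out`, then flushes once at the end.
    static void writeLines(OutputStream out, String line, int messages) {
        byte[] data = line.getBytes(StandardCharsets.UTF_8);
        try {
            for (int i = 0; i < messages; i++) {
                out.write(data);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String line = "INFO recommend request handled\n";
        CountingSink direct = new CountingSink();
        writeLines(direct, line, 1000);
        CountingSink buffered = new CountingSink();
        writeLines(new BufferedOutputStream(buffered, 8192), line, 1000);
        System.out.println("unbuffered write calls: " + direct.writeCalls);
        System.out.println("buffered write calls:   " + buffered.writeCalls);
    }
}
```

The trade-off is that buffered lines reach the file only when the buffer fills or is flushed, so the most recent log entries can be delayed, or lost if the process dies before a flush.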

Experience Summary

The average response time of the recommendation SCF interface dropped by more than 30 ms, and timeout counts decreased.

The log‑level configuration concept applies to any logging framework, not just Log4j.

The log‑level setting does not have to be an environment variable; it can be stored anywhere the application can read it (files, Redis, etc.).

When encountering production incidents, remain calm, prioritize service recovery, then perform systematic root‑cause analysis.

Prefer Slf4j or Log4j2 to avoid the known deadlock issues in Log4j 1.x.
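The point about storing the setting elsewhere can be sketched as a small lookup chain: a properties source is consulted first, then the environment value, then a default. The class name LogLevelSource, the method resolve, and this lookup order are assumptions for illustration; a Redis-backed source would follow the same pattern but is not shown.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch: resolve the log level name from a properties source, then the environment, then a default.
class LogLevelSource {

    // Returns the first non-blank value among the properties source, envValue, and defaultLevel.
    static String resolve(Reader propertiesSource, String envValue, String defaultLevel) {
        Properties props = new Properties();
        if (propertiesSource != null) {
            try {
                props.load(propertiesSource);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        String fromProperties = props.getProperty("logLevel");
        if (fromProperties != null && !fromProperties.trim().isEmpty()) {
            return fromProperties.trim();
        }
        if (envValue != null && !envValue.trim().isEmpty()) {
            return envValue.trim();
        }
        return defaultLevel;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a configuration file here.
        Reader source = new StringReader("logLevel=DEBUG\n");
        System.out.println(resolve(source, System.getenv("logLevel"), "INFO"));
    }
}
```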


Author – Yang Chunjian, Senior Java Engineer at 58 SLG Traffic Intelligence Department.
