Why Do MySQL PhantomReferences Cause Long GC Pauses and How to Fix Them?

This article analyzes frequent timeout alerts on the getUiToken API, traces them to JVM garbage‑collection pauses caused by excessive PhantomReference objects from MySQL connections, and presents configuration, code, and scheduling solutions that dramatically reduce GC latency and improve service stability.

JD Cloud Developers

Background

The online application frequently triggered timeout alerts (timeout = 1 s). The getUiToken interface returned error code “-1” 4037 times (failure description: business request exception), exceeding the alert threshold of 50. The reported failure rate is 0 %, average response time is 150 ms, TP50 is 2 ms, TP90 is 896 ms, TP99 is 1024 ms, TP999 is 1152 ms, and the maximum is 1280 ms.

Environment Information

Server configuration: Linux 4c8g standard machine

JVM parameters:

-server -Djava.library.path=/usr/local/lib -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/export/log -Djava.awt.headless=true -Dsun.net.client.defaultConnectTimeout=60000 -Dsun.net.client.defaultReadTimeout=60000 -Djmagick.systemclassloader=no -Dnetworkaddress.cache.ttl=300 -Dsun.net.inetaddr.ttl=300 -Xms5G -Xmx5G -XX:+UseG1GC -XX:G1HeapRegionSize=4m -Xloggc:/export/log/${APP_NAME}/gc_detail.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10m -XX:MaxTenuringThreshold=15 -XX:+PrintTenuringDistribution -XX:+PrintHeapAtGC

Interface traffic:

[Figure: interface traffic chart]

Problem Investigation

Code analysis shows getUiToken only performs a simple in‑memory data fetch and return, without any time‑consuming operations.

The SGM monitoring platform shows that request latency is spent mostly in waiting, not in actual processing.

Since the business code is lightweight, the suspicion turned to garbage collection. GC logs reveal a high number of young GCs (4227) with an average interval of about 20 seconds and a maximum pause of 1.25 seconds. Although overall throughput is 99.3 % and full GCs are zero, the long Ref‑Proc pauses are abnormal.

Ref‑Proc handles soft, weak, phantom, final, JNI, and other reference types.

Enabling -XX:+PrintReferenceGC helps identify which reference type causes the longest pauses. The analysis shows most of the time is spent on PhantomReference objects.
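To compare reference types quickly, the per-type timings can be pulled out of the GC log with a small parser. The sketch below assumes the JDK 8 layout of `-XX:+PrintReferenceGC` entries (`[Type, N refs, T secs]`, with PhantomReference entries carrying two counts). The class name `RefProcLogParser` and the sample line are illustrative and are not taken from the article's logs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RefProcLogParser {
    // Assumed entry layout: "[PhantomReference, 4340 refs, 0 refs, 1.2000000 secs]".
    // The second "N refs" group is optional because only some entries carry two counts.
    private static final Pattern ENTRY = Pattern.compile(
            "\\[(\\w+Reference), (\\d+) refs(?:, \\d+ refs)?, ([0-9.]+) secs\\]");

    /** Returns reference type -> seconds spent, in the order found on the line. */
    public static Map<String, Double> secondsByType(String logLine) {
        Map<String, Double> result = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(logLine);
        while (m.find()) {
            result.merge(m.group(1), Double.parseDouble(m.group(3)), Double::sum);
        }
        return result;
    }

    /** Returns the reference type with the largest time, or null if nothing matched. */
    public static String slowestType(String logLine) {
        String slowest = null;
        double max = -1;
        for (Map.Entry<String, Double> e : secondsByType(logLine).entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                slowest = e.getKey();
            }
        }
        return slowest;
    }

    public static void main(String[] args) {
        // Hand-written sample line for illustration only.
        String sample = "[SoftReference, 0 refs, 0.0000060 secs]"
                + "[WeakReference, 12 refs, 0.0000150 secs]"
                + "[FinalReference, 3 refs, 0.0000200 secs]"
                + "[PhantomReference, 4340 refs, 0 refs, 1.2000000 secs]";
        System.out.println(slowestType(sample));
    }
}
```

If the log format of your JDK differs, only the regular expression should need adjusting.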

Heap dump analysis with MAT shows about 4340 phantom reference objects, matching the GC log count.

Root Cause

The MySQL driver (mysql‑connector‑java 5.1.44) creates a ConnectionPhantomReference for every new connection and stores it in a static ConcurrentHashMap. Even with the Druid connection pool (version 1.0.15), new connections keep being created: under the default pool configuration, idle connections are evicted after 30 minutes and the minimum number of idle connections (minIdle) is 0, so the pool repeatedly discards and re-creates connections and the set of phantom references keeps growing.

public class NonRegisteringDriver implements java.sql.Driver {
    protected static final ConcurrentHashMap<ConnectionPhantomReference, ConnectionPhantomReference> connectionPhantomRefs = new ConcurrentHashMap<>();
    protected static final ReferenceQueue<ConnectionImpl> refQueue = new ReferenceQueue<>();
    protected static void trackConnection(Connection newConn) {
        ConnectionPhantomReference phantomRef = new ConnectionPhantomReference((ConnectionImpl) newConn, refQueue);
        connectionPhantomRefs.put(phantomRef, phantomRef);
    }
}
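The effect of this pattern can be reproduced without the driver. The following is a minimal sketch, not the driver's code: `PhantomTrackingSketch`, `FakeConnection`, `track`, and `drain` are hypothetical names, and `enqueue()` is called by hand to stand in for the garbage collector noticing that a connection has become unreachable. It shows that the static map gains one entry per connection and only shrinks once a reference has been enqueued and drained.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

public class PhantomTrackingSketch {
    /** Hypothetical stand-in for the driver's ConnectionImpl. */
    public static final class FakeConnection { }

    // Static map: every tracked reference stays strongly reachable until removed.
    public static final ConcurrentHashMap<PhantomReference<FakeConnection>, PhantomReference<FakeConnection>> REFS =
            new ConcurrentHashMap<>();
    public static final ReferenceQueue<FakeConnection> QUEUE = new ReferenceQueue<>();

    /** Registers one phantom reference per connection, as trackConnection does. */
    public static PhantomReference<FakeConnection> track(FakeConnection conn) {
        PhantomReference<FakeConnection> ref = new PhantomReference<>(conn, QUEUE);
        REFS.put(ref, ref);
        return ref;
    }

    /** Removes every enqueued reference from the map and returns how many were removed. */
    public static int drain() {
        int removed = 0;
        Reference<? extends FakeConnection> ref;
        while ((ref = QUEUE.poll()) != null) {
            if (REFS.remove(ref) != null) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Keep the connections strongly reachable so the GC cannot enqueue them early.
        List<FakeConnection> live = new ArrayList<>();
        PhantomReference<FakeConnection> first = null;
        for (int i = 0; i < 3; i++) {
            FakeConnection conn = new FakeConnection();
            live.add(conn);
            PhantomReference<FakeConnection> ref = track(conn);
            if (first == null) {
                first = ref;
            }
        }
        System.out.println("tracked after 3 connections: " + REFS.size());

        first.enqueue(); // manual stand-in for the GC enqueueing the reference
        System.out.println("drained: " + drain() + ", still tracked: " + REFS.size());
        System.out.println("live connections held: " + live.size());
    }
}
```

Until a reference reaches the queue, its map entry stays, which is why a pool that keeps creating connections makes the map grow.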

How to Solve

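1. Optimize the Druid pool settings: raise the minimum number of idle connections (minIdle) and the maximum number of active connections, and extend the idle keep‑alive time to 59 minutes so that fewer new connections are created. The 59 minutes sit just under a MySQL wait_timeout of 3600 s; MySQL's built‑in default is 28800 s, so check the value configured on your server.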

spring:
  datasource:
    url: jdbc:mysql://xxxx?useUnicode=true&characterEncoding=utf8&allowMultiQueries=true&serverTimezone=GMT+8
    username: xxxx
    password: xxxx
    driver-class-name: com.mysql.jdbc.Driver
    type: com.alibaba.druid.pool.DruidDataSource
    druid:
      minIdle: 4
      maxActive: 10
      initialSize: 4
      testWhileIdle: true
      testOnBorrow: false
      testOnReturn: false
      validationQuery: select 1
      timeBetweenEvictionRunsMillis: 60000
      minEvictableIdleTimeMillis: 3540000
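The value 3540000 is simply 59 minutes in milliseconds. A small sanity check along these lines can confirm that the pool's idle eviction time stays below the server's wait_timeout. `IdleTimeoutCheck` and its method are hypothetical helpers, and the 3600 s figure is the one quoted in this article rather than a value read from a server.

```java
import java.util.concurrent.TimeUnit;

public class IdleTimeoutCheck {
    /** True when the pool evicts idle connections before the server would drop them. */
    public static boolean evictsBeforeServerTimeout(long minEvictableIdleMillis, long waitTimeoutSeconds) {
        return minEvictableIdleMillis < TimeUnit.SECONDS.toMillis(waitTimeoutSeconds);
    }

    public static void main(String[] args) {
        // 59 minutes, matching minEvictableIdleTimeMillis in the configuration.
        long minEvictableIdleMillis = TimeUnit.MINUTES.toMillis(59);
        // Assumed server setting; verify with: SHOW VARIABLES LIKE 'wait_timeout'
        long waitTimeoutSeconds = 3600;

        System.out.println("minEvictableIdleTimeMillis = " + minEvictableIdleMillis);
        System.out.println("below wait_timeout: "
                + evictsBeforeServerTimeout(minEvictableIdleMillis, waitTimeoutSeconds));
    }
}
```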

2. Enable parallel reference processing to shorten Ref‑Proc pauses: -XX:+ParallelRefProcEnabled

3. Periodically clean the phantom reference map when it grows large (e.g., > 500 entries) using a scheduled task:

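import com.mysql.jdbc.NonRegisteringDriver;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import javax.annotation.PostConstruct;
import javax.annotation.PreDestroy;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Component
public class CleanPhantomRefsSchedule {
    // The original snippet used an undeclared "log"; an SLF4J logger is assumed here.
    private static final Logger log = LoggerFactory.getLogger(CleanPhantomRefsSchedule.class);

    // Single daemon thread, so the cleaner never keeps the JVM alive on shutdown.
    private static final ScheduledExecutorService CLEANER_EXECUTOR = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "mysql-phantom-ref-cleaner");
        t.setDaemon(true);
        return t;
    });

    @PostConstruct
    public void doTask() {
        try {
            // connectionPhantomRefs is a protected static field, so it is read via reflection.
            Field field = NonRegisteringDriver.class.getDeclaredField("connectionPhantomRefs");
            field.setAccessible(true);
            CLEANER_EXECUTOR.scheduleAtFixedRate(() -> {
                try {
                    Map<?, ?> refs = (Map<?, ?>) field.get(null);
                    if (refs != null && refs.size() > 500) {
                        // Capture the size before clear(); afterwards size() is 0.
                        int count = refs.size();
                        refs.clear();
                        log.info("Cleared MySQL phantom references (count={})", count);
                    }
                } catch (Exception e) {
                    log.error("connectionPhantomRefs clear error!", e);
                }
            }, 1, 1, TimeUnit.HOURS);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException("Failed to initialize MySQL phantom refs field", e);
        }
    }

    @PreDestroy
    void shutdown() {
        CLEANER_EXECUTOR.shutdownNow();
    }
}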

4. If compatibility permits, upgrade mysql‑connector‑java to 8.0.22 or later, where phantom reference generation can be disabled with the JVM flag -Dcom.mysql.cj.disableAbandonedConnectionCleanup=true.
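When the startup script cannot easily be changed, the same property can be set in code. The sketch below assumes the driver reads the property once, when its cleanup class is first loaded, so the call has to run before any JDBC code touches the driver. `DisableCleanupBootstrap` is a hypothetical class name; only the property name comes from the flag above.

```java
public class DisableCleanupBootstrap {
    public static final String PROP = "com.mysql.cj.disableAbandonedConnectionCleanup";

    /** Sets the property unless it was already given on the command line. */
    public static void disableAbandonedConnectionCleanup() {
        if (System.getProperty(PROP) == null) {
            System.setProperty(PROP, "true");
        }
    }

    public static void main(String[] args) {
        // Must run first: the driver is assumed to read the property only once,
        // during class initialization.
        disableAbandonedConnectionCleanup();

        // ... start the application (data source, Spring context, etc.) here ...
        System.out.println(PROP + "=" + Boolean.getBoolean(PROP));
    }
}
```

An explicit -D value on the command line still wins, because the method only sets the property when it is absent.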

Verification of Optimizations

After applying the code and configuration changes, the service was redeployed and observed for a day. GC logs showed the maximum pause reduced from 1.25 s to 0.1 s, young GC frequency improved from every ~20 seconds to once every 6 minutes, and PhantomReference processing time dropped to 0.0001966 s. No further timeout alerts appeared and the system responded normally, confirming the effectiveness of the optimizations.
