Why MySQL getUiToken Times Out: PhantomReference GC Bottleneck & Fix

The article investigates frequent timeout alerts on the getUiToken API, tracing the issue to excessive PhantomReference objects causing long GC pauses, explains how MySQL’s AbandonedConnectionCleanupThread and default Druid pool settings generate these references, and presents configuration tweaks, JVM flags, and cleanup tasks that reduced GC pause times from 1.25 s to 0.1 s.

JD Tech Talk
JD Tech Talk
JD Tech Talk
Why MySQL getUiToken Times Out: PhantomReference GC Bottleneck & Fix

Background

Online service frequently raised timeout alerts (1 s) for the getUiToken API, with 4037 occurrences of error code “-1”. The failure rate was 0 % but response times showed long tails (TP99 = 1024 ms, max = 1280 ms).

Environment Information

Server: Linux 4‑core 8 GB

JVM parameters:

-server -Djava.library.path=/usr/local/lib -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/export/log -Djava.awt.headless=true -Dsun.net.client.defaultConnectTimeout=60000 -Dsun.net.client.defaultReadTimeout=60000 -Djmagick.systemclassloader=no -Dnetworkaddress.cache.ttl=300 -Dsun.net.inetaddr.ttl=300 -Xms5G -Xmx5G -XX:+UseG1GC -XX:G1HeapRegionSize=4m -Xloggc:/export/log/${APP_NAME}/gc_detail.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10m -XX:MaxTenuringThreshold=15 -XX:+PrintTenuringDistribution -XX:+PrintHeapAtGC

Interface traffic:

Problem Investigation

Code analysis showed getUiToken only reads memory data, so the delay is not from business logic. Monitoring with SGM revealed that most of the latency occurs while waiting, suggesting GC pauses.

GC logs indicated a high number of Young GC events (4227) with occasional long pauses (max 1.25 s). Detailed GC logs showed that Ref‑Proc phase spent ~977 ms processing Ref objects.

Ref‑proc handles soft, weak, phantom, final, JNI references.

Further inspection identified many PhantomReference objects, especially ConnectionPhantomReference created by MySQL’s driver.

public class NonRegisteringDriver implements java.sql.Driver {
    protected static final ConcurrentHashMap<ConnectionPhantomReference, ConnectionPhantomReference> connectionPhantomRefs = new ConcurrentHashMap<>();
    protected static final ReferenceQueue<ConnectionImpl> refQueue = new ReferenceQueue<>();
    protected static void trackConnection(Connection newConn) {
        ConnectionPhantomReference phantomRef = new ConnectionPhantomReference((ConnectionImpl) newConn, refQueue);
        connectionPhantomRefs.put(phantomRef, phantomRef);
    }
}

The default Druid datasource configuration creates a new connection each time, adding entries to connectionPhantomRefs. Because the pool’s idle timeout is 30 minutes and minIdle is 0, connections are continuously created, inflating the phantom reference set and lengthening Ref‑Proc.

Solutions

Adjust Druid pool settings: increase minimum idle connections and connection lifetime (e.g., minIdle = 4, maxActive = 10, minEvictableIdleTimeMillis ≈ 3540000 ms) and extend MySQL wait_timeout to 59 minutes.

Enable parallel reference processing: add JVM flag -XX:+ParallelRefProcEnabled.

Periodically clean the phantom reference map when its size exceeds a threshold:

@Component
public class CleanPhantomRefsSchedule {
    private static final ScheduledExecutorService CLEANER_EXECUTOR =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "mysql-phantom-ref-cleaner");
            t.setDaemon(true);
            return t;
        });

    @PostConstruct
    public void doTask() {
        try {
            Field field = NonRegisteringDriver.class.getDeclaredField("connectionPhantomRefs");
            field.setAccessible(true);
            CLEANER_EXECUTOR.scheduleAtFixedRate(() -> {
                try {
                    Map<?,?> refs = (Map<?,?>) field.get(null);
                    if (refs != null && refs.size() > 500) {
                        refs.clear();
                        log.info("Cleared MySQL phantom references (count={})", refs.size());
                    }
                } catch (Exception e) {
                    log.error("connectionPhantomRefs clear error!", e);
                }
            }, 1, 1, TimeUnit.HOURS);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException("Failed to initialize MySQL phantom refs field", e);
        }
    }

    @PreDestroy
    void shutdown() {
        CLEANER_EXECUTOR.shutdownNow();
    }
}

Upgrade mysql-connector-java to 8.x or disable phantom reference generation with -Dcom.mysql.cj.disableAbandonedConnectionCleanup=true.

Verification of Optimization

After applying the changes and redeploying, GC pause max dropped from 1.25 s to 0.1 s, Young GC frequency improved from every ~20 s to every 6 minutes, and PhantomReference processing time fell to 0.0001966 s. No further timeout alerts were observed, confirming the effectiveness of the optimizations.

JavaPerformance TuningMySQLgcDruidPhantomReference
JD Tech Talk
Written by

JD Tech Talk

Official JD Tech public account delivering best practices and technology innovation.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.