
Walk through how you'd diagnose a memory leak in a long-running Java service.

Tags: Core Java · Senior · memory-leak · jvm · heap-dump · profiling · oom · debugging

Short answer

Diagnosing a Java memory leak follows a five-step process: observe the symptom (heap usage grows steadily and eventually ends in an OutOfMemoryError), capture a heap dump, analyse it with a heap analyser to find the largest retained object graphs, trace the reference chain back to a GC root to identify what is keeping the objects alive, then fix the retention and verify the fix with a long-running soak test.

Detail

Step 1 — Confirm it's a leak, not a sizing issue

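# Watch heap usage over time
jstat -gcutil <pid> 5000  # every 5 s: the O column (old gen %) should plateau, not climb

If old-gen occupancy keeps climbing (for example past 80%) and GC runs more and more often while reclaiming less each time, it is probably a leak. If usage is high but levels off after full GCs, the heap is more likely just undersized for the workload.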
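The same signal can be read from inside the process. The sketch below is a minimal in-JVM counterpart to the jstat check, assuming the standard java.lang.management API; the class name PostGcHeapSampler is hypothetical. It reports how many bytes each heap pool still held after its most recent collection, which is the number that should stay flat in a healthy service.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: samples heap pools the way jstat does, but in-process.
class PostGcHeapSampler {

    /** Returns pool name -> bytes still in use after that pool's most recent GC. */
    static Map<String, Long> sampleAfterGc() {
        Map<String, Long> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip Metaspace, code cache, and other non-heap pools
            }
            // getCollectionUsage() returns null when the pool does not support it
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                result.put(pool.getName(), afterGc.getUsed());
            }
        }
        return result;
    }
}
```

Logging sampleAfterGc() every few minutes and plotting the old-generation entry gives the same trend line as jstat. Pool names differ between collectors, so the key to watch depends on the GC in use.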

Step 2 — Trigger a heap dump

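# While running (note: the application is paused while the dump is written)
jmap -dump:format=b,file=/tmp/heap.hprof <pid>
# Alternative on current JDKs: jcmd <pid> GC.heap_dump /tmp/heap.hprof

# Or configure the JVM to dump automatically on OOM
# -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/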
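When attaching jmap to the process is not practical, a dump can also be triggered from inside the service. The sketch below assumes a HotSpot-based JVM, since com.sun.management.HotSpotDiagnosticMXBean is a JDK-specific API; the class name HeapDumper and the file naming scheme are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

// Hypothetical helper: writes an .hprof file for the current JVM (HotSpot only).
class HeapDumper {

    /**
     * Dumps the heap into dir and returns the file that was written.
     * liveOnly = true dumps only reachable objects, like jmap's "live" option.
     */
    static Path dump(Path dir, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file, so use a unique name
        Path file = dir.resolve("heap-" + System.nanoTime() + ".hprof");
        try {
            bean.dumpHeap(file.toString(), liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file;
    }
}
```

Such a method is usually exposed behind an admin-only endpoint or a JMX operation, because the dump pauses the application and the file can contain sensitive data.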

Step 3 — Analyse the dump

Open the dump in Eclipse MAT or VisualVM. Use:

  • Dominator tree — shows which objects retain the most memory
  • Leak suspects report — MAT heuristic for unusual retention

Common leak patterns in QA automation:

Root cause                            Typical retained class
WebDriver never quit()                ChromeDriver / RemoteWebDriver
Static Map / List that only grows     HashMap, ArrayList
Listener never removed                EventListenerList
ThreadLocal never removed             ThreadLocal.ThreadLocalMap.Entry
Streams never closed                  BufferedInputStream
Reflection / proxy class generation   GeneratedMethodAccessorN, $ProxyN (class metadata, held in Metaspace)
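The static collection row is the most common pattern in the table, so it is worth seeing what the fix can look like. The sketch below is one possible approach, not the only one: a size-capped LRU map built on LinkedHashMap. The class name BoundedCache and the cap are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Leaky version, for contrast: entries are added per test run and never evicted.
//   static final Map<String, byte[]> RESULTS = new HashMap<>();

// Hypothetical fix: a map that evicts its least recently used entry once full.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true drops the eldest entry
        return size() > maxEntries;
    }
}
```

LinkedHashMap is not thread-safe, so a cache shared between parallel tests would need external synchronisation (for example Collections.synchronizedMap) or a dedicated caching library.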

Step 4 — Find the GC root path

In MAT: right-click the dominant object → "Path to GC Roots" → exclude weak/soft/phantom references, so that only strong reference chains remain. This shows the exact field chain keeping the object alive.
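To make the idea of a field chain concrete, here is a small sketch of the listener pattern that MAT typically surfaces. All names (EventBus, ReportPage) are hypothetical. In a dump, the path to the GC root would read roughly: EventBus class → LISTENERS → the list's backing array → ReportPage → screenshot.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical event bus: the static field keeps every registered listener reachable.
class EventBus {
    private static final List<Runnable> LISTENERS = new CopyOnWriteArrayList<>();

    static void register(Runnable listener) {
        LISTENERS.add(listener);
    }

    static void unregister(Runnable listener) {
        LISTENERS.remove(listener); // breaks the chain so the listener can be collected
    }

    static int listenerCount() {
        return LISTENERS.size();
    }
}

// Hypothetical listener carrying a large payload that is retained along with it.
class ReportPage implements Runnable {
    private final byte[] screenshot = new byte[1024 * 1024];

    @Override
    public void run() {
        // react to the event
    }

    int retainedBytes() {
        return screenshot.length;
    }
}
```

As long as a ReportPage stays registered, its 1 MB screenshot cannot be collected, even if nothing else in the test references the page. Calling EventBus.unregister in teardown removes the last strong reference.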

Step 5 — Fix and soak test

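import org.junit.jupiter.api.AfterEach;
import org.openqa.selenium.WebDriver;

// Typical fix: always quit in teardown
class BrowserTest {  // hypothetical test class
    private WebDriver driver;  // assumed to be created in a @BeforeEach method

    @AfterEach
    void tearDown() {
        if (driver != null) {
            driver.quit();  // closes all windows and ends the browser and driver processes
            driver = null;  // drop the reference so the driver object can be collected
        }
    }
}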

// ThreadLocal fix: always remove after use
class UserContextRunner {  // hypothetical wrapper class
    private static final ThreadLocal<String> userContext = new ThreadLocal<>();

    static void runAs(String userId, Runnable test) {
        userContext.set(userId);
        try {
            test.run();
        } finally {
            userContext.remove();  // prevents a leak in thread pools, where threads are reused
        }
    }
}

Run a soak test for several hours under realistic load. If OldGen plateaus, the leak is fixed; if it is still climbing, there are more roots to find.
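A soak check can be automated along these lines. This is a rough sketch with hypothetical names (SoakCheck, strictlyClimbing), and it assumes that System.gc() actually triggers a collection, which the JVM treats only as a hint. The trend rule is deliberately crude; a real check would run far longer and tolerate noise.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical soak harness: runs a workload in rounds and samples heap after GC.
class SoakCheck {

    /** Used heap in bytes after requesting a GC (the request is only a hint). */
    static long usedHeapAfterGc() {
        System.gc();
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs the workload and records one post-GC heap sample per round. */
    static List<Long> run(Runnable workload, int rounds, int iterationsPerRound) {
        List<Long> samples = new ArrayList<>();
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < iterationsPerRound; i++) {
                workload.run();
            }
            samples.add(usedHeapAfterGc());
        }
        return samples;
    }

    /** Crude heuristic: true only if every sample is higher than the previous one. */
    static boolean strictlyClimbing(List<Long> samples) {
        if (samples.size() < 2) {
            return false;
        }
        for (int i = 1; i < samples.size(); i++) {
            if (samples.get(i) <= samples.get(i - 1)) {
                return false;
            }
        }
        return true;
    }
}
```

A run such as SoakCheck.run(suite, 50, 1000), where suite is the workload under test, followed by strictlyClimbing on the result gives a quick yes/no signal; the raw samples are still worth plotting, because a slow leak can hide behind a single dip.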

// WHAT INTERVIEWERS LOOK FOR

A structured diagnostic process, not guessing. Knowledge of jstat, jmap, heap dump analysis with MAT. Identifying common leak patterns (static collections, ThreadLocal misuse, unclosed resources). QA-specific candidates impress by mentioning WebDriver lifecycle as a known leak vector.

// COMMON PITFALL

Only increasing -Xmx instead of diagnosing. Adding heap space defers the OOM but doesn't fix the leak — the service will eventually OOM again, just later.