
Finding Memory Leaks with tracemalloc Snapshots

A service's resident memory climbs steadily and never plateaus; restarts are the only mitigation. The leak is not a crash, so there is no traceback to follow — just a number going up. tracemalloc snapshots turn that into an exact line: bracket the suspect operation with two snapshots, diff them, and the line whose retained bytes grew with iteration count is your leak.

Prerequisites

Solution

The technique relies on the fact that a real leak grows linearly with iteration count, while warm-up allocations (caches, interned strings, lazy imports) happen once and then plateau. Warm up first, take a baseline snapshot, run the suspect operation many times, take a second snapshot, then diff the two with compare_to.

Python
import tracemalloc

# A classic leak: an unbounded module-level cache that nothing ever evicts.
_CACHE = {}

def handle_request(request_id):
    # Each call retains a 1 KiB payload keyed by id; keys are never removed.
    _CACHE[request_id] = bytes(1024)
    return _CACHE[request_id]


tracemalloc.start(25)                 # 25 frames so we can see the call path

handle_request(-1)                    # warm-up pass: absorb one-time allocations
baseline = tracemalloc.take_snapshot()

for i in range(10_000):               # loop the suspect operation many times
    handle_request(i)

after = tracemalloc.take_snapshot()

# Diff the two snapshots; size_diff is byte growth between them.
top = after.compare_to(baseline, "lineno")
for stat in top[:3]:
    print(f"+{stat.size_diff/1024:8.1f} KiB  count {stat.count_diff:>6}  {stat.traceback[0]}")
Plain text
+10240.0 KiB  count  10000  leak.py:9
+    1.2 KiB  count     31  leak.py:18

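The first entry, reported as line 9, is the _CACHE[request_id] = bytes(1024) assignment (line numbers refer to the saved leak.py, so they can differ slightly from the listing above, and exact byte counts vary with the Python version). It grew by ~10 MiB across 10,000 iterations with a matching count_diff of 10,000 blocks: one retained block of about 1 KiB per iteration. Growth in both bytes and block count that tracks the iteration count is the signature of a leak. To see who drove the allocation, switch the grouping to traceback and format the path: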

Python
top_tb = after.compare_to(baseline, "traceback")
print("\n".join(top_tb[0].traceback.format()))   # full call stack to the leaking line

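If the same leaking line is reached from many callers, the 'traceback' grouping separates them so you can tell which call site is unbounded. This only works because tracing was started with 25 frames: with the default of a single frame (nframe=1), every traceback would contain just the allocating line and the callers would be indistinguishable.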
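As a rough illustration of that case, the sketch below routes one allocating line through two callers, one bounded and one unbounded, and sums the diff per caller. All names here (make_payload, record_recent, record_history, growth_via) are hypothetical and not part of the original example, and matching stack frames to a function by its line numbers is just one convenient way to attribute growth.

```python
import dis
import tracemalloc
from collections import deque

_RECENT = deque(maxlen=8)    # bounded: old payloads fall off the left end
_HISTORY = []                # unbounded: nothing is ever removed


def make_payload():
    return bytes(1024)       # the one allocating line both callers share


def record_recent():
    _RECENT.append(make_payload())


def record_history():
    _HISTORY.append(make_payload())


def growth_via(stats, func):
    """Sum size_diff over entries whose call stack passes through func."""
    code = func.__code__
    lines = {line for _, line in dis.findlinestarts(code) if line is not None}
    return sum(
        stat.size_diff
        for stat in stats
        if any(f.filename == code.co_filename and f.lineno in lines
               for f in stat.traceback)
    )


tracemalloc.start(25)        # deep enough to keep the caller frames

for _ in range(8):           # warm-up: fill the bounded deque
    record_recent()
record_history()
baseline = tracemalloc.take_snapshot()

for _ in range(2_000):
    record_recent()
    record_history()
after = tracemalloc.take_snapshot()
tracemalloc.stop()

by_stack = after.compare_to(baseline, "traceback")
recent_growth = growth_via(by_stack, record_recent)
history_growth = growth_via(by_stack, record_history)
print(f"via record_recent : {recent_growth / 1024:8.1f} KiB")
print(f"via record_history: {history_growth / 1024:8.1f} KiB")
```

Grouped by lineno, both callers would collapse into the single make_payload line. Grouped by traceback, the growth through record_history should be on the order of 2 MiB while the growth through record_recent stays near zero, which points at the unbounded call site.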

Why this works

A snapshot records the currently live tracked allocations. Anything freed between the two snapshots does not appear in the diff, so transient buffers cancel out and only retained growth survives. Because a leak retains a new block every iteration, its count_diff scales with the loop count while bounded structures stay flat. Grouping by lineno collapses all blocks from the offending line into a single ranked entry, and compare_to returns its results sorted by the absolute value of size_diff, so the worst offender comes first.
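The cancellation and the linear scaling can both be checked directly. The following is a minimal sketch, assuming two hypothetical operations (transient_work, which frees its buffer, and leaky_work, which retains it) and a hypothetical helper retained_growth that measures net growth at two loop sizes. It counts only allocations made in the script's own file, because the snapshots themselves also occupy traced memory.

```python
import tracemalloc

_LEAKED = []


def transient_work():
    scratch = bytearray(4096)        # released when the function returns
    return len(scratch)


def leaky_work():
    _LEAKED.append(bytes(1024))      # retained for the life of the process


def retained_growth(operation, iterations):
    """Net bytes allocated in this file that are still live after the loop."""
    this_file = operation.__code__.co_filename
    operation()                      # warm-up pass before the baseline
    baseline = tracemalloc.take_snapshot()
    for _ in range(iterations):
        operation()
    after = tracemalloc.take_snapshot()
    return sum(
        stat.size_diff
        for stat in after.compare_to(baseline, "lineno")
        # ignore allocations made elsewhere, e.g. by the snapshots themselves
        if stat.traceback[0].filename == this_file
    )


tracemalloc.start()
results = {}
for operation in (transient_work, leaky_work):
    results[operation.__name__] = (
        retained_growth(operation, 1_000),
        retained_growth(operation, 2_000),
    )
tracemalloc.stop()

for name, (small, large) in results.items():
    print(f"{name:15} 1,000 iters: {small:>9} B   2,000 iters: {large:>9} B")
```

The transient operation should report growth close to zero at both loop sizes, while the leaky one should roughly double when the iteration count doubles. Running the suspect operation at two sizes like this is a cheap way to confirm that a line in the diff is a leak rather than a one-time allocation.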

Edge cases and failure modes

  • Warm-up not excluded: skipping the baseline-after-warm-up step floods the diff with import and cache allocations that look like leaks but plateau — always warm up first.
  • GC-deferred frees: objects in reference cycles are not freed until gc runs; call gc.collect() before the second snapshot to avoid mistaking deferred frees for a leak.
  • Too few iterations: a small loop lets a one-time 5 MiB cache outrank a slow leak; loop enough that linear growth dominates.
  • C-extension memory: raw malloc in a native library is invisible; if tracemalloc shows nothing but RSS climbs, reach for memray or valgrind.
  • Per-test leaks vs per-process: a leak that only appears across a pytest session usually means a session-scoped fixture retains state — confirm with the scoping guidance in mastering pytest fixtures.
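The GC-deferred case is easy to reproduce. The sketch below is illustrative only: Node, make_cycle, and live_growth are hypothetical names, and automatic collection is disabled inside the measurement purely so that the deferred frees show up reliably. It compares the diff with and without a gc.collect() before the second snapshot.

```python
import gc
import tracemalloc


class Node:
    def __init__(self):
        self.payload = bytes(1024)
        self.me = self               # self-reference: refcounting alone cannot free it


def make_cycle():
    Node()                           # unreachable immediately, but part of a cycle


def live_growth(collect_before_snapshot):
    """Net bytes from this file still live after 500 calls to make_cycle()."""
    this_file = make_cycle.__code__.co_filename
    gc.disable()                     # only to make the deferred frees reproducible
    try:
        make_cycle()                 # warm-up
        gc.collect()                 # start from a clean baseline
        baseline = tracemalloc.take_snapshot()
        for _ in range(500):
            make_cycle()
        if collect_before_snapshot:
            gc.collect()
        after = tracemalloc.take_snapshot()
    finally:
        gc.enable()
    return sum(
        stat.size_diff
        for stat in after.compare_to(baseline, "lineno")
        if stat.traceback[0].filename == this_file
    )


tracemalloc.start()
without_collect = live_growth(collect_before_snapshot=False)
with_collect = live_growth(collect_before_snapshot=True)
tracemalloc.stop()

print(f"no gc.collect(): {without_collect / 1024:7.1f} KiB still live")
print(f"gc.collect()   : {with_collect / 1024:7.1f} KiB still live")
```

Without the collection, the 500 unreachable cycles are still counted as live and look like roughly half a mebibyte of leaked payloads. With it, the growth should drop to near zero, showing the memory was only waiting for the collector.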

Frequently Asked Questions

How many times should I repeat the operation before the second snapshot?

Repeat enough times that a genuine leak dwarfs one-time warm-up allocations, typically hundreds to thousands of iterations. A leak grows roughly linearly with iterations, while caches and interned objects plateau, which makes the leaking line obvious in the diff.

Why does the first run always show growth even with no leak?

The first iterations allocate caches, compiled regexes, lazily imported modules, and interned strings that never free. Take the baseline snapshot after a warm-up pass so these one-time allocations are excluded from the comparison.

Can tracemalloc find leaks in C extensions?

Only partially. tracemalloc sees allocations routed through Python's allocators, so objects a C extension creates via PyObject_Malloc are visible, but raw malloc outside the Python heap is not. Use memray or valgrind for native leaks.
