Comparing tracemalloc Snapshots to Locate Growth

You have two tracemalloc snapshots taken at different points in a process's life and need to know precisely which lines retained more memory in the second. Printing each snapshot's top statistics side by side is unreadable — you end up eyeballing two long tables and guessing which lines moved. Snapshot.compare_to computes the per-line delta for you, returning StatisticDiff objects that you can sort by size_diff to put the growing lines at the top. This is the diff primitive that underpins memory profiling with tracemalloc: capturing snapshots is easy, but attribution comes from the comparison.

Both snapshots feed compare_to; re-sorting by signed size_diff floats the real leak (compare.py:11, +5000 KiB / +20000 blocks) to the top, while the freed line (cache.py:5) sinks to a negative, reclaimed entry.

Prerequisites

Python 3.4+ for compare_to; the StatisticDiff.size_diff / count_diff fields documented here are stable from 3.6.
Two snapshots taken with the same nframe (the frame depth passed to tracemalloc.start(nframe)), captured around the workload you are investigating.
No third-party packages — tracemalloc is in the standard library.

Solution

second.compare_to(first, key_type) diffs the two snapshots grouped by 'filename', 'lineno', or 'traceback' and returns a list of StatisticDiff. Each entry exposes size (bytes in the newer snapshot), size_diff (byte change), count (blocks in the newer snapshot), and count_diff (block change).

import tracemalloc

tracemalloc.start(25)

before = tracemalloc.take_snapshot()        # baseline

buckets = []
for i in range(20_000):
    buckets.append(bytes(256))              # steady growth: 20k retained blocks

after = tracemalloc.take_snapshot()         # after the workload

# Diff grouped by source line; default ordering is by ABSOLUTE size_diff.
diff = after.compare_to(before, "lineno")

# Re-sort for pure growth so reclaimed lines do not surface at the top.
diff.sort(key=lambda stat: stat.size_diff, reverse=True)

for stat in diff[:5]:
    print(
        f"{stat.size_diff/1024:+9.1f} KiB  "      # byte change (+ = growth)
        f"blocks {stat.count_diff:+7d}  "          # block change
        f"{stat.traceback[0]}"                     # the source line
    )

 +5000.0 KiB  blocks  +20000  compare.py:11
    +1.4 KiB  blocks     +12  compare.py:8

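The compare.py:11 row, the buckets.append(bytes(256)) call, accounts for roughly 5 MiB across about 20,000 blocks. Exact figures vary with the interpreter, because each bytes object carries a small header on top of its 256-byte payload and the list's own storage is attributed to the same line. When the same line is reached from several places, switch the key to 'traceback' so each call path is a separate entry, then format the winner's full stack: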

diff_tb = after.compare_to(before, "traceback")
diff_tb.sort(key=lambda stat: stat.size_diff, reverse=True)
print("\n".join(diff_tb[0].traceback.format()))   # full path to the growing line

To see what was reclaimed instead, sort ascending: the most negative size_diff entries are the lines that freed the most memory between snapshots — useful for confirming that a fix actually released the objects you expected.

for stat in sorted(diff, key=lambda s: s.size_diff)[:3]:
    if stat.size_diff < 0:
        print(f"freed {(-stat.size_diff)/1024:.1f} KiB at {stat.traceback[0]}")

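The grouping key decides what the diff can tell you, and the three keys answer three different questions: 'filename' shows which module grew, 'lineno' shows which line allocated the memory, and 'traceback' shows which call path reached that line.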

Start at filename to narrow the subsystem, then lineno to name the site; traceback needs a deeper nframe and costs more to collect.
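To see how the three keys reshape the same data, the sketch below diffs one pair of snapshots under each key in turn. The leak helper and the sizes are invented for illustration; only compare_to and its key names come from tracemalloc itself.

```python
import tracemalloc

def leak(store, n):
    # Hypothetical workload: retains n small byte strings in `store`.
    for _ in range(n):
        store.append(bytes(128))

tracemalloc.start(10)              # depth > 1 so 'traceback' has a call path to show
before = tracemalloc.take_snapshot()
store = []
leak(store, 5_000)
after = tracemalloc.take_snapshot()
tracemalloc.stop()

top = {}
for key_type in ("filename", "lineno", "traceback"):
    diff = after.compare_to(before, key_type)
    diff.sort(key=lambda s: s.size_diff, reverse=True)
    top[key_type] = diff[0]
    # 'filename' and 'lineno' groups carry one frame; 'traceback' carries up to nframe.
    print(f"{key_type:10} groups={len(diff):4d} "
          f"top={top[key_type].size_diff / 1024:+.1f} KiB "
          f"frames={len(top[key_type].traceback)}")
```

The coarser the key, the fewer groups you get and the more each row absorbs, which is why filename is a reasonable first pass and traceback a last one.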

Why this works

compare_to keys every allocation group in both snapshots and computes size_diff = after.size - before.size per key, so a group present only in the later snapshot shows its full size as growth, and a freed group shows a negative diff. The default sort is by absolute size_diff, which deliberately surfaces the biggest change in either direction; re-sorting by signed size_diff separates growth from reclamation. Matching nframe between the two snapshots is required because the grouping key for 'traceback' is the frame tuple — mismatched depths produce keys that never line up.
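The subtraction described above can be reproduced by hand, which is a useful sanity check when a diff looks surprising. This is a minimal sketch of the idea, not the library's implementation: it groups each snapshot with Snapshot.statistics, subtracts per key, and compares the result with what compare_to reports.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()
kept = [bytes(512) for _ in range(1_000)]   # retained between the snapshots
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Index each snapshot's per-line totals by the grouping key (the traceback).
old = {s.traceback: s.size for s in before.statistics("lineno")}
new = {s.traceback: s.size for s in after.statistics("lineno")}

# A key missing from one side counts as 0 bytes on that side.
manual = {key: new.get(key, 0) - old.get(key, 0)
          for key in old.keys() | new.keys()}

builtin = {s.traceback: s.size_diff
           for s in after.compare_to(before, "lineno")}

print("manual diff matches compare_to:", manual == builtin)
```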

Edge cases and failure modes

Mismatched nframe: comparing a 1-frame snapshot with a 25-frame snapshot under 'traceback' grouping yields meaningless diffs because the keys differ; capture both at the same depth.
Absolute-sort surprise: forgetting that the default sort is by absolute value lets a large reclaimed line outrank a real leak; always re-sort by signed size_diff when hunting growth.
count_diff vs size_diff divergence: growth in size_diff with flat count_diff means objects got bigger, not more numerous — a different bug class than an unbounded collection.
Cumulative parameter: compare_to(old, key, cumulative=True) aggregates over every frame in the traceback rather than just the leaf, which can attribute growth to a high-level caller. It is only valid with the 'filename' and 'lineno' keys and has no effect unless nframe is greater than 1; use it deliberately, not by default.
Snapshots taken too close together capture transient buffers that net to noise; bracket a meaningful unit of work. For the full leak workflow, see Finding Memory Leaks with tracemalloc Snapshots.
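The absolute-sort surprise is easy to reproduce. In the sketch below, a large cache (sizes chosen arbitrarily for the example) is released inside the measured window while a smaller list keeps growing; the default ordering leads with the reclaimed line, and the signed ordering leads with the growth.

```python
import tracemalloc

tracemalloc.start()
cache = [bytes(1024) for _ in range(4_000)]   # roughly 4 MiB held before the baseline
before = tracemalloc.take_snapshot()

leak = [bytes(512) for _ in range(2_000)]     # roughly 1 MiB of new, retained growth
del cache                                     # large reclaim inside the window
after = tracemalloc.take_snapshot()
tracemalloc.stop()

diff = after.compare_to(before, "lineno")
default_top = diff[0]                               # largest absolute size_diff
growth_top = max(diff, key=lambda s: s.size_diff)   # largest signed growth

print(f"default order leads with: {default_top.size_diff / 1024:+.1f} KiB")
print(f"signed order leads with:  {growth_top.size_diff / 1024:+.1f} KiB")
```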

Reading a diff without being misled

A compare_to table is easy to misread in three specific ways, and each one sends an investigation in the wrong direction.

Size versus count. Each row carries both size_diff and count_diff. A row with a large size and a count of one is a single big object — a loaded file, a dataframe, a decompressed payload — and is usually intentional. A row with a modest size and a count in the thousands is the leak signature: many small objects that were never released. Sorting by size alone hides the second case behind the first.

import gc
import tracemalloc

tracemalloc.start(5)
baseline = tracemalloc.take_snapshot()
run_workload(cycles=200)                    # placeholder: the operation under test
gc.collect()
current = tracemalloc.take_snapshot()

# Sort by count to surface many-small-objects leaks that size sorting hides.
for stat in sorted(current.compare_to(baseline, "lineno"),
                   key=lambda s: s.count_diff, reverse=True)[:10]:
    print(f"{stat.count_diff:+7d} objects  {stat.size_diff / 1024:+9.1f} KiB  {stat}")

Noise from the measurement itself. Snapshots allocate, the comparison allocates, and the filters you apply allocate. tracemalloc does not trace its internal C-level bookkeeping, but Python-level allocations made inside tracemalloc.py and in your monitoring code do appear in the diff. Filter them out explicitly rather than mentally, so the output stays readable:

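filters = (
    tracemalloc.Filter(False, tracemalloc.__file__),        # the module itself
    tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
    tracemalloc.Filter(False, __file__),                    # this monitoring script
)
# Filter both sides; filtering only one makes excluded lines show up as changes.
baseline = baseline.filter_traces(filters)
current = current.filter_traces(filters)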

Import-time growth counted as leakage. Anything imported lazily during the measured window allocates module globals, compiled code objects and type objects that will never be freed. That is not a leak, but it dominates a first diff. Warm the code path once before taking the baseline — call the operation a handful of times, then snapshot — and the import cost disappears from the comparison entirely.
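One way to make the warm-up habit automatic is to fold it into the measuring function. The helper below is a hypothetical sketch: diff_after_warmup, its defaults, and the handle_request stand-in are all invented here, and the lazy json import merely plays the role of a one-time cost.

```python
import gc
import tracemalloc

def diff_after_warmup(operation, warmup=5, cycles=200, key_type="lineno"):
    """Hypothetical helper: warm the path, then diff a fixed number of cycles."""
    for _ in range(warmup):        # triggers lazy imports and one-time caches
        operation()
    gc.collect()
    baseline = tracemalloc.take_snapshot()
    for _ in range(cycles):
        operation()
    gc.collect()                   # drop collectable garbage before measuring
    current = tracemalloc.take_snapshot()
    diff = current.compare_to(baseline, key_type)
    diff.sort(key=lambda s: s.size_diff, reverse=True)
    return diff

def handle_request():
    # Stand-in for the operation under test; the lazy import costs memory once.
    import json
    return json.dumps({"ok": True})

tracemalloc.start()
result = diff_after_warmup(handle_request)
tracemalloc.stop()

for stat in result[:3]:
    print(f"{stat.size_diff / 1024:+8.1f} KiB  {stat.traceback[0]}")
```

Because the import happens during warm-up, its allocations are already present in the baseline and cancel out of the comparison. The baseline snapshot object itself may still show up as a row from tracemalloc.py unless you filter it.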

Two habits make diffs comparable over time. Take the same number of cycles between every pair of snapshots, so growth per cycle is directly comparable across runs, and record the cycle count alongside the diff. And keep the raw snapshots (snapshot.dump(path)) for anything you plan to revisit: a saved snapshot can be re-compared with a different grouping key or a different filter later, whereas a printed table can only be re-read.
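Saving and reloading takes only a few lines. The sketch below dumps two snapshots, loads them back with Snapshot.load, and regroups the same data under two keys; the file names and the temporary directory are arbitrary choices for the example. Snapshot files are pickles, so only load ones you produced yourself.

```python
import os
import tempfile
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()
held = [bytes(300) for _ in range(1_000)]
current = tracemalloc.take_snapshot()
tracemalloc.stop()

with tempfile.TemporaryDirectory() as tmp:
    base_path = os.path.join(tmp, "baseline.snap")   # arbitrary file names
    curr_path = os.path.join(tmp, "current.snap")
    baseline.dump(base_path)
    current.dump(curr_path)

    # Later, possibly in another session: reload without rerunning the workload.
    old = tracemalloc.Snapshot.load(base_path)
    new = tracemalloc.Snapshot.load(curr_path)

# The same saved data, regrouped two ways.
by_file = new.compare_to(old, "filename")
by_line = new.compare_to(old, "lineno")
print(f"{len(by_file)} file groups, {len(by_line)} line groups")
```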

Finally, confirm the fix the same way you found the bug. After the change, run the identical cycle count and compare against a saved baseline from before: the row that dominated the previous diff should be absent, and the total growth should be flat rather than merely smaller. A leak that grew by 40 MB and now grows by 4 MB is still a leak with a longer fuse.
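A per-cycle number makes "flat" checkable rather than a matter of impression. The following is a rough sketch under invented names: net_growth, leaky, and fixed are hypothetical stand-ins for your measuring helper and for the code before and after the change.

```python
import tracemalloc

# Exclude tracemalloc's own Python-level allocations from both sides.
IGNORE = (tracemalloc.Filter(False, tracemalloc.__file__),)

def net_growth(operation, cycles):
    """Hypothetical helper: net bytes retained across `cycles` calls."""
    before = tracemalloc.take_snapshot().filter_traces(IGNORE)
    for _ in range(cycles):
        operation()
    after = tracemalloc.take_snapshot().filter_traces(IGNORE)
    return sum(s.size_diff for s in after.compare_to(before, "lineno"))

registry = []

def leaky():
    registry.append(bytes(1024))    # stand-in for the code before the fix

def fixed():
    bytes(1024)                     # stand-in for the code after the fix

tracemalloc.start()
leaky_bytes = net_growth(leaky, 500)
fixed_bytes = net_growth(fixed, 500)
tracemalloc.stop()

print(f"before fix: {leaky_bytes / 500:.0f} bytes retained per cycle")
print(f"after fix:  {fixed_bytes / 500:.0f} bytes retained per cycle")
```

A fixed path should report a per-cycle figure near zero that stays near zero when you double the cycle count; a figure that scales with cycles is still a leak.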

Warming the path and filtering the monitoring code removes roughly nine tenths of a first diff, leaving the growth that matters.

Keep the comparison script in the repository next to the code it measures, rather than reconstructing it during each incident. A twenty-line module with a fixed cycle count, a fixed filter list and a saved baseline turns a memory investigation into one command, and it means two engineers measuring the same service six months apart produce numbers that can actually be compared.

Frequently Asked Questions

What does Snapshot.compare_to return? It returns a list of StatisticDiff objects, one per group, each carrying size, size_diff, count, and count_diff. The list is sorted by absolute size_diff descending by default, so the lines with the largest memory change appear first.

How do I sort the diff by growth instead of absolute change? compare_to sorts by absolute size_diff, which mixes growth and shrinkage at the top. To rank pure growth, re-sort the returned list with key=lambda s: s.size_diff, reverse=True so the largest positive deltas lead.

Why are some size_diff values negative? A negative size_diff means that line retained less memory in the second snapshot than the first, because objects were freed between the two captures. Positive values are growth; negatives are reclamation.

Can I compare snapshots taken in different processes? No. A snapshot holds frame and filename data tied to the process that produced it, and compare_to assumes both sides describe the same run. To compare across processes (a canary against a control, for example), dump both snapshots, load each separately, and compare the per-line totals yourself as ordinary data. The statistics are comparable even though the snapshot objects are not.

Finding Memory Leaks with tracemalloc Snapshots — the full leak-hunting loop that repeats this diff across many iterations to separate real growth from warm-up noise.
Memory Profiling with tracemalloc — capturing, filtering, and grouping snapshots, plus the nframe and start() setup this page depends on.
Interpreting cProfile: Cumulative vs Total Time — the CPU analogue of self vs cumulative attribution, where a leaf-vs-caller distinction mirrors compare_to's cumulative flag.
Systematic Debugging & Performance Profiling — where snapshot diffing fits in the broader profiling and diagnosis workflow.
