Comparing tracemalloc Snapshots to Locate Growth

You have two tracemalloc snapshots taken at different points in a process's life and need to know precisely which lines retained more memory in the second. Printing each snapshot's top statistics side by side is unreadable. Snapshot.compare_to computes the per-line delta for you, returning StatisticDiff objects you can sort by size_diff to put the growing lines at the top.

Prerequisites

  • Python 3.4 or later. tracemalloc, including Snapshot.compare_to and the StatisticDiff fields size, size_diff, count, and count_diff, was added to the standard library in 3.4.
  • Two snapshots taken with the same nframe, captured as described in Memory Profiling with tracemalloc.
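
The second prerequisite can be checked in code before any diffing. The sketch below relies on the Snapshot.traceback_limit attribute, which records the frame depth in effect when the snapshot was taken; the helper name assert_comparable is hypothetical and the workload is a stand-in for your own.

```python
import tracemalloc


def assert_comparable(old, new):
    """Raise if two snapshots were captured with different frame depths."""
    if old.traceback_limit != new.traceback_limit:
        raise ValueError(
            f"nframe mismatch: {old.traceback_limit} vs {new.traceback_limit}"
        )


tracemalloc.start(25)                          # nframe=25 applies to both snapshots
first = tracemalloc.take_snapshot()
payload = [bytes(64) for _ in range(1_000)]    # stand-in for real work
second = tracemalloc.take_snapshot()
tracemalloc.stop()

assert_comparable(first, second)               # passes: both were taken at depth 25
```

A check like this is most useful when one snapshot was written with Snapshot.dump in an earlier run and read back with Snapshot.load, since nothing else ties the two captures to the same settings.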

Solution

second.compare_to(first, key_type) diffs the two snapshots grouped by 'filename', 'lineno', or 'traceback' and returns a list of StatisticDiff objects. Each entry exposes size (bytes in the newer snapshot), size_diff (byte change), count (blocks in the newer snapshot), and count_diff (block change).

Python
import tracemalloc

tracemalloc.start(25)

before = tracemalloc.take_snapshot()        # baseline

buckets = []
for i in range(20_000):
    buckets.append(bytes(256))              # steady growth: 20k retained blocks

after = tracemalloc.take_snapshot()         # after the workload

# Diff grouped by source line; default ordering is by ABSOLUTE size_diff.
diff = after.compare_to(before, "lineno")

# Re-sort for pure growth so reclaimed lines do not surface at the top.
diff.sort(key=lambda stat: stat.size_diff, reverse=True)

for stat in diff[:5]:
    print(
        f"{stat.size_diff/1024:+9.1f} KiB  "      # byte change (+ = growth)
        f"blocks {stat.count_diff:+7d}  "          # block change
        f"{stat.traceback[0]}"                     # the source line
    )
Plain text
 +5000.0 KiB  blocks  +20000  compare.py:11
    +1.4 KiB  blocks     +12  compare.py:8

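The top entry is the buckets.append(bytes(256)) call, which retained about 20,000 new blocks. Treat the figures as illustrative: 5,000 KiB is the raw payload (20,000 × 256 bytes), and a real run on 64-bit CPython reports somewhat more, because each bytes object carries roughly 33 bytes of overhead and the list's own pointer array is charged to the same line. The reported line number is wherever the call sits in your file, which is line 9 in the listing as printed here. When the same line is reached from several places, switch the key to 'traceback' so each call path is a separate entry, then format the winner's full stack: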

Python
diff_tb = after.compare_to(before, "traceback")
diff_tb.sort(key=lambda stat: stat.size_diff, reverse=True)
print("\n".join(diff_tb[0].traceback.format()))   # full path to the growing line

To see what was reclaimed instead, sort ascending: the most negative size_diff entries are the lines that freed the most memory between snapshots — useful for confirming that a fix actually released the objects you expected.

Python
for stat in sorted(diff, key=lambda s: s.size_diff)[:3]:
    if stat.size_diff < 0:
        print(f"freed {(-stat.size_diff)/1024:.1f} KiB at {stat.traceback[0]}")

Why this works

compare_to keys every allocation group in both snapshots and computes size_diff = after.size - before.size per key, so a group present only in the later snapshot shows its full size as growth, and a group that was freed entirely shows a negative diff equal to its old size. The default sort is by absolute size_diff, which deliberately surfaces the biggest change in either direction; re-sorting by signed size_diff separates growth from reclamation. Matching nframe between the two snapshots matters because the grouping key for 'traceback' is the tuple of frames, and snapshots captured at different depths produce keys that never line up. Under 'lineno' and 'filename' without cumulative mode, only the most recent frame forms the key.
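
To make the subtraction concrete, here is a minimal sketch that rebuilds size_diff by hand from each snapshot's plain statistics and checks it against compare_to. The helper manual_size_diffs is hypothetical and exists only for illustration; it is not how you would normally diff snapshots.

```python
import tracemalloc


def manual_size_diffs(old, new, key_type="lineno"):
    """Recompute size_diff per group from two snapshots' plain statistics."""
    old_sizes = {s.traceback: s.size for s in old.statistics(key_type)}
    new_sizes = {s.traceback: s.size for s in new.statistics(key_type)}
    # A key missing from one side counts as 0 bytes on that side.
    keys = old_sizes.keys() | new_sizes.keys()
    return {k: new_sizes.get(k, 0) - old_sizes.get(k, 0) for k in keys}


tracemalloc.start(25)
before = tracemalloc.take_snapshot()
kept = [bytes(256) for _ in range(5_000)]      # retained between the two captures
after = tracemalloc.take_snapshot()
tracemalloc.stop()

by_hand = manual_size_diffs(before, after)
from_api = {d.traceback: d.size_diff for d in after.compare_to(before, "lineno")}
print("matches compare_to:", by_hand == from_api)
```

The dictionary keys are Traceback objects, which are hashable and compare equal when their frames match; that is the same property that lets compare_to pair up groups across snapshots.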

Edge cases and failure modes

  • Mismatched nframe: comparing a 1-frame snapshot with a 25-frame snapshot under 'traceback' grouping yields meaningless diffs because the keys differ; capture both at the same depth.
  • Absolute-sort surprise: forgetting that the default sort is by absolute value lets a large reclaimed line outrank a real leak; always re-sort by signed size_diff when hunting growth.
  • count_diff vs size_diff divergence: growth in size_diff with flat count_diff means objects got bigger, not more numerous — a different bug class than an unbounded collection.
  • Cumulative parameter: compare_to(old, key, cumulative=True) aggregates over every frame in the traceback rather than just the leaf, which can attribute growth to a high-level caller; use it deliberately, not by default. It is only accepted with the 'filename' and 'lineno' key types; combining it with 'traceback' raises ValueError.
  • Snapshots taken too close together capture transient buffers that net to noise; bracket a meaningful unit of work. For the full leak workflow, see the guide on finding memory leaks with tracemalloc snapshots.
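
Two of the cases above can be sketched in a few lines: reading size_diff together with count_diff, and the key-type restriction on cumulative mode. The classify helper and its labels are hypothetical and deliberately crude; the workload is a stand-in.

```python
import tracemalloc


def classify(stat):
    """Rough label for a StatisticDiff based on the signs of its two deltas."""
    if stat.size_diff > 0 and stat.count_diff > 0:
        return "more blocks"       # e.g. a collection that keeps growing
    if stat.size_diff > 0:
        return "bigger blocks"     # more bytes without more blocks
    return "no growth"


tracemalloc.start(25)
before = tracemalloc.take_snapshot()
cache = [bytearray(128) for _ in range(2_000)]   # retained growth
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# cumulative=True charges each block to every frame of its traceback,
# so callers accumulate the bytes allocated by their callees.
cumulative = after.compare_to(before, "lineno", cumulative=True)
cumulative.sort(key=lambda s: s.size_diff, reverse=True)
for stat in cumulative[:3]:
    print(f"{stat.size_diff:+d} B  {classify(stat)}  {stat.traceback[0]}")

# Cumulative mode is limited to the 'filename' and 'lineno' key types.
rejected = False
try:
    after.compare_to(before, "traceback", cumulative=True)
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```

In a real investigation you would likely add minimum thresholds to classify, since small positive deltas from interpreter bookkeeping appear in almost every diff.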

Frequently Asked Questions

What does Snapshot.compare_to return? It returns a list of StatisticDiff objects, one per group, each carrying size, size_diff, count, and count_diff. The list is sorted by absolute size_diff descending by default, so the lines with the largest memory change appear first.

How do I sort the diff by growth instead of absolute change? compare_to sorts by absolute size_diff, which mixes growth and shrinkage at the top. To rank pure growth, re-sort the returned list with key=lambda s: s.size_diff, reverse=True so the largest positive deltas lead.

Why are some size_diff values negative? A negative size_diff means that line retained less memory in the second snapshot than the first, because objects were freed between the two captures. Positive values are growth; negatives are reclamation.
