Mocking Network and HTTP Calls

A unit test that reaches a live HTTP endpoint is not a unit test; it is a flaky integration test wearing a disguise. It fails when the network blips, when the upstream rate-limits CI, or when a colleague's branch mutates shared staging data — and it fails non-deterministically, which is the most expensive kind of failure to debug. The symptom is a test that is green on your laptop and red on the third parallel CI shard. The fix is to intercept outbound calls at a well-chosen layer and replay deterministic responses, while still driving your real client so that retries, header handling, and JSON decoding execute exactly as they will in production. This guide maps the full spectrum from naive monkeypatch through the responses and respx interception libraries to a real fake server, and shows where each one is the right tool.

Prerequisites

python >= 3.9.
pytest >= 8.0.
HTTP clients: requests >= 2.31 and/or httpx >= 0.27.
Interception libraries, installed as needed: responses >= 0.25, respx >= 0.21, requests-mock >= 1.12.
Socket guards and fake servers: pytest-socket >= 0.7, pytest-httpserver >= 1.1.

pip install "pytest>=8.0" "requests>=2.31" "httpx>=0.27" \
            "responses>=0.25" "respx>=0.21" "requests-mock>=1.12" \
            "pytest-socket>=0.7" "pytest-httpserver>=1.1"

The interception techniques here build on the namespace rules in Patching Strategies for Complex Codebases and the spec discipline from Autospec & Strict Mocking — both matter the moment you stop hand-rolling fake responses.

Core concept

Every HTTP call descends through a stack: your application function calls a high-level client method (requests.get, httpx.Client.get), which builds a Request, hands it to a transport adapter (requests' HTTPAdapter over urllib3, httpx's HTTPTransport), which finally opens a socket. You can substitute a double at any layer, and the layer you choose determines both how realistic the test is and how brittle it is.

The interception layer is a fidelity-versus-brittleness trade-off: a fake server exercises the real transport, transport stubs keep your client code real while replaying canned bytes, and a hand-rolled client mock is fast but bypasses status, headers, and decoding.

The pattern to internalize: intercept as low in the stack as you can afford to, because every layer you skip is a layer of production behaviour the test no longer covers. A hand-rolled MagicMock that returns {"id": 1} never exercises status-code checks or raise_for_status(); a transport stub does.
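
To make the difference concrete, here is a minimal sketch. The load_user function is a hypothetical consumer, and the 500 response is built by hand by setting attributes on requests.Response (including the private _content), purely for illustration.

```python
import json
from unittest.mock import MagicMock

import requests


def make_response(payload: dict, status: int) -> requests.Response:
    # Hand-built but genuine Response, so the real status logic runs.
    resp = requests.Response()
    resp.status_code = status
    resp._content = json.dumps(payload).encode("utf-8")
    return resp


def load_user(resp):
    """Hypothetical consumer: validate the status, then decode the body."""
    resp.raise_for_status()
    return resp.json()


# A MagicMock auto-creates raise_for_status() as a no-op, so the "500"
# is never noticed and the fabricated payload comes straight back.
loose = MagicMock()
loose.status_code = 500
loose.json.return_value = {"id": 1}
loose_result = load_user(loose)

# A real Response with the same status raises, as it would in production.
try:
    load_user(make_response({"error": "boom"}, 500))
    strict_result = "no error"
except requests.exceptions.HTTPError:
    strict_result = "HTTPError"
```

The mock-based call returns data for a failed request, while the real Response object surfaces the failure.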

Map the layer to the tool before you write a line of test code. requests and httpx are separate stacks with separate interception points, and a stub built for one is invisible to the other:

Interception point	`requests`	`httpx`	What still runs for real
Socket	`pytest-socket` (block)	`pytest-socket` (block)	everything above the socket, if allowed
Transport adapter	`responses`, `requests-mock`	`respx`, `MockTransport`	session, retries, redirects, headers, JSON decoding
Client method	`monkeypatch requests.get`	`monkeypatch httpx.Client.get`	only your call site
Return value	`MagicMock`	`MagicMock`	nothing — you fabricate the object

The transport-adapter row is the default because it keeps the largest amount of real client code executing while never opening a socket. The choice of double is not about ergonomics; it is about how much of production you are willing to leave untested. Because a double replaces one specific layer, mixing them (a responses registration under an httpx call, say) produces a stub that silently never matches. Socket-blocking (step 1) is what catches that mistake.

Step-by-step implementation

1. Block real sockets so mistakes fail loudly

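Before adding a single stub, make unmocked calls impossible. pytest-socket disables socket creation suite-wide and raises SocketBlockedError on any attempt; when an allow list is configured, as below, a connection to any host outside it raises SocketConnectBlockedError instead.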

# pyproject.toml
[tool.pytest.ini_options]
addopts = "--disable-socket --allow-hosts=127.0.0.1,::1"

With this in place a test that forgets to register a stub fails immediately with a clear message instead of hanging on a DNS timeout or — worse — quietly mutating a real service. The --allow-hosts whitelist keeps loopback open so a local fake server still works.
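
The mechanism behind such a guard can be sketched with the standard library alone. This is a simplified illustration, not pytest-socket's actual implementation; block_sockets and NetworkBlockedError are hypothetical names invented for the sketch.

```python
import socket
from contextlib import contextmanager


class NetworkBlockedError(RuntimeError):
    """Hypothetical error type for this sketch (pytest-socket defines its own)."""


@contextmanager
def block_sockets():
    original = socket.socket

    def guarded(*args, **kwargs):
        raise NetworkBlockedError("socket creation is blocked in tests")

    socket.socket = guarded          # every new socket now fails loudly
    try:
        yield
    finally:
        socket.socket = original     # always restore the real class


def try_open_socket() -> str:
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    except NetworkBlockedError:
        return "blocked"
    sock.close()
    return "opened"


with block_sockets():
    inside = try_open_socket()
outside = try_open_socket()
```

Inside the guard the attempt fails at once instead of waiting on DNS or a connect timeout, which is the property the plugin provides suite-wide.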

2. The naive baseline: monkeypatch the client method

The cheapest double replaces the client function and returns a real Response. Build a genuine requests.Response so status_code, .json(), and raise_for_status() all behave.

import json
import requests
from myapp.client import fetch_user  # calls requests.get(...).json()

def _make_response(payload: dict, status: int = 200) -> requests.Response:
    resp = requests.Response()
    resp.status_code = status
    # _content must be bytes; .json() decodes it through the real machinery.
    resp._content = json.dumps(payload).encode("utf-8")
    resp.headers["Content-Type"] = "application/json"
    return resp

def test_fetch_user_monkeypatch(monkeypatch):
    captured = {}
    def fake_get(url, **kwargs):
        captured["url"] = url            # capture so we can assert the call
        return _make_response({"id": 7, "name": "Ada"})
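    # myapp.client does `import requests`, so this target is the shared
    # requests.get attribute. With `from requests import get` the target
    # would be "myapp.client.get" instead.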
    monkeypatch.setattr("myapp.client.requests.get", fake_get)

    user = fetch_user(7)

    assert user["name"] == "Ada"
    assert captured["url"].endswith("/users/7")

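The hard part is the target: patch the name the code under test resolves at call time. Because myapp.client does import requests and calls requests.get, the string myapp.client.requests.get reaches the very same attribute as requests.get, so either target works here. The rule bites when the module does from requests import get: that binds a separate name, patching requests.get leaves it untouched, and the correct target becomes myapp.client.get. That namespace rule is the single most common cause of a green test over live code. The pytest monkeypatch fixture auto-reverts on teardown, so no manual cleanup is needed. This approach stops scaling the moment you need URL matching, multiple sequenced responses, or call assertions; that is where dedicated libraries earn their place.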
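
The namespace rule is easiest to see when a module binds its own name with a from-import. The sketch below builds two throwaway modules in memory; fakehttp (standing in for an HTTP library) and consumer (standing in for the code under test) are hypothetical names used only for this demonstration.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for an HTTP library.
fakehttp = types.ModuleType("fakehttp")
fakehttp.get = lambda url: "REAL " + url
sys.modules["fakehttp"] = fakehttp

# Hypothetical consumer that binds its own `get` name at import time.
consumer = types.ModuleType("consumer")
exec(
    "from fakehttp import get\n"
    "\n"
    "def fetch(url):\n"
    "    return get(url)\n",
    consumer.__dict__,
)
sys.modules["consumer"] = consumer

# Patching the defining module does not touch consumer's own binding.
with mock.patch("fakehttp.get", lambda url: "FAKE"):
    wrong_target = consumer.fetch("/users/7")

# Patching the name the call site resolves takes effect.
with mock.patch("consumer.get", lambda url: "FAKE"):
    right_target = consumer.fetch("/users/7")
```

With the first target the real function still runs, which in a real suite means a live request behind a green test.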

3. The default tool for requests: the responses library

responses patches requests at the HTTPAdapter/urllib3 boundary, so your real Session, retry config, and JSON decoding all run. You declare expectations; it replays them and records calls.

import responses
import requests
from myapp.client import fetch_user

@responses.activate
def test_fetch_user_with_responses():
    responses.add(
        responses.GET,
        "https://api.example.com/users/7",
        json={"id": 7, "name": "Ada"},   # serialized + Content-Type set for you
        status=200,
    )
    user = fetch_user(7)
    assert user["name"] == "Ada"
    # Records every intercepted call for assertions.
    assert len(responses.calls) == 1
    assert responses.calls[0].request.headers["Accept"] == "application/json"

Where hand-rolled patches collapse is precise request matching. A stub that fires on any call to a host cannot prove your code sent the right query string, headers, or JSON body. responses ships matchers for exactly this:

import responses
from responses import matchers
from myapp.client import search_users

@responses.activate
def test_search_sends_correct_query():
    responses.add(
        responses.GET,
        "https://api.example.com/users",
        json={"results": []},
        match=[
            matchers.query_param_matcher({"q": "ada", "limit": "10"}),
            matchers.header_matcher({"Authorization": "Bearer test-token"}),
        ],
    )
    search_users("ada", limit=10)
    # If the code sends a different query or drops the auth header, no
    # registration matches and the request raises ConnectionError — the
    # test fails on the mismatch instead of passing on a loose stub.

Sequenced responses (for retry/backoff testing) are just repeated add calls to the same URL — the first matching unconsumed registration fires, in order, so a 500 followed by a 200 models a transient failure that a retrying client should recover from:

import responses
from myapp.client import fetch_user

@responses.activate
def test_client_retries_on_500():
    url = "https://api.example.com/users/7"
    responses.add(responses.GET, url, status=500)          # first attempt fails
    responses.add(responses.GET, url, json={"id": 7}, status=200)  # retry succeeds
    user = fetch_user(7)                                    # client must retry
    assert user["id"] == 7
    assert len(responses.calls) == 2                        # proves a retry happened

For responses that depend on the request (echo an id back, increment a counter), register a callback with responses.add_callback that receives the PreparedRequest and returns a (status, headers, body) tuple. The full registry API, matchers, callbacks, and assert_all_requests_are_fired are covered in Mocking requests with the responses Library.

4. The httpx equivalent: respx

responses does not see httpx traffic. respx intercepts at httpx's transport layer and covers both sync httpx.Client and async httpx.AsyncClient with one API.

import httpx
import respx
import pytest

@respx.mock
def test_sync_httpx():
    route = respx.get("https://api.example.com/users/7").mock(
        return_value=httpx.Response(200, json={"id": 7, "name": "Ada"})
    )
    with httpx.Client() as client:
        resp = client.get("https://api.example.com/users/7")
    assert resp.json()["name"] == "Ada"
    assert route.called

@pytest.mark.asyncio   # needs the pytest-asyncio plugin, which the prerequisites above do not list
@respx.mock
async def test_async_httpx():
    respx.get("https://api.example.com/ping").mock(
        return_value=httpx.Response(204)
    )
    async with httpx.AsyncClient() as client:
        resp = await client.get("https://api.example.com/ping")
    assert resp.status_code == 204

Because respx returns real httpx.Response objects, streaming, raise_for_status(), and content decoding behave identically to production. respx also matches on path patterns and query parameters, and side_effect drives sequenced or request-dependent responses — the httpx analogue of the responses retry example above:

import httpx
import respx
from myapp.client import fetch_user_httpx  # assumed httpx-based counterpart of fetch_user

@respx.mock
def test_httpx_retry_sequence():
    route = respx.get("https://api.example.com/users/7").mock(
        side_effect=[
            httpx.Response(503),                       # first call: unavailable
            httpx.Response(200, json={"id": 7}),       # retry: success
        ]
    )
    user = fetch_user_httpx(7)
    assert user["id"] == 7
    assert route.call_count == 2

A callable side_effect receives the httpx.Request, so you can assert on and echo the outbound body. When you deliberately want one host to reach the network while stubbing the rest, respx.route(...).pass_through() lets that traffic escape the mock. For async tests, mind fixture and loop scoping as described in scoping pytest fixtures for async tests; a route registered in a fixture on the wrong loop scope silently stops intercepting, and the awaitable-response mechanics behind async doubles are unpacked in the deep dive into unittest.mock.

5. The fixture-driven alternative: requests-mock

requests-mock covers the same ground as responses for the requests library but ships a first-class pytest fixture, which suits suites that prefer dependency injection over decorators.

def test_with_requests_mock(requests_mock):  # fixture from the requests-mock plugin
    requests_mock.get(
        "https://api.example.com/users/7",
        json={"id": 7, "name": "Ada"},
    )
    import requests
    resp = requests.get("https://api.example.com/users/7")
    assert resp.json()["id"] == 7
    assert requests_mock.call_count == 1

The choice between responses and requests-mock is largely stylistic: decorator/context-manager versus fixture. Pick one per codebase to avoid two overlapping registries fighting over the same adapter.

6. When to escalate to a fake server

Stubs replace the transport, so they cannot test the transport. When your code owns connection pooling, redirects, chunked streaming, or TLS behaviour — or when you want to assert the exact bytes on the wire — run a real local server.

def test_against_fake_server(httpserver):  # pytest-httpserver fixture
    httpserver.expect_request("/users/7").respond_with_json(
        {"id": 7, "name": "Ada"}
    )
    import requests
    # A REAL socket connects to 127.0.0.1 — keep loopback allowed in pytest-socket.
    resp = requests.get(httpserver.url_for("/users/7"))
    assert resp.json()["name"] == "Ada"

This exercises the full networking stack against 127.0.0.1, so it catches transport bugs a stub never would, at the cost of being slower and requiring the loopback allowance from step 1.

7. Inject timeouts and connection errors, not just status codes

The responses your resilience code most needs to survive are the ones a happy-path stub never produces: a socket that hangs until the client's timeout fires, a connection reset mid-transfer, a DNS failure. Both libraries let you raise a transport exception instead of returning a body, which is the only honest way to test retry/backoff and circuit-breaker logic.

import responses
import requests
from myapp.client import fetch_user

@responses.activate
def test_retries_then_gives_up_on_timeout():
    url = "https://api.example.com/users/7"
    # A registration whose body is an exception raises it at the transport
    # layer, exactly as urllib3 would on a real timeout.
    responses.add(responses.GET, url, body=requests.exceptions.ConnectTimeout())
    responses.add(responses.GET, url, body=requests.exceptions.ConnectTimeout())
    responses.add(responses.GET, url, json={"id": 7}, status=200)

    user = fetch_user(7)          # code should retry twice, then succeed
    assert user["id"] == 7
    assert len(responses.calls) == 3

Each responses.add registration is consumed once, first-registered first: the two ConnectTimeout bodies raise at the transport layer exactly as urllib3 would, the client backs off and retries after each, and the third registration returns the 200. Asserting len(responses.calls) == 3 proves both retries actually happened.

The respx equivalent passes an exception instance as the side_effect: respx.get(url).mock(side_effect=httpx.ConnectTimeout("timed out")). Inject the exact exception type production would raise and check that your handler catches it: httpx.ConnectTimeout derives from httpx.TimeoutException, not httpx.ConnectError, so a handler that catches only httpx.ConnectError lets a connect timeout escape, which is precisely the bug these tests are designed to expose. Note that a true wall-clock timeout (the client waiting N seconds) is a transport concern; if you must verify the client's timeout value is honoured, that belongs against the fake server from step 6, not a stub.
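
A wall-clock timeout check against a local server can be sketched with the standard library alone. The example uses urllib and http.server to stay dependency-free; SlowHandler and the one-second delay are assumptions for illustration, and the same server URL could be exercised with requests or httpx, or replaced by pytest-httpserver.

```python
import socket
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(1.0)                      # longer than the client timeout
        body = b'{"id": 7}'
        try:
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass                             # the client already gave up

    def log_message(self, *args):            # keep test output quiet
        pass


server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/users/7"

started = time.monotonic()
try:
    urllib.request.urlopen(url, timeout=0.2)
    outcome = "completed"
except (socket.timeout, urllib.error.URLError):
    outcome = "timed out"
elapsed = time.monotonic() - started

server.shutdown()
server.server_close()
```

The client gives up after roughly its own timeout rather than waiting for the slow handler, which is the behaviour a stub cannot demonstrate.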

8. Record and replay when the payloads are too large to hand-write

When a response body is a 400-line JSON document from a real upstream, hand-registering it is tedious and drifts out of date. Record-and-replay tools such as vcrpy (and its pytest wrapper pytest-recording) capture real traffic once into a YAML "cassette", then replay it deterministically on every subsequent run with no network access.

import pytest
import requests

@pytest.mark.vcr()   # replays the test's YAML cassette; record it once first (pytest-recording: pytest --record-mode=once)
def test_fetch_large_catalog():
    resp = requests.get("https://api.example.com/catalog")
    assert len(resp.json()["items"]) == 128

Treat cassettes as fixtures you review, not magic: scrub Authorization headers and secrets with filter_headers before committing, set record_mode="none" in CI so a missing cassette fails instead of silently re-recording against production, and re-record deliberately when the contract changes. Recording captures the wire format faithfully but couples your suite to a snapshot; reach for it when fidelity of a large payload matters more than the explicitness of a hand-written stub.
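
One way to apply the scrubbing advice with pytest-recording is a vcr_config fixture placed in conftest.py. This is a sketch under assumptions: the header list is a starting point rather than a complete inventory, and api_key is a hypothetical name for a secret-bearing query parameter.

```python
# conftest.py
import pytest


@pytest.fixture(scope="module")
def vcr_config():
    # Keyword arguments passed through to vcrpy when a cassette is opened.
    return {
        # Dropped from recorded requests before the cassette is written.
        "filter_headers": ["authorization", "cookie"],
        # Hypothetical parameter name; list whatever carries secrets in your API.
        "filter_query_parameters": ["api_key"],
    }
```

Reviewing the resulting YAML before the first commit is still worthwhile, since response bodies are not covered by these two filters.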

Verification

Confirm the suite is genuinely isolated and your stubs are tight:

Run pytest --disable-socket and watch for SocketBlockedError; a clean pass proves no test reaches the network.
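In responses, turn on assert_all_requests_are_fired so a registered-but-unused stub fails the test rather than rotting silently. It defaults to True for responses.RequestsMock() used as a context manager; the @responses.activate decorator leaves it off unless you pass assert_all_requests_are_fired=True.
In respx, assert route.called and route.call_count on each route. To catch dead routes, enable assert_all_called by configuring the router, for example @respx.mock(assert_all_called=True); the bare @respx.mock decorator does not enforce it.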
Diff len(responses.calls) (or requests_mock.call_count) against the expected number to detect accidental retries or duplicate requests.
Run the suite both in fixed order (pytest -p no:randomly) and with pytest-randomly enabled to confirm no stub leaks across tests.

Troubleshooting

Symptom	Root cause	Fix
Test passes but production hits the network	The module under test bound its own name (`from requests import get`), so patching `requests.get` never touched it	Patch the name the call site resolves, e.g. `mypkg.module.get`
`ConnectionError` / stub never matches	URL, method, or query string differs (trailing slash, `?` params)	Match the exact URL or add a query-string matcher; print `responses.calls` to see the real request
`responses` ignores httpx traffic	`responses` only patches `requests`/`urllib3`	Use `respx` (or httpx `MockTransport`) for httpx clients
`SocketBlockedError` on a legitimate local server	`pytest-socket` blocks loopback too	Add `--allow-hosts=127.0.0.1,::1` for fake-server tests
Stub fires for the wrong test	Decorator/fixture scope leaked the registry	Use per-test `@responses.activate` / `@respx.mock`; never register at module import time
`AssertionError: Not all requests have been executed`	A registered response was never requested	Remove the dead stub or assert the code path that should consume it
Retry test never retries	Only one response registered, or the injected error is a type your handler does not catch	Register the failure(s) then the success; raise the exact exception (`ConnectTimeout`, `ConnectError`) your code catches
`vcr` re-records against production in CI	`record_mode` defaults to recording when a cassette is missing	Set `record_mode="none"` in CI so a missing cassette fails loudly
Secrets leak into a committed cassette	Recorded headers/bodies stored verbatim	Scrub with `filter_headers`/`filter_query_parameters` before the first commit

The interception point decides what the test still exercises: everything below the cut is replaced, everything above it runs for real.

Cut as low as the test allows: intercepting at the adapter keeps URL building, headers and serialisation under test.

Frequently Asked Questions

Should I patch requests.get or use a library like responses? Patch a single call site only for a one-off; use responses or respx as soon as a test exercises real client behaviour like retries, sessions, or query-string matching. Hand-rolled patches return naked dicts that skip status codes, headers, and JSON decoding, so they pass while production parsing breaks.

How do I stop a test suite from making real network calls by accident? Install pytest-socket and run with --disable-socket, then allow loopback with --allow-hosts=127.0.0.1 for tests that use a local fake server. Any unmocked outbound call then raises SocketBlockedError instead of silently hitting production.

When should I run a fake HTTP server instead of mocking the client? Use a fake server such as pytest-httpserver when the code under test owns the transport (custom connection pooling, TLS, redirects, streaming) or when you integration-test the wire format. Mock the client when you only care that your code reacts correctly to a given response body or status.

Do responses and respx work for async httpx clients? Only respx does. It patches both httpx.Client and httpx.AsyncClient through httpx's transport layer, so it covers sync and async in one API. responses targets requests and does not intercept httpx; use respx or httpx's own MockTransport for httpx code.

The companion deep dive on mocking requests with the responses library covers registries, matchers, and assert_all_requests_are_fired in full.
Getting the patch target right is the recurring failure here; patching strategies for complex codebases explains namespace resolution end to end.
When you build the response objects by hand, autospec and strict mocking keeps those doubles honest against the real client signatures.
For the mechanics of MagicMock, AsyncMock, and awaitable responses behind async HTTP doubles, see the deep dive into unittest.mock.
When you want randomized payloads and URLs rather than fixed fixtures, drive the stubs with property-based and fuzz testing strategies.
Mocking httpx async clients with respx — transport-level interception that still exercises URL building and headers.
Recording and replaying HTTP with VCR.py — capture the real payload once, redact it, and replay it deterministically.
