How to Fix: Call stack exhaustion found via CPython’s `test_marshal` test suite around October 19th/20th

Updated June 9, 2026 6 min read

Aldawsari

6 min read

Call stack exhaustion in CPython `test_marshal`: root cause, reproduction, and fix strategy

A call stack exhaustion failure discovered while running CPython's test_marshal against Wasmtime/WASI-style embeddings usually points to a mismatch between Python's deeply recursive serialization tests and the host runtime's available native stack. The failure is not typically caused by invalid marshal data alone; it emerges when recursive object traversal, deserialization, or compiler-generated nesting exceeds the stack budget exposed to the executing environment.

Table of Contents

Understanding the Root Cause
Step-by-Step Solution
Common Edge Cases
FAQ

Understanding the Root Cause

The key issue is that test_marshal deliberately exercises deeply nested Python objects and boundary conditions in CPython's marshal reader/writer implementation. In a native desktop build, the process stack is often large enough that these tests either complete or fail in a controlled way such as RecursionError. In a sandboxed runtime such as WebAssembly with WASI, the available stack can be much smaller, and recursive C or Rust call paths can exhaust it before CPython handles the condition safely.

Technically, the failure tends to happen because several layers stack on top of each other:

CPython's marshal implementation recursively walks nested containers or code object structures.
The embedding runtime may add its own frames for host calls, traps, or memory checks.
Debug builds and sanitizer builds often use more stack per frame.
A fixed WebAssembly stack budget can be significantly lower than a typical native process stack.

This means a test that is merely “deep” on Linux/macOS can become a hard stack overflow under Wasmtime. If the issue appeared around October 19th/20th during CI or fuzz-style regression testing, that timing often indicates either a runtime change, a compiler/codegen difference, or a newly exercised recursive test path rather than a random flaky failure.

In practical terms, this is usually one of three bugs:

A runtime configuration bug: the stack size for the guest is too small.
A defensive recursion bug: recursive code lacks an early depth guard.
A test-environment mismatch: CPython tests assume native-stack characteristics that are not true in the embedding target.

Step-by-Step Solution

The most reliable approach is to first reproduce the issue deterministically, then confirm whether the fix belongs in runtime configuration, recursion guarding, or test expectations.

1. Reproduce the failure with the exact test

Run only the marshal test first so you can isolate the stack behavior.

python -m test test_marshal -v

If you are running CPython inside a Wasmtime-based environment, use the same command path your CI uses. For example, if a wrapper launches the Python binary under the runtime, execute that exact wrapper instead of a native host Python.

2. Confirm it is really stack exhaustion

Look for signals such as trap messages, segmentation faults, stack overflow reports, or runtime aborts instead of a normal Python exception. If available, capture a backtrace.

# Example ideas, adapt to your environment
RUST_BACKTRACE=1 your-runtime ./python -m test test_marshal -v

# If native reproduction exists
ulimit -s
python -m test test_marshal -v

If the failing path crashes before Python raises RecursionError, the problem is below normal Python-level recursion handling.

3. Increase the guest stack size

If the target is Wasmtime or another WebAssembly runtime, increase the configured stack budget for the guest instance. The exact configuration depends on the embedding code, but conceptually it looks like this:

// Pseudocode: increase stack allocation for the guest/runtime config
let mut config = Config::new();
config.max_wasm_stack(2 * 1024 * 1024); // example value

let engine = Engine::new(&config)?;

If your embedding uses a CLI, check the runtime documentation for a flag or setting controlling stack size. Use a moderate increase first, then rerun test_marshal.

4. Add or verify recursion/depth guards in the marshal path

If increasing the stack only masks the problem, inspect the recursive implementation. The correct long-term behavior is often to fail safely and predictably before native stack exhaustion occurs.

/* Pseudocode for defensive depth tracking */
int read_object(State *st, int depth) {
    if (depth > MAX_MARSHAL_DEPTH) {
        PyErr_SetString(PyExc_ValueError, "maximum marshal depth exceeded");
        return -1;
    }

    switch (next_type(st)) {
        case TYPE_LIST:
            return read_list(st, depth + 1);
        case TYPE_TUPLE:
            return read_tuple(st, depth + 1);
        default:
            return read_leaf(st);
    }
}

This pattern is especially important when native stack limits are tighter than the assumptions of the original C implementation.

5. Compare native vs embedded behavior

Run the same CPython revision natively and in the target runtime. If native passes but embedded crashes, that strongly suggests an environment-specific stack budget problem rather than corrupt marshal logic.

# Native
./python -m test test_marshal -v

# Embedded/runtime target
your-runtime ./python -m test test_marshal -v

6. Reduce the test case to a minimal reproducer

Create a small script that builds deeply nested objects and marshals/unmarshals them. This makes validation much faster than rerunning the full test suite.

import marshal

obj = 0
for _ in range(100000):
    obj = [obj]

payload = marshal.dumps(obj)
marshal.loads(payload)

If this crashes the runtime, you have a focused reproducer for stack-sensitive recursion. Then lower the nesting until you find the threshold.

7. Decide the correct fix location

If a larger runtime stack fully resolves the issue and behavior matches native CPython, fix the runtime configuration.
If recursion can still crash at realistic depths, add depth guards in the recursive path.
If the platform intentionally has a very small stack, patch the test expectations or skip behavior for that target with a clear rationale.

8. Validate with the full relevant suite

After the change, rerun not only test_marshal but adjacent tests that exercise code objects, recursion, and serialization behavior.

python -m test test_marshal test_compile test_pickle -v

This helps catch regressions where a stack increase fixes one path but another recursive path still overflows.

Common Edge Cases

Debug builds consume more stack: a fix that passes in release mode may still fail in debug CI.
Sanitizers change stack usage: ASan/UBSan builds can require much larger stack headroom.
Different compilers produce different frame sizes: LLVM, clang, and target-specific optimizations can move the failure threshold.
Host recursion checks do not protect guest stack: Python-level recursion limits are not always enough when lower-level implementation frames dominate.
Skipping the test can hide a real safety bug: only skip when the platform constraint is intentional and documented.
Malformed marshal payloads: if the crash occurs on untrusted input, treat it with extra care because stack exhaustion on crafted data can become a security concern.

FAQ

Why does CPython raise `RecursionError` in some environments but crash here?

RecursionError protects Python recursion limits, but this issue often happens in native implementation frames before Python can raise that exception. In constrained runtimes, the underlying stack can run out first.

Is increasing the Wasm or runtime stack always the right fix?

No. It is the fastest way to confirm the diagnosis, but not always the best permanent fix. If recursive parsing lacks a proper depth bound, a larger stack only delays the crash.

Should this be treated as a security issue?

If untrusted marshal input can reliably trigger process termination, sandbox escape is not required for it to matter operationally. Review the project's security policy and escalate appropriately when availability or memory-safety impact is plausible.

The durable resolution is to combine a sane runtime stack configuration with explicit depth limiting in recursive marshal paths, then verify behavior against both native CPython and the constrained execution target.

Call stack exhaustion in CPython test_marshal: root cause, reproduction, and fix strategy