How to Fix: cranelift-interpreter: `load`/`store` does not trap when accessing across stack slots

Updated June 9, 2026 6 min read

Aldawsari

Cranelift Interpreter Bug Fix: Make load/store Trap on Cross-Stack-Slot Access

The bug is subtle but dangerous: the cranelift-interpreter allowed a memory access to start in one stack slot and continue into another without trapping. That behavior breaks the intended isolation model for stack slots and can make the interpreter accept programs that should fail.

Table of Contents

What the Bug Looks Like
Understanding the Root Cause
Step-by-Step Solution
Common Edge Cases
FAQ

What the Bug Looks Like

In the failing scenario, a load or store uses an address near the end of one stack slot, but the access size extends past that slot’s boundary. Instead of rejecting the access, the interpreter reads or writes bytes as if adjacent stack slots formed one contiguous valid region.

That is incorrect because stack slots are logical memory objects, not just offsets in a flat byte array. Even if two slots appear adjacent in one layout, the interpreter must not permit an access that crosses from one slot into another. The issue description hints at the key reason: stack slots can be reordered. Once reordering is allowed, assuming cross-slot adjacency is semantically invalid.

Understanding the Root Cause

The likely root cause is an implementation that validates each stack memory access against the whole backing allocation instead of against the specific stack slot that the computed address refers to.

In practice, the buggy logic often looks like this:

1. Resolve a stack address to some offset in interpreter-managed stack memory.
2. Check that the access stays within total allocated stack storage.
3. Perform the read or write.

The missing check is object-level bounds enforcement. A correct interpreter must answer two separate questions:

1. Does this address refer to a valid stack slot?
2. Does the full access range, from start offset to start offset plus access size, stay entirely inside that same slot?

If the interpreter only verifies overall memory bounds, a 4-byte load starting at byte 6 of an 8-byte slot may incorrectly consume 2 bytes from the next slot. That violates the intended semantics and prevents the expected trap.
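The gap between the two checks can be made concrete with a small sketch. The function names `flat_check` and `slot_check` are hypothetical and are not taken from the Cranelift source; the sketch assumes slots are described by a base address and a size inside one flat buffer.

```rust
/// Hypothetical flat check: only asks whether the access fits somewhere
/// inside the whole stack buffer of `total_size` bytes.
fn flat_check(total_size: u32, addr: u32, access_size: u32) -> bool {
    match addr.checked_add(access_size) {
        Some(end) => end <= total_size,
        None => false,
    }
}

/// Hypothetical per-slot check: asks whether the access fits inside the
/// slot that starts at `slot_base` and is `slot_size` bytes long.
fn slot_check(slot_base: u32, slot_size: u32, addr: u32, access_size: u32) -> bool {
    // An address below the slot base is never inside the slot.
    let offset = match addr.checked_sub(slot_base) {
        Some(offset) => offset,
        None => return false,
    };
    match offset.checked_add(access_size) {
        Some(end) => end <= slot_size,
        None => false,
    }
}
```

With two 8-byte slots laid out back to back in a 16-byte buffer, `flat_check(16, 6, 4)` accepts the 4-byte access at byte 6, while `slot_check(0, 8, 6, 4)` rejects it because bytes 8 and 9 belong to the second slot.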

This matters beyond the correctness of a single test. Interpreters are often used as a reference implementation for validation, testing, and debugging. If the interpreter is more permissive than the real semantics, it can hide compiler bugs and produce misleading test results.

Step-by-Step Solution

The fix is to make every stack-based load and store perform per-slot bounds checking before touching memory.

1. Represent stack slots as isolated regions

Make sure the interpreter can map an address to:

- the target stack slot identifier
- the offset within that slot
- the slot’s declared size

If your current implementation stores stack memory in one flat buffer, keep that if needed for storage efficiency, but do not use flat-buffer bounds alone for safety decisions.

    // Conceptual structure, not exact Cranelift source code structure
    struct StackSlotData {
        bytes: Vec<u8>,
        size: u32,
    }

2. Resolve the address to exactly one slot

When evaluating a stack address, determine which stack slot it belongs to. If the base points to a slot plus an offset, compute:

    slot = resolve_stack_slot(addr_base)
    offset_in_slot = addr_offset

If the address cannot be associated with a valid slot, trap immediately.

    if slot.is_none() {
        return Err(Trap::MemoryOutOfBounds);
    }
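If the interpreter keeps a flat buffer, steps 1 and 2 amount to a lookup from a flat address to a slot index and an offset. The following is a minimal sketch under that assumption; `SlotLayout` and this version of `resolve_stack_slot` are illustrative names, not the interpreter's actual types.

```rust
/// Hypothetical layout record: where one slot lives in the flat buffer.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SlotLayout {
    base: u32,
    size: u32,
}

/// Returns `(slot_index, offset_in_slot)` for a flat address, or `None`
/// if the address does not fall inside any slot.
fn resolve_stack_slot(layout: &[SlotLayout], addr: u32) -> Option<(usize, u32)> {
    layout.iter().enumerate().find_map(|(index, slot)| {
        // Addresses below the slot base cannot belong to this slot.
        let offset = addr.checked_sub(slot.base)?;
        if offset < slot.size {
            Some((index, offset))
        } else {
            None
        }
    })
}
```

A caller would turn `None` into a trap. Note that in this sketch a flat address equal to one slot's end resolves to the start of the following slot, which is one reason to carry the slot identity with the address where possible rather than recover it from a flat number.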

3. Check the full access width against the slot boundary

This is the critical fix. For a load or store of size access_size, verify:

    let end = offset_in_slot.checked_add(access_size)
        .ok_or(Trap::MemoryOutOfBounds)?;
    if end > slot.size {
        return Err(Trap::MemoryOutOfBounds);
    }

This ensures an access starting near the end of a slot traps instead of spilling into the next slot.

4. Use the validated range for the actual read or write

Only after the per-slot range check passes should the interpreter read or mutate memory.

    let start = offset_in_slot as usize;
    let end = start + access_size as usize;
    let data = &slot.bytes[start..end];

For stores:

    let start = offset_in_slot as usize;
    let end = start + access_size as usize;
    slot.bytes[start..end].copy_from_slice(value_bytes);

5. Update both load and store paths

Do not patch only one opcode path. This bug affects both reads and writes. If the interpreter has separate implementations for scalar loads, vector loads, scalar stores, and vector stores, apply the same slot-bounded logic consistently.

6. Add regression tests that prove trapping behavior

Create tests where:

- an access is fully inside a slot and should succeed
- an access ends exactly at the slot boundary and should succeed
- an access extends by 1 byte past the boundary and should trap
- an access starts in one slot and would continue into another and should trap

    // Pseudocode for regression intent
    slot0: size = 8
    slot1: size = 8
    load.i32 from stack_addr(slot0, 4)   // valid: bytes 4..8
    load.i32 from stack_addr(slot0, 5)   // trap: bytes 5..9 cross boundary
    store.i64 to stack_addr(slot0, 0)    // valid: bytes 0..8
    store.i64 to stack_addr(slot0, 1)    // trap: bytes 1..9 cross boundary
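The same four cases can be written against a standalone model of a slot. This is a sketch for illustration, not the project's actual test format: the `Trap` enum, the `StackSlotData` methods, and the little-endian byte order are all assumptions made here.

```rust
#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Conceptual slot model, mirroring the struct used in this article.
struct StackSlotData {
    bytes: Vec<u8>,
    size: u32,
}

impl StackSlotData {
    fn new(size: u32) -> Self {
        Self { bytes: vec![0; size as usize], size }
    }

    /// Per-slot range check shared by the load and store paths.
    fn check(&self, offset: u32, len: u32) -> Result<std::ops::Range<usize>, Trap> {
        let end = offset.checked_add(len).ok_or(Trap::MemoryOutOfBounds)?;
        if end > self.size {
            return Err(Trap::MemoryOutOfBounds);
        }
        Ok(offset as usize..end as usize)
    }

    /// 4-byte load; little-endian is an assumption of this sketch.
    fn load_i32(&self, offset: u32) -> Result<i32, Trap> {
        let range = self.check(offset, 4)?;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&self.bytes[range]);
        Ok(i32::from_le_bytes(buf))
    }

    /// 8-byte store; nothing is written unless the range check passes.
    fn store_i64(&mut self, offset: u32, value: i64) -> Result<(), Trap> {
        let range = self.check(offset, 8)?;
        self.bytes[range].copy_from_slice(&value.to_le_bytes());
        Ok(())
    }
}
```

On an 8-byte slot, `store_i64(0, ..)` and `load_i32(4)` succeed because they end exactly at the boundary, while `load_i32(5)` and `store_i64(1, ..)` return `Err(Trap::MemoryOutOfBounds)`.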

7. Prefer a shared helper to avoid future divergence

The cleanest implementation is to centralize the check in one helper used by all stack-memory operations.

    fn checked_stack_slot_range(
        slot: &StackSlotData,
        offset: u32,
        access_size: u32,
    ) -> Result<std::ops::Range<usize>, Trap> {
        let end = offset.checked_add(access_size)
            .ok_or(Trap::MemoryOutOfBounds)?;
        if end > slot.size {
            return Err(Trap::MemoryOutOfBounds);
        }
        Ok(offset as usize..end as usize)
    }

Then each load/store becomes simpler and less error-prone:

    let range = checked_stack_slot_range(slot, offset_in_slot, access_size)?;
    let bytes = &slot.bytes[range];

8. Verify semantics against stack-slot reordering assumptions

The issue exists partly because the code treated physical adjacency as if it implied semantic adjacency. After implementing the fix, review any code that:

- iterates over all stack slots as one contiguous block
- computes stack addresses using absolute offsets only
- treats neighboring slots as mergeable memory

If such logic exists elsewhere, it may have the same bug pattern.

Common Edge Cases

Even after adding basic bounds checks, several edge cases can still cause incorrect behavior if overlooked.

Zero-sized or unusual access widths

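If the IR or interpreter supports unusual memory widths, ensure the check handles them consistently. A zero-sized access may be invalid by construction or may need explicit handling depending on existing semantics. Note that with the end > slot.size condition, a zero-sized access at offset == slot.size passes the check, so decide explicitly whether that should be allowed.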

Integer overflow during offset calculation

Never compute offset + size without checked arithmetic. Overflow can wrap around and accidentally pass a bounds check.
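A short sketch shows how a wrapped sum slips past the comparison. Both function names are hypothetical; `wrapping_add` is used here to model what a plain `offset + size` does when overflow checks are disabled, which is the default for Rust release builds.

```rust
/// Buggy variant: the sum wraps around on overflow, so a huge offset
/// can produce a small `end` that passes the comparison.
fn in_bounds_wrapping(offset: u32, access_size: u32, slot_size: u32) -> bool {
    offset.wrapping_add(access_size) <= slot_size
}

/// Fixed variant: overflow is reported as out of bounds.
fn in_bounds_checked(offset: u32, access_size: u32, slot_size: u32) -> bool {
    match offset.checked_add(access_size) {
        Some(end) => end <= slot_size,
        None => false,
    }
}
```

For an 8-byte slot, an offset of `u32::MAX - 1` with a 4-byte access wraps to an end of 2, so `in_bounds_wrapping` accepts it while `in_bounds_checked` rejects it.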

Negative offsets encoded through intermediate arithmetic

If address computation allows signed displacement before normalization, validate that the final offset cannot underflow into a large unsigned value.
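One way to rule out that underflow is to do the signed arithmetic in a wider type and convert back only after a range check. This is a sketch under the assumption that the displacement is a signed 32-bit value; `final_offset` and `Trap` are illustrative names.

```rust
#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Combines an unsigned base offset with a signed displacement and
/// rejects results that fall below zero or above `u32::MAX`.
fn final_offset(base_offset: u32, displacement: i32) -> Result<u32, Trap> {
    // i64 holds every possible u32 + i32 sum without overflow.
    let wide = i64::from(base_offset) + i64::from(displacement);
    if wide < 0 || wide > i64::from(u32::MAX) {
        return Err(Trap::MemoryOutOfBounds);
    }
    Ok(wide as u32)
}
```

The result still has to go through the per-slot range check; this step only guarantees that a negative intermediate value never turns into a large unsigned offset.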

Different code paths for vector and scalar memory operations

It is common for interpreters to implement vector loads/stores separately. If only scalar operations are fixed, cross-slot accesses may still slip through elsewhere.
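One way to keep scalar and vector paths from diverging is to route both through a single width-generic read. The sketch below assumes `slot_bytes` is exactly the byte storage of one slot, so the slice length doubles as the slot size; the function names and the little-endian byte order are illustrative.

```rust
#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Reads exactly `N` bytes from one slot's storage starting at `offset`,
/// trapping if the range would leave the slot.
fn read_array<const N: usize>(slot_bytes: &[u8], offset: usize) -> Result<[u8; N], Trap> {
    let end = offset.checked_add(N).ok_or(Trap::MemoryOutOfBounds)?;
    // `get` returns `None` when the range extends past the slice.
    let src = slot_bytes.get(offset..end).ok_or(Trap::MemoryOutOfBounds)?;
    let mut out = [0u8; N];
    out.copy_from_slice(src);
    Ok(out)
}

/// Scalar path: a 4-byte load built on the shared read.
fn load_i32(slot_bytes: &[u8], offset: usize) -> Result<i32, Trap> {
    read_array::<4>(slot_bytes, offset).map(i32::from_le_bytes)
}

/// Vector path: a 16-byte load built on the same shared read.
fn load_i8x16(slot_bytes: &[u8], offset: usize) -> Result<[u8; 16], Trap> {
    read_array::<16>(slot_bytes, offset)
}
```

Because both loads share `read_array`, a 16-byte vector load at offset 1 of a 16-byte slot traps for the same reason a 4-byte load at offset 13 does.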

Stack-address aliases introduced by helper abstractions

If a helper converts stack references into generic memory addresses too early, slot identity can be lost. Keep slot identity attached until after bounds validation.
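A slot-relative address type is one way to keep that identity attached. The following is a hypothetical design, not the interpreter's actual address representation: `StackAddr`, `offset_by`, and `validate` are names invented for this sketch.

```rust
#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Hypothetical slot-relative address: the slot index travels with the
/// offset instead of being folded into one flat number.
#[derive(Debug, Clone, Copy, PartialEq)]
struct StackAddr {
    slot: usize,
    offset: u32,
}

impl StackAddr {
    /// Address arithmetic changes the offset only; the slot never changes.
    fn offset_by(self, delta: u32) -> Result<StackAddr, Trap> {
        let offset = self.offset.checked_add(delta).ok_or(Trap::MemoryOutOfBounds)?;
        Ok(StackAddr { slot: self.slot, offset })
    }
}

/// Validates an access against the size of the slot named by `addr`.
fn validate(addr: StackAddr, access_size: u32, slot_sizes: &[u32]) -> Result<(), Trap> {
    let size = *slot_sizes.get(addr.slot).ok_or(Trap::MemoryOutOfBounds)?;
    let end = addr.offset.checked_add(access_size).ok_or(Trap::MemoryOutOfBounds)?;
    if end > size {
        return Err(Trap::MemoryOutOfBounds);
    }
    Ok(())
}
```

With two 8-byte slots, advancing an address in slot 0 from offset 6 by 2 yields offset 8 in slot 0, and a 1-byte access there traps. A flat address would instead have landed on the first byte of slot 1 and looked valid.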

Exact-boundary accesses

An access ending exactly at slot.size should succeed. The trap condition must be end > slot.size, not end >= slot.size; the latter would reject a valid access such as a 4-byte load at offset 4 of an 8-byte slot.

FAQ

Why should crossing into another stack slot trap if the bytes are physically adjacent?

Because stack slots are separate logical objects. Their placement may change due to reordering, packing, or target-specific layout decisions. The interpreter must preserve object boundaries, not rely on incidental adjacency.

Is checking against the total stack allocation ever sufficient?

No, not for stack-slot accesses. Total allocation bounds only prove that the bytes exist somewhere in the interpreter's stack storage. They do not prove the access stays within the intended slot.

Should this same rule apply to both load and store instructions?

Yes. A cross-slot load reads invalid bytes, and a cross-slot store corrupts neighboring slot state. Both must trap using the same slot-bounded validation logic.

Final Takeaway

The correct fix is not just “add a bounds check,” but “add the right bounds check”: validate every stack access against the boundaries of the specific stack slot it targets. Once that rule is enforced consistently across all load/store paths, the interpreter will correctly trap on cross-stack-slot access and align with expected Cranelift semantics.