How to Fix: cranelift-interpreter: `load`/`store` does not trap when accessing across stack slots
Cranelift Interpreter Bug Fix: Make load/store Trap on Cross-Stack-Slot Access
The bug is subtle but dangerous: the cranelift-interpreter allowed a memory access to start in one stack slot and continue into another without trapping. That behavior breaks the intended isolation model for stack slots and can make the interpreter accept programs that should fail.
What the Bug Looks Like
In the failing scenario, a load or store uses an address near the end of one stack slot, but the access size extends past that slot’s boundary. Instead of rejecting the access, the interpreter reads or writes bytes as if adjacent stack slots formed one contiguous valid region.
That is incorrect because stack slots are logical memory objects, not just offsets in a flat byte array. Even if two slots appear adjacent in one layout, the interpreter must not permit an access that crosses from one slot into another. The issue description hints at the key reason: stack slots can be reordered. Once reordering is allowed, assuming cross-slot adjacency is semantically invalid.
Understanding the Root Cause
The root cause is usually an implementation that validates stack memory access against a broad backing allocation instead of against the specific stack slot referenced by the computed address.
In practice, the buggy logic often looks like this:
- Resolve a stack address to some offset in interpreter-managed stack memory.
- Check that the access stays within total allocated stack storage.
- Perform the read or write.
The missing check is object-level bounds enforcement. A correct interpreter must answer two separate questions:
- Does this address refer to a valid stack slot?
- Does the full access range, from start offset to start offset plus access size, stay entirely inside that same slot?
If the interpreter only verifies overall memory bounds, a 4-byte load starting at byte 6 of an 8-byte slot may incorrectly consume 2 bytes from the next slot. That violates the intended semantics and prevents the expected trap.
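The difference between the two checks can be made concrete with a small sketch. The function names and the 16-byte total (two adjacent 8-byte slots) are assumptions chosen for illustration; this is not the interpreter's actual code.

```rust
/// Hypothetical flat-buffer check: only asks whether the access fits
/// somewhere inside the whole stack allocation.
fn fits_in_total_stack(abs_offset: u32, access_size: u32, total_size: u32) -> bool {
    abs_offset
        .checked_add(access_size)
        .map_or(false, |end| end <= total_size)
}

/// Hypothetical per-slot check: asks whether the access stays inside
/// the one slot that the address refers to.
fn fits_in_slot(offset_in_slot: u32, access_size: u32, slot_size: u32) -> bool {
    offset_in_slot
        .checked_add(access_size)
        .map_or(false, |end| end <= slot_size)
}
```

With two 8-byte slots laid out back to back, `fits_in_total_stack(6, 4, 16)` accepts the 4-byte load at byte 6, while `fits_in_slot(6, 4, 8)` rejects it, which is the behavior the fix needs.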
This matters beyond correctness of one test. Interpreters are often used as a reference implementation for validation, testing, and debugging. If the interpreter is more permissive than the real semantics, it can hide compiler bugs and produce misleading test results.
Step-by-Step Solution
The fix is to make every stack-based load and store perform per-slot bounds checking before touching memory.
1. Represent stack slots as isolated regions
Make sure the interpreter can map an address to:
- the target stack slot identifier
- the offset within that slot
- the slot’s declared size
If your current implementation stores stack memory in one flat buffer, keep that if needed for storage efficiency, but do not use flat-buffer bounds alone for safety decisions.
// Conceptual structure, not exact Cranelift source code structure
struct StackSlotData {
    bytes: Vec<u8>,
    size: u32,
}
2. Resolve the address to exactly one slot
When evaluating a stack address, determine which stack slot it belongs to. If the base points to a slot plus an offset, compute:
slot = resolve_stack_slot(addr_base)
offset_in_slot = addr_offset
If the address cannot be associated with a valid slot, trap immediately.
if slot.is_none() {
    return Err(Trap::MemoryOutOfBounds);
}
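If the interpreter keeps a flat buffer, one way to resolve an address is to record each slot's start and size and search that table. The sketch below assumes a hypothetical `SlotLayout` table and a locally defined `Trap` enum; neither is taken from the Cranelift source.

```rust
#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Hypothetical layout entry: where a slot starts in the flat buffer
/// and how many bytes it was declared with.
struct SlotLayout {
    start: u32,
    size: u32,
}

/// Maps an absolute stack address to (slot index, offset within slot).
/// Traps if the address does not fall inside any slot.
fn resolve_stack_slot(layout: &[SlotLayout], addr: u32) -> Result<(usize, u32), Trap> {
    layout
        .iter()
        .enumerate()
        // `addr >= s.start` is tested first, so the subtraction cannot underflow.
        .find(|(_, s)| addr >= s.start && addr - s.start < s.size)
        .map(|(index, s)| (index, addr - s.start))
        .ok_or(Trap::MemoryOutOfBounds)
}
```

Because the result carries the slot index, the later bounds check can use that slot's own size instead of the size of the whole buffer. In this sketch a zero-sized slot never matches any address.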
3. Check the full access width against the slot boundary
This is the critical fix. For a load or store of size access_size, verify:
let end = offset_in_slot
    .checked_add(access_size)
    .ok_or(Trap::MemoryOutOfBounds)?;
if end > slot.size {
    return Err(Trap::MemoryOutOfBounds);
}
This ensures an access starting near the end of a slot traps instead of spilling into the next slot.
4. Use the validated range for the actual read or write
Only after the per-slot range check passes should the interpreter read or mutate memory.
let start = offset_in_slot as usize;
let end = start + access_size as usize;
let data = &slot.bytes[start..end];
For stores:
let start = offset_in_slot as usize;
let end = start + access_size as usize;
slot.bytes[start..end].copy_from_slice(value_bytes);
5. Update both load and store paths
Do not patch only one opcode path. This bug affects both reads and writes. If the interpreter has separate implementations for scalar loads, vector loads, scalar stores, and vector stores, apply the same slot-bounded logic consistently.
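One way to keep scalar and vector paths consistent is to route both through a single width-generic reader. The sketch below is an illustration under assumptions: the function names, the little-endian byte order, and the local `Trap` enum are hypothetical and not taken from the interpreter.

```rust
#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Reads exactly N bytes from one slot, trapping if the range leaves the slot.
fn load_array<const N: usize>(slot: &[u8], offset: usize) -> Result<[u8; N], Trap> {
    let end = offset.checked_add(N).ok_or(Trap::MemoryOutOfBounds)?;
    // `get` returns None when the range extends past the end of the slot.
    let bytes = slot.get(offset..end).ok_or(Trap::MemoryOutOfBounds)?;
    let mut out = [0u8; N];
    out.copy_from_slice(bytes);
    Ok(out)
}

/// Scalar path: a 4-byte load (little-endian is an assumption here).
fn load_i32(slot: &[u8], offset: usize) -> Result<i32, Trap> {
    Ok(i32::from_le_bytes(load_array::<4>(slot, offset)?))
}

/// Vector path: a 16-byte load that reuses the same bounds logic.
fn load_i8x16(slot: &[u8], offset: usize) -> Result<[u8; 16], Trap> {
    load_array::<16>(slot, offset)
}
```

Since both wrappers delegate to `load_array`, a later change to the bounds rule cannot be applied to one path and forgotten in the other.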
6. Add regression tests that prove trapping behavior
Create tests where:
- an access is fully inside a slot and should succeed
- an access ends exactly at the slot boundary and should succeed
- an access extends by 1 byte past the boundary and should trap
- an access starts in one slot and would continue into another and should trap
// Pseudocode for regression intent
slot0: size = 8
slot1: size = 8
load.i32 from stack_addr(slot0, 4)   // valid: bytes 4..8
load.i32 from stack_addr(slot0, 5)   // trap: bytes 5..9 cross boundary
store.i64 to stack_addr(slot0, 0)    // valid: bytes 0..8
store.i64 to stack_addr(slot0, 1)    // trap: bytes 1..9 cross boundary
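The same intent can be expressed against a small stand-in model of a slot. The `StackSlotData` methods and the `Trap` enum below are hypothetical and exist only to make the four cases checkable; real regression tests would target the interpreter's own test format.

```rust
use std::ops::Range;

#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Stand-in for one isolated stack slot (not the Cranelift type).
struct StackSlotData {
    bytes: Vec<u8>,
}

impl StackSlotData {
    fn new(size: usize) -> Self {
        Self { bytes: vec![0; size] }
    }

    /// Validates that `offset..offset + len` lies entirely inside this slot.
    fn range(&self, offset: usize, len: usize) -> Result<Range<usize>, Trap> {
        let end = offset.checked_add(len).ok_or(Trap::MemoryOutOfBounds)?;
        if end > self.bytes.len() {
            return Err(Trap::MemoryOutOfBounds);
        }
        Ok(offset..end)
    }

    fn load(&self, offset: usize, len: usize) -> Result<&[u8], Trap> {
        let range = self.range(offset, len)?;
        Ok(&self.bytes[range])
    }

    fn store(&mut self, offset: usize, value: &[u8]) -> Result<(), Trap> {
        let range = self.range(offset, value.len())?;
        self.bytes[range].copy_from_slice(value);
        Ok(())
    }
}
```

On an 8-byte slot, `load(4, 4)` and `store(0, &[0xAA; 8])` succeed, while `load(5, 4)` and `store(1, &[0xBB; 8])` return `Err(Trap::MemoryOutOfBounds)` and leave the slot contents untouched.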
7. Prefer a shared helper to avoid future divergence
The cleanest implementation is to centralize the check in one helper used by all stack-memory operations.
fn checked_stack_slot_range(
    slot: &StackSlotData,
    offset: u32,
    access_size: u32,
) -> Result<std::ops::Range<usize>, Trap> {
    let end = offset
        .checked_add(access_size)
        .ok_or(Trap::MemoryOutOfBounds)?;
    if end > slot.size {
        return Err(Trap::MemoryOutOfBounds);
    }
    Ok(offset as usize..end as usize)
}
Then each load/store becomes simpler and less error-prone:
let range = checked_stack_slot_range(slot, offset_in_slot, access_size)?;
let bytes = &slot.bytes[range];
8. Verify semantics against stack-slot reordering assumptions
The issue exists partly because physical adjacency must not imply semantic adjacency. After implementing the fix, review any code that:
- iterates over all stack slots as one contiguous block
- computes stack addresses using absolute offsets only
- treats neighboring slots as mergeable memory
If such logic exists elsewhere, it may have the same bug pattern.
Common Edge Cases
Even after adding basic bounds checks, several edge cases can still cause incorrect behavior if overlooked.
Zero-sized or unusual access widths
If the IR or interpreter supports unusual memory widths, ensure the check handles them consistently. A zero-sized access may be invalid by construction or may need explicit handling depending on existing semantics.
Integer overflow during offset calculation
Never compute offset + size without checked arithmetic. Overflow can wrap around and accidentally pass a bounds check.
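A short sketch shows how a wrapped sum slips past the comparison. Both function names are hypothetical, and `wrapping_add` stands in for whatever unchecked arithmetic a buggy implementation might use.

```rust
/// Buggy variant: the sum wraps around, so a huge offset can yield a small `end`.
fn naive_in_bounds(offset: u32, access_size: u32, slot_size: u32) -> bool {
    offset.wrapping_add(access_size) <= slot_size
}

/// Safe variant: overflow is treated as out of bounds.
fn checked_in_bounds(offset: u32, access_size: u32, slot_size: u32) -> bool {
    match offset.checked_add(access_size) {
        Some(end) => end <= slot_size,
        None => false,
    }
}
```

For `offset = u32::MAX - 1` and a 4-byte access, the wrapped sum is 2, so the naive check accepts the access against an 8-byte slot while the checked version rejects it.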
Negative offsets encoded through intermediate arithmetic
If address computation allows signed displacement before normalization, validate that the final offset cannot underflow into a large unsigned value.
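One way to guard against this is to do the signed arithmetic in a wider type and convert back with a fallible conversion. The function name and the local `Trap` enum are assumptions made for this sketch.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Applies a signed displacement to an unsigned slot offset.
/// The sum is computed in i64, where it cannot overflow, and any result
/// outside the u32 range (negative or too large) becomes a trap.
fn final_offset(base_offset: u32, displacement: i32) -> Result<u32, Trap> {
    let sum = i64::from(base_offset) + i64::from(displacement);
    u32::try_from(sum).map_err(|_| Trap::MemoryOutOfBounds)
}
```

Here `final_offset(4, -8)` traps instead of producing a large unsigned offset, and `final_offset(4, -4)` yields `Ok(0)`. The result still has to pass the per-slot range check before any memory is touched.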
Different code paths for vector and scalar memory operations
It is common for interpreters to implement vector loads/stores separately. If only scalar operations are fixed, cross-slot accesses may still slip through elsewhere.
Stack-address aliases introduced by helper abstractions
If a helper converts stack references into generic memory addresses too early, slot identity can be lost. Keep slot identity attached until after bounds validation.
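A sketch of an address type that carries its slot identity through pointer arithmetic is shown here. `StackAddr`, its methods, and the `validate` function are hypothetical names for illustration, not the interpreter's actual address representation.

```rust
#[derive(Debug, PartialEq)]
enum Trap {
    MemoryOutOfBounds,
}

/// Hypothetical address that remembers which slot it came from.
#[derive(Clone, Copy, Debug, PartialEq)]
struct StackAddr {
    slot: usize,
    offset: u32,
}

impl StackAddr {
    /// Address arithmetic changes only the offset, never the slot.
    fn add(self, delta: u32) -> Result<StackAddr, Trap> {
        let offset = self
            .offset
            .checked_add(delta)
            .ok_or(Trap::MemoryOutOfBounds)?;
        Ok(StackAddr { slot: self.slot, offset })
    }
}

/// Validates an access against the size of the slot the address names.
fn validate(addr: StackAddr, access_size: u32, slot_sizes: &[u32]) -> Result<(), Trap> {
    let size = *slot_sizes.get(addr.slot).ok_or(Trap::MemoryOutOfBounds)?;
    let end = addr
        .offset
        .checked_add(access_size)
        .ok_or(Trap::MemoryOutOfBounds)?;
    if end > size {
        return Err(Trap::MemoryOutOfBounds);
    }
    Ok(())
}
```

Because `add` never changes `slot`, an address derived from slot 0 at offset 6 is still checked against slot 0's size, so a 4-byte access through it traps even when slot 1 happens to follow in memory.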
Exact-boundary accesses
An access ending exactly at slot.size should succeed. The condition must be end > slot.size, not end >= slot.size.
FAQ
Why should crossing into another stack slot trap if the bytes are physically adjacent?
Because stack slots are separate logical objects. Their placement may change due to reordering, packing, or target-specific layout decisions. The interpreter must preserve object boundaries, not rely on incidental adjacency.
Is checking against the total stack allocation ever sufficient?
No, not for stack-slot accesses. Total allocation bounds only prove the memory exists somewhere in the interpreter. They do not prove the access stays within the intended slot.
Should this same rule apply to both load and store instructions?
Yes. A cross-slot load reads invalid bytes, and a cross-slot store corrupts neighboring slot state. Both must trap using the same slot-bounded validation logic.
Final Takeaway
The correct fix is not just “add a bounds check,” but “add the right bounds check”: validate every stack access against the boundaries of the specific stack slot it targets. Once that rule is enforced consistently across all load/store paths, the interpreter will correctly trap on cross-stack-slot access and align with expected Cranelift semantics.