How to Fix: riscv64: Bus error on qemu when executing `atomic_cas.i64`
A bus error on riscv64 under QEMU while executing atomic_cas.i64 is a classic sign that generated code is issuing an atomic memory operation against an address that is not correctly aligned for a 64-bit access. In this case, the crash is not just “a QEMU quirk”; it usually points to a mismatch between the compiler backend’s assumptions about atomic legality and the actual alignment guarantees of the target memory operand.
The practical fix is to ensure that atomic compare-and-swap for i64 on riscv64 is only lowered to native atomic instructions when the effective address is 8-byte aligned, or to route the operation through a safe fallback path when alignment cannot be proven. For compiler backends such as Cranelift, this often means tightening legalization rules, validating alignment before selecting an AMO/LR-SC sequence, and avoiding codegen that assumes unaligned atomic access is valid on RISC-V.
Understanding the Root Cause
On riscv64, atomic compare-and-swap is typically implemented as an LR/SC loop (lr.d/sc.d for 64-bit operands), or as a single amocas.d where the Zacas extension is available. These instructions require naturally aligned addresses: a 64-bit atomic expects its address to be a multiple of 8 bytes. If it is not, the RISC-V specification calls for an address-misaligned or access-fault exception, so an atomic_cas.i64 emitted against a location with weaker alignment can trap at runtime.
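The alignment rule itself is simple arithmetic. The following is a minimal sketch, not backend code; the helper name is_naturally_aligned is an assumption made for illustration.

```rust
/// Returns true when `addr` is naturally aligned for an access of `size_bytes`.
/// Hypothetical helper for illustration; `size_bytes` must be a power of two.
fn is_naturally_aligned(addr: u64, size_bytes: u64) -> bool {
    assert!(size_bytes.is_power_of_two(), "access size must be a power of two");
    // For a power-of-two size, an aligned address has all of its low bits clear.
    (addr & (size_bytes - 1)) == 0
}
```

With this check, 0x1004 is acceptable for a 4-byte atomic but not for an 8-byte one, which is exactly the situation that makes an i64 compare-and-swap trap.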
Under QEMU, that trap surfaces as a bus error (SIGBUS in user-mode emulation) because the emulator enforces the architectural alignment requirement. The issue often shows up in reduced .clif test cases because they can construct IR that is valid at the intermediate-representation level yet says nothing about the alignment the target's atomic instructions need.
Technically, the bug usually falls into one of these categories:
- The backend lowers atomic_cas.i64 directly to a native RISC-V atomic sequence without checking whether the address is 8-byte aligned.
- The IR or test case creates a stack slot, global, or derived pointer whose alignment is less than what the target atomic instruction requires.
- The legalization pass assumes that all 64-bit atomics on riscv64 are always legal, when in reality they are only legal for appropriately aligned addresses.
- A lowering bug emits an LR/SC loop for an operand coming from an address computation that loses alignment metadata.
In short: the crash happens because a 64-bit atomic compare-and-swap is being executed on memory that the RISC-V target considers misaligned.
Step-by-Step Solution
The most reliable fix is to make code generation alignment-aware and reject or legalize unsupported atomic forms before machine code emission.
1. Reproduce the failure with the minimal test case
Start by running the reduced test on riscv64 under QEMU and confirm the exact operation causing the trap.
cargo test -p cranelift-codegen atomic_cas_i64 -- --nocapture
If you have a standalone reproducer, run it with explicit riscv64 emulation:
qemu-riscv64 ./reproducer
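If available, inspect the generated disassembly to verify whether the backend emitted an lr.d/sc.d loop or another 64-bit atomic sequence. On a non-RISC-V host, the native objdump usually cannot decode the binary, so use a RISC-V-capable disassembler such as the cross-toolchain objdump or llvm-objdump.
riscv64-linux-gnu-objdump -d ./reproducer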
2. Inspect how the backend legalizes atomic_cas.i64
Find the lowering path for atomic_cas.i64 in the riscv64 backend. You want to answer two questions:
- Does the backend assume the operation is always legal?
- Is pointer alignment checked or preserved through lowering?
A problematic pattern looks like this conceptually:
    if ty == I64 && target == riscv64 {
        emit_native_atomic_cas(addr, expected, new);
    }
A safer version introduces an alignment requirement:
    if ty == I64 && target == riscv64 {
        if known_alignment(addr) >= 8 {
            emit_native_atomic_cas(addr, expected, new);
        } else {
            lower_to_libcall_or_trap_unsupported(addr, expected, new);
        }
    }
3. Enforce natural alignment for i64 atomics
If your backend has an instruction selector or legalization layer, add a guard that only permits native lowering when the memory access alignment is at least 8 bytes.
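    // Conceptual sketch: known_align_of, emit_lr_sc_loop_64, and
    // emit_atomic_fallback stand in for whatever the backend provides.
    fn lower_atomic_cas_i64(&mut self, addr: Value, expected: Value, new: Value) -> InstSeq {
        // Largest alignment (in bytes) the backend can prove for this address.
        let align = self.known_align_of(addr);
        if align >= 8 {
            self.emit_lr_sc_loop_64(addr, expected, new)
        } else {
            self.emit_atomic_fallback("i64_cas_unaligned", addr, expected, new)
        }
    }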
If your compiler does not yet support an unaligned fallback for 64-bit atomics on riscv64, the correct near-term behavior is often to reject the form during legalization rather than emit crashing code.
    if align < 8 {
        return Err(CodegenError::Unsupported(
            "riscv64 atomic_cas.i64 requires 8-byte alignment"
        ));
    }
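The native, fallback, and reject outcomes can be modeled as one small decision function. This is a self-contained sketch under stated assumptions: CasLowering, LowerError, and select_cas_i64_lowering are hypothetical names, not Cranelift APIs, and the alignment is passed in as a plain number rather than derived from IR.

```rust
/// How a 64-bit compare-and-swap should be lowered (hypothetical model).
#[derive(Debug, PartialEq, Eq)]
enum CasLowering {
    /// Emit the native lr.d/sc.d loop.
    NativeLrSc,
    /// Route through a helper that tolerates weaker alignment.
    Fallback,
}

/// Error returned when no safe lowering exists (hypothetical model).
#[derive(Debug, PartialEq, Eq)]
enum LowerError {
    Unsupported(&'static str),
}

/// `known_align` is the largest alignment in bytes that can be proven for the
/// address; `fallback_available` says whether a safe non-native path exists.
fn select_cas_i64_lowering(
    known_align: u64,
    fallback_available: bool,
) -> Result<CasLowering, LowerError> {
    if known_align >= 8 {
        Ok(CasLowering::NativeLrSc)
    } else if fallback_available {
        Ok(CasLowering::Fallback)
    } else {
        // Rejecting here is preferable to emitting code that traps at runtime.
        Err(LowerError::Unsupported(
            "riscv64 atomic_cas.i64 requires 8-byte alignment",
        ))
    }
}
```

Keeping the decision in one place means the native path is reachable only through the alignment check, so a later refactor cannot bypass it by accident.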
4. Preserve or strengthen alignment in stack and memory objects
If the failure comes from a stack slot or compiler-generated object, ensure the allocation itself is 8-byte aligned.
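    // Assumes a Cranelift release in which StackSlotData carries an alignment,
    // expressed as a log2 shift; check the signature in the version you build against.
    let slot = create_stack_slot(StackSlotData::new(
        StackSlotKind::ExplicitSlot,
        8, // size in bytes
        3, // align_shift: 2^3 = 8-byte alignment
    ));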
For derived addresses, be careful that pointer arithmetic does not destroy provable alignment. For example, adding a non-multiple-of-8 offset to an 8-byte aligned base should make the result ineligible for native i64 atomic lowering.
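One way to reason about derived addresses is to track the largest power of two known to divide the address. The sketch below assumes the base alignment is already known; align_after_offset is a hypothetical helper, not an existing Cranelift function.

```rust
/// Largest power-of-two alignment provable for `base + offset`, given that
/// `base` is known to be `base_align`-byte aligned. Hypothetical helper.
fn align_after_offset(base_align: u64, offset: i64) -> u64 {
    assert!(base_align.is_power_of_two(), "alignment must be a power of two");
    if offset == 0 {
        // Adding zero cannot weaken what is already known.
        return base_align;
    }
    // The lowest set bit of the offset is the largest power of two dividing it.
    let offset_align = 1u64 << offset.unsigned_abs().trailing_zeros();
    // The sum is only guaranteed to be aligned to the weaker of the two.
    base_align.min(offset_align)
}
```

An 8-byte aligned base plus 16 stays eligible for native lowering, while the same base plus 4 or 12 drops to 4-byte alignment and should take the fallback or be rejected.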
5. Add a regression test for riscv64
Once fixed, add a test that ensures the backend does not emit native 64-bit atomic CAS for misaligned addresses. Depending on your test framework, this can be a legalization test, an instruction selection test, or a runtime test under QEMU.
    test compile
    set opt_level=none
    target riscv64

    function %atomic_bad_align(i64, i64, i64) -> i64 {
    block0(v0: i64, v1: i64, v2: i64):
        ; v0 is an 8-byte aligned base; adding 4 makes the effective address
        ; misaligned for a 64-bit atomic.
        v3 = iadd_imm v0, 4
        ; Expect legalization failure or fallback, not native lr/sc emission.
        v4 = atomic_cas.i64 v3, v1, v2
        return v4
    }
If your framework supports negative expectations, assert that unsupported unaligned atomics are rejected cleanly instead of producing executable code that crashes.
6. Validate the fix in QEMU
After implementing the alignment-aware lowering, rerun the reproducer under QEMU and confirm one of these outcomes:
- Aligned cases execute successfully.
- Misaligned cases are lowered to a safe fallback.
- Unsupported cases fail at compile time with a clear diagnostic.
cargo test -p cranelift-codegen riscv64 -- --nocapture
qemu-riscv64 ./reproducer
7. Optional: improve diagnostics for future debugging
This class of issue is much easier to debug when the backend surfaces alignment assumptions explicitly. Add logging or verifier checks around atomic lowering.
debug_assert!(align >= 8, "atomic_cas.i64 on riscv64 requires 8-byte alignment");
A verifier-level message can save hours when another test later constructs an invalid atomic access path.
Common Edge Cases
- Stack slot alignment is weaker than expected: Even if the value type is i64, the stack object may not be guaranteed 8-byte alignment unless explicitly requested by the backend or ABI logic.
- Pointer arithmetic breaks alignment: A base pointer may start aligned, but adding an odd or non-8-byte offset can produce a misaligned effective address.
- Lost alignment metadata during optimization: Some IR transformations preserve type but not proven alignment facts, leading to overly aggressive atomic lowering later.
- Host vs emulator differences: A bug may remain hidden on one environment and fail immediately on QEMU, especially when the emulator faithfully traps on misaligned atomic accesses.
- Fallback path correctness: Replacing native CAS with a libcall or helper is only safe if the fallback preserves the required memory ordering semantics.
- Instruction availability assumptions: On RISC-V, backend logic must also respect whether the atomic extension is available and whether the selected sequence is valid for the configured target features.
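To illustrate the fallback-path point above, here is a minimal lock-based compare-and-swap that works at any byte offset. It is a sketch under stated assumptions, not what a compiler libcall would look like: cas_u64_locked is a hypothetical name, memory is modeled as a Vec<u8> behind a Mutex, and the result is only atomic with respect to other accesses that take the same lock.

```rust
use std::sync::Mutex;

/// Compare-and-swap on the 8 little-endian bytes starting at `offset`,
/// which may be any byte offset. Returns the value observed before the
/// operation, like atomic_cas. Hypothetical helper for illustration.
fn cas_u64_locked(mem: &Mutex<Vec<u8>>, offset: usize, expected: u64, new: u64) -> u64 {
    // Holding the lock makes the read-compare-write sequence indivisible
    // for every other caller that goes through the same mutex.
    let mut bytes = mem.lock().unwrap();
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[offset..offset + 8]);
    let old = u64::from_le_bytes(raw);
    if old == expected {
        bytes[offset..offset + 8].copy_from_slice(&new.to_le_bytes());
    }
    old
}
```

The mutex's acquire and release provide ordering between callers, but any code that touches the same bytes without the lock bypasses the guarantee, which is why a fallback has to cover every access to the location and not just the compare-and-swap.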
FAQ
Why does this fail specifically on atomic_cas.i64 but not always on smaller atomics?
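Because i64 atomics on riscv64 require 8-byte natural alignment, while i32 atomics require only 4-byte alignment. An address that is 4-byte aligned but not 8-byte aligned therefore works for i32 and traps for i64, so the same code pattern can appear correct until it is used with a 64-bit operand.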
Is this a QEMU bug or a compiler backend bug?
Most often it is a compiler backend bug or missing legalization rule. QEMU is typically exposing a real architectural constraint by trapping on a misaligned atomic memory operation.
What is the safest fix if alignment cannot be proven?
The safest fix is to avoid native lowering for that operation. Either use a well-defined fallback that preserves atomic semantics or reject the unsupported case during compilation with a clear error.
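When alignment is only known at runtime, the same policy can be expressed as a dynamic check before the native operation. The sketch below is a host-side Rust analogue rather than backend code; checked_cas_u64 and Misaligned are hypothetical names, and it assumes Rust 1.75 or later for AtomicU64::from_ptr.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Error carrying the offending address (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct Misaligned(usize);

/// Performs a sequentially consistent compare-and-swap only if `ptr` is
/// suitably aligned, and returns the previously stored value.
///
/// # Safety
/// `ptr` must be valid for reads and writes of 8 bytes, and the location
/// must not be accessed non-atomically by other threads during the call.
unsafe fn checked_cas_u64(ptr: *mut u64, expected: u64, new: u64) -> Result<u64, Misaligned> {
    let addr = ptr as usize;
    if addr % std::mem::align_of::<AtomicU64>() != 0 {
        // Refuse before touching memory instead of risking a trap.
        return Err(Misaligned(addr));
    }
    // SAFETY: alignment was checked above; validity is the caller's obligation.
    let atomic = unsafe { AtomicU64::from_ptr(ptr) };
    Ok(
        match atomic.compare_exchange(expected, new, Ordering::SeqCst, Ordering::SeqCst) {
            // Both arms carry the value that was in memory before the call.
            Ok(prev) | Err(prev) => prev,
        },
    )
}
```

A compiler can emit the equivalent test ahead of the lr.d/sc.d loop and branch to a trap or helper, at the cost of one extra compare on every atomic access.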
To fully resolve the issue, treat atomic_cas.i64 on riscv64 as conditionally legal rather than universally legal. Once the backend requires verified 8-byte alignment before emitting native atomic instructions, the QEMU bus error disappears and the generated code becomes architecture-correct.