How to Fix: Cranelift: Register allocation error for `fcmp` + `brz` on riscv64
Cranelift riscv64 Register Allocation Failure with fcmp Followed by brz: Root Cause and Fix
This crash happens because the IR produces a boolean-like result from fcmp and immediately feeds it into brz, and on riscv64 that value can end up with a register class or constraint combination the register allocator cannot satisfy. The failure is not in floating-point comparison semantics as such; it sits at the boundary between SSA value typing, branch lowering, and target-specific register classes.
Reproducing the Problem
A minimal failing .clif test looks like this:
test compile
target riscv64
function u1:0() {
block0:
v0 = f64const 0.0
v1 = fcmp eq v0, v0
brz v1, block1
jump block1
block1:
return
}
The problematic sequence is:
v1 = fcmp eq v0, v0
brz v1, block1
On some backends this is harmless. On riscv64, however, the compare result and branch expectation may not agree on where that value must live.
Understanding the Root Cause
The core issue is a register-class mismatch during lowering and allocation.
In Cranelift IR, fcmp returns a scalar condition value, often modeled as a boolean integer-like SSA value. The backend still has to decide how to materialize that result on the target machine. On RISC-V, the floating-point comparison instructions (feq, flt, fle) write 0 or 1 into an integer register, and conditional branches compare integer registers; there are no floating-point branch operands. If the lowering path leaves the compare result in a form that implies one register class while brz requires another, the register allocator is handed constraints it cannot satisfy.
More concretely:
- fcmp starts from floating-point operands.
- The result is conceptually boolean, but boolean values in Cranelift still need a legal machine representation.
- brz expects a branchable zero-test operand in the proper GPR form.
- If the compare result is not normalized into the expected integer register class before branching, regalloc can fail.
This is why the issue appears specifically with the combination of fcmp + brz, not necessarily with either instruction in isolation.
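The mismatch described above can be illustrated with a small toy model. This is a sketch only: `RegClass`, `Use`, `check`, and `feq_d` are hypothetical names invented for illustration and are not Cranelift or regalloc types. It shows that a RISC-V style float equality compare lands its 0/1 result in an integer register, and that a value defined in one class but consumed in another cannot be satisfied by a single register.

```rust
/// Register classes on a RISC-V-like target (toy model, not Cranelift's types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegClass {
    Int,   // integer registers (x0..x31)
    Float, // floating-point registers (f0..f31)
}

/// Toy model of `feq.d rd, rs1, rs2`: reads two float registers and
/// writes 1 or 0 into an *integer* destination register.
pub fn feq_d(rs1: f64, rs2: f64) -> u64 {
    // IEEE 754 equality: any NaN operand makes the result 0.
    (rs1 == rs2) as u64
}

/// The class a value is defined in versus the class its consumer requires.
#[derive(Debug, Clone, Copy)]
pub struct Use {
    pub def_class: RegClass,
    pub use_class: RegClass,
}

/// Hypothetical check: one virtual register cannot live in two classes at once.
pub fn check(u: Use) -> Result<(), String> {
    if u.def_class == u.use_class {
        Ok(())
    } else {
        Err(format!(
            "value defined in {:?} but used as {:?}",
            u.def_class, u.use_class
        ))
    }
}
```

In this model, a compare result tagged `Float` that feeds a branch requiring `Int` makes `check` return an error, which is the shape of failure the allocator reports; tagging the result `Int` at definition time makes the same use legal.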
Another subtle point is that some architectures branch directly on floating-point condition flags, or have lowering paths that hide this conversion step. riscv64 has no condition flags: the compare result must already sit in an integer register before a zero/non-zero branch consumes it.
Step-by-Step Solution
The safest fix is to ensure the fcmp result is converted or lowered into a legal integer value before feeding it to brz, or to lower the whole compare-and-branch sequence as a single target-specific branch pattern.
1. Confirm the failing pattern
Search for IR or legalization output containing this exact shape:
vX = fcmp ...
brz vX, blockY
If you are working inside Cranelift backend code, inspect the legalization and ISLE lowering path for floating-point comparisons on riscv64.
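If you only have textual IR dumps, a quick scan can locate the pattern. The following is a minimal sketch, assuming one instruction per line in the textual form shown above; `find_fcmp_brz` is a hypothetical helper, not part of any Cranelift tool, and it does plain string matching rather than parsing the IR.

```rust
/// Returns the 0-based indices of lines where a `brz` consumes a value
/// that was defined by an `fcmp` earlier in the same text.
pub fn find_fcmp_brz(clif: &str) -> Vec<usize> {
    let mut fcmp_results: Vec<String> = Vec::new();
    let mut hits = Vec::new();
    for (idx, raw) in clif.lines().enumerate() {
        let line = raw.trim();
        // Definition of the form `vN = fcmp ...`: remember the result name.
        if let Some((lhs, rhs)) = line.split_once('=') {
            if rhs.trim_start().starts_with("fcmp ") {
                fcmp_results.push(lhs.trim().to_string());
            }
            continue;
        }
        // Use of the form `brz vN, blockM`: check the first operand.
        if let Some(rest) = line.strip_prefix("brz ") {
            let operand = rest.split(',').next().unwrap_or("").trim();
            if fcmp_results.iter().any(|v| v.as_str() == operand) {
                hits.push(idx);
            }
        }
    }
    hits
}
```

Calling `find_fcmp_brz` on the failing function body reports the line holding `brz v1, block1`; a version where the branch consumes a converted value instead reports nothing.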
2. Rewrite the IR pattern to force an integer branch operand
If you control the IR generation, transform the boolean result into a known integer-compatible form before branching.
test compile
target riscv64
function u1:0() {
block0:
v0 = f64const 0.0
v1 = fcmp eq v0, v0
v2 = bint.i8 v1
brz v2, block1
jump block1
block1:
return
}
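The exact conversion opcode depends on the Cranelift version in your pipeline. bint applies to versions that still have boolean types; later versions removed the boolean types and bint, have fcmp return an integer directly, and replaced brz/brnz with brif. The idea is the same either way: materialize the condition as an integer value before using integer branch instructions.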
3. Prefer target-aware compare+branch lowering in the backend
If you are fixing Cranelift itself, the more robust solution is in backend lowering rather than test-level rewriting. Match:
fcmp cond, a, b
brz result, block
into a legal sequence that produces a GPR boolean or directly emits the right branch structure.
Pseudocode for a backend-side strategy:
// Pseudocode
cmp = lower_fcmp_to_int_result(a, b) // ensure GPR-compatible result
branch_if_zero(cmp, block1)
Or fold it further:
// Better target-specific lowering (brz branches when the compare is false)
if !(a == b) goto block1
goto fallthrough
This avoids leaving an ambiguous temporary value for regalloc to repair later.
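To check that a folded lowering keeps the same control flow as the two-step form, the two strategies can be modeled side by side. This is a sketch under stated assumptions: `Target`, `two_step`, and `folded` are hypothetical names, and the functions model only which successor is chosen, not real instruction emission. Recall that brz takes its branch when the compare result is zero.

```rust
/// Which successor the compare-and-branch sequence selects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Target {
    Block1,      // destination of `brz`
    Fallthrough, // the instruction after `brz`
}

/// Two-step form: materialize 0/1 in an integer register, then branch on zero.
pub fn two_step(a: f64, b: f64) -> Target {
    // Models a float equality compare writing its result to a GPR.
    let cmp: u64 = (a == b) as u64;
    // Models an integer branch-if-zero on that GPR.
    if cmp == 0 {
        Target::Block1
    } else {
        Target::Fallthrough
    }
}

/// Folded form: branch directly on the negated comparison.
pub fn folded(a: f64, b: f64) -> Target {
    if !(a == b) {
        Target::Block1
    } else {
        Target::Fallthrough
    }
}
```

Both forms pick `Block1` for unequal or NaN operands and `Fallthrough` for equal ones, so the folded version is a drop-in replacement in this model.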
4. Update legalization or ISLE rules
If the backend uses ISLE or legalization passes, add a rule ensuring the result of floating-point comparison is legalized into an integer register class when consumed by brz/brnz.
; Conceptual lowering rule; rv_fcmp_to_gpr_eq and rv_brz are hypothetical helpers
(rule (lower (brz (fcmp eq x y) block))
(let ((tmp Reg (rv_fcmp_to_gpr_eq x y)))
(rv_brz tmp block)))
The exact syntax will differ, but the requirement is the same: do not let a branch consume a compare result whose register class is still backend-ambiguous.
5. Add a regression test
Once fixed, keep a dedicated regression test so the allocator failure does not return later.
test compile
target riscv64
function %fcmp_brz_regalloc_guard() {
block0:
v0 = f64const 0.0
v1 = fcmp eq v0, v0
v2 = bint.i8 v1
brz v2, block1
jump block2
block1:
return
block2:
return
}
This test should compile successfully, and it specifically covers the floating-point compare-to-branch path on riscv64.
6. Validate with the compiler test suite
After patching, run the relevant backend and regalloc tests, including any targeted Cranelift compile tests for riscv64. If your local workflow includes broader Wasmtime verification, run those too to catch related lowering issues.
Common Edge Cases
- NaN behavior: fcmp eq is sensitive to IEEE floating-point semantics. Rewriting compare logic must preserve unordered behavior correctly.
- Different branch forms: brnz may hit the same allocator bug if it consumes the same kind of compare result.
- Other float widths: f32 and f64 may share lowering infrastructure, so test both.
- Condition reuse: If the result of fcmp is used both in arithmetic/select logic and branching, your legalization must keep all uses legal, not only the branch use.
- Block parameter interactions: More complex CFGs can expose additional allocator pressure when the compare result crosses blocks.
- Optimization pass rewrites: A fix in one phase can be undone if a later pass reconstructs the illegal fcmp -> brz pattern.
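The NaN point is the easiest of these to get wrong when folding a compare into a branch. The sketch below uses hypothetical function names to contrast the behavior of brz on a less-than compare with a tempting but incorrect rewrite that swaps the condition to its ordered opposite.

```rust
/// `brz` on `fcmp lt a, b` takes the branch when the comparison is false,
/// which includes the unordered (NaN) case.
pub fn brz_taken_lt(a: f64, b: f64) -> bool {
    !(a < b)
}

/// Incorrect rewrite: replacing "not less-than" with ordered
/// "greater-or-equal" silently drops the unordered case.
pub fn wrong_rewrite_ge(a: f64, b: f64) -> bool {
    a >= b
}
```

For ordered inputs the two functions agree, but with a NaN operand `brz_taken_lt` is true while `wrong_rewrite_ge` is false. A correct inverse of an ordered condition has to be the matching unordered-or variant of the opposite condition.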
FAQ
Why does this fail on riscv64 but not always on other architectures?
Because riscv64 has no condition-flags register: floating-point compares write their result into an integer register, and branches test integer registers. Other backends may branch on flags set by the compare, or may already normalize the result into a legal integer register.
Is the bug in fcmp or in brz?
Usually neither instruction is independently wrong. The bug is in the interface between compare lowering, value legalization, and register allocation. The compare result is not being made legal for the branch consumer on this target.
What is the best long-term fix in Cranelift?
The best long-term fix is a backend-level legalization or lowering rule that guarantees any floating-point comparison result used by brz/brnz becomes a valid GPR-backed boolean, or that the compare and branch are lowered together as one legal target-aware sequence.
In short, the fix is to stop treating the result of fcmp as an abstract boolean when riscv64 branch lowering needs a concrete integer-register value. Once that contract is enforced, the register allocation error disappears and the test becomes stable.