How to Fix: Cranelift: `smin.i128` wrong result on riscv64


Cranelift on riscv64: why smin.i128 returns the wrong result when a v0 argument is present

This bug has the shape of an ABI and calling-convention mismatch: with a leading vector argument in the signature, the generated riscv64 code mishandles the 128-bit scalar comparison, so smin.i128 reads or reconstructs the wrong value and returns an incorrect minimum. The test passes when the v0 argument is removed because argument placement changes and the broken path is no longer exercised.

Reproducing the failure

The issue appears in a Cranelift .clif test targeting riscv64, where smin.i128 returns the wrong result only when a v0 argument is present in the function signature. Conceptually, the failing pattern looks like this:

test interpret
test run
target riscv64

function %f(i8x16, i128, i128) -> i128 {
block0(v0: i8x16, v1: i128, v2: i128):
    v3 = smin v1, v2
    return v3
}

If you remove v0 from the parameter list, the same logic passes. That strongly suggests the failure is not in the high-level meaning of smin, but in lowering, register assignment, or argument reconstruction for mixed-type parameters on riscv64.

Understanding the Root Cause

On RISC-V 64-bit, a value of type i128 is not a native single-register scalar. It must be represented as two 64-bit halves. That means any operation such as signed minimum must first compare the high halves as signed values, then compare the low halves if the high halves are equal, and finally select the correct 128-bit pair.
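The split into halves can be sketched in plain Rust, which has a native i128 to check against. This is only an illustration of the representation, not Cranelift code; the helper names split_i128 and join_i128 are made up for this example.

```rust
/// Split a signed 128-bit value into (lo, hi) 64-bit halves.
/// The low half holds raw bits; the high half carries the sign.
fn split_i128(x: i128) -> (u64, i64) {
    let lo = x as u64; // truncates to the low 64 bits
    let hi = (x >> 64) as i64; // arithmetic shift keeps the sign
    (lo, hi)
}

/// Reassemble the pair in the same (lo, hi) order it was split.
fn join_i128(lo: u64, hi: i64) -> i128 {
    ((hi as i128) << 64) | (lo as i128)
}

fn main() {
    let x: i128 = -2;
    let (lo, hi) = split_i128(x);
    println!("lo = {lo:#018x}, hi = {hi}");
    // A round trip must give back the original value.
    assert_eq!(join_i128(lo, hi), x);
}
```

Note that the low half of a negative value is not itself negative in any meaningful sense: all of the sign information lives in the high half.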

The bug is typically caused by one of these backend mistakes:

  • Incorrect argument slot assignment when a vector argument appears before integer arguments.
  • Wrong register class interaction between vector-capable arguments and split scalar arguments.
  • Bad reconstruction of i128 from two machine-level pieces after ABI lowering.
  • Selection logic using the wrong halves because virtual registers are mapped incorrectly after the presence of v0.

Why does removing v0 make the test pass? Because the function signature changes the machine-level calling layout. The backend may place the two halves of each i128 into a different register/stack sequence, avoiding the broken path. In other words, the bug is signature-sensitive, not operation-sensitive.

At a lower level, smin.i128 on a target without native 128-bit scalar min usually lowers into logic like:

if a_hi < b_hi: return a
if a_hi > b_hi: return b
if a_lo < b_lo (unsigned compare when highs equal): return a
return b

If a_hi, a_lo, b_hi, or b_lo are sourced from the wrong physical location because of ABI lowering around the vector argument, the result is deterministic but wrong.
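As a reference model, the logic above can be written over explicit halves and compared with Rust's built-in i128::min. This is a sketch of the technique under the assumption of a (lo, hi) pair layout; the Halves type and the function names are hypothetical and do not come from the Cranelift sources.

```rust
/// A 128-bit value as two machine-level halves (illustrative type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Halves {
    lo: u64,
    hi: i64,
}

fn split(x: i128) -> Halves {
    Halves { lo: x as u64, hi: (x >> 64) as i64 }
}

fn join(h: Halves) -> i128 {
    ((h.hi as i128) << 64) | (h.lo as i128)
}

/// Signed minimum: signed compare on the high halves,
/// unsigned compare on the low halves when the highs are equal.
fn smin_split(a: Halves, b: Halves) -> Halves {
    if a.hi < b.hi {
        return a;
    }
    if a.hi > b.hi {
        return b;
    }
    if a.lo < b.lo { a } else { b }
}

fn main() {
    let cases: [(i128, i128); 4] = [
        (-1, 1),
        (i128::MIN, i128::MAX),
        (1i128 << 64, (1i128 << 64) + 1),
        (-5, -6),
    ];
    for (a, b) in cases {
        assert_eq!(join(smin_split(split(a), split(b))), a.min(b));
    }
    println!("split smin agrees with i128::min on all cases");
}
```

A model like this is handy when triaging: if the halves a backend actually reads are fed into it and the result still matches the wrong native output, the fault is in where the halves came from, not in the compare sequence.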

Step-by-Step Solution

The fix is to ensure that riscv64 lowering and ABI argument handling correctly preserve split i128 values even when vector arguments are present earlier in the signature.

1. Add a minimal regression test

Start by preserving a compact reproducer in the Cranelift test suite.

test interpret
test run
target riscv64

function %smin_i128_with_v0(i8x16, i128, i128) -> i128 {
block0(v0: i8x16, v1: i128, v2: i128):
    v3 = smin v1, v2
    return v3
}

Also add the control case without the vector argument:

test interpret
test run
target riscv64

function %smin_i128_without_v0(i128, i128) -> i128 {
block0(v1: i128, v2: i128):
    v3 = smin v1, v2
    return v3
}

This gives you one test that fails today and one that demonstrates the layout dependency.

2. Inspect ABI lowering for mixed vector and i128 arguments

Review the riscv64 ISA backend code responsible for assigning argument locations. You want to verify that:

  • the vector argument consumes the correct location according to Cranelift’s ABI model,
  • the following i128 values are split into exactly two 64-bit pieces,
  • piece ordering is consistent, and
  • reassembly uses the same order expected by compare/select lowering.

In practice, check logic around argument legalization and lowering for signatures like:

(vector, i128, i128) -> i128

If your backend has debug output for legalized signatures or assigned argument locations, enable it and compare the failing and passing signatures.

3. Verify piece ordering of i128 values

A common source of this bug is swapped high/low halves. Ensure the backend consistently treats the 128-bit value as:

x = { lo: x_lo, hi: x_hi }
y = { lo: y_lo, hi: y_hi }

Then signed minimum lowering should conceptually behave like:

if x_hi < y_hi:
    return x
elif x_hi > y_hi:
    return y
elif x_lo < y_lo:
    return x
else:
    return y

Be careful: when the high halves are equal, the final low-half comparison must be an unsigned comparison over the low 64 bits, because the sign has already been decided by the high half. Treating the low half as signed gives the wrong answer whenever its top bit is set.
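The low-half pitfall is easy to demonstrate in a few lines of Rust. The sketch below places a correct split comparison next to a deliberately wrong variant that compares the low halves as signed; both function names are invented for this illustration.

```rust
fn halves(x: i128) -> (u64, i64) {
    (x as u64, (x >> 64) as i64)
}

/// Correct: high halves compared as signed, low halves as unsigned.
fn smin_ok(a: i128, b: i128) -> i128 {
    let ((a_lo, a_hi), (b_lo, b_hi)) = (halves(a), halves(b));
    if a_hi != b_hi {
        return if a_hi < b_hi { a } else { b };
    }
    if a_lo < b_lo { a } else { b }
}

/// Deliberately wrong: low halves reinterpreted as signed.
fn smin_signed_low(a: i128, b: i128) -> i128 {
    let ((a_lo, a_hi), (b_lo, b_hi)) = (halves(a), halves(b));
    if a_hi != b_hi {
        return if a_hi < b_hi { a } else { b };
    }
    if (a_lo as i64) < (b_lo as i64) { a } else { b }
}

fn main() {
    // hi = 0 for both; a's low half has every bit set.
    let a: i128 = u64::MAX as i128;
    let b: i128 = 1;
    assert_eq!(smin_ok(a, b), 1);
    // The signed variant sees a_lo as -1 and picks the larger value.
    assert_eq!(smin_signed_low(a, b), a);
}
```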

4. Fix register/stack mapping for split arguments

If the issue comes from assigned locations after a vector argument, update the argument placement logic so the two halves of each i128 remain adjacent and correctly typed through lowering.

The implementation work usually falls into one of these patterns:

  • adjusting signature legalization for i128 after vector arguments,
  • correcting how ABI params are expanded into machine-level values,
  • fixing value-list to register mapping during instruction selection,
  • or repairing the compare/select sequence to read the legalized halves in the right order.

A conceptual before/after can look like this:

// Wrong conceptual mapping after v0:
x_hi = arg1
x_lo = arg2
y_hi = arg3
y_lo = arg4

// Correct conceptual mapping:
x_lo = arg1
x_hi = arg2
y_lo = arg3
y_hi = arg4

The exact mapping depends on Cranelift’s internal ABI conventions, but the key is consistency across legalization, lowering, and final codegen.
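To see why an inconsistent mapping yields a deterministic but wrong answer, here is a toy model in Rust. It assumes, purely for illustration, that the two i128 arguments arrive as four 64-bit slots in the order [x_lo, x_hi, y_lo, y_hi] and that the caller always reads the result as (lo, hi). The flatten and smin_slots helpers are hypothetical and are not Cranelift APIs.

```rust
/// Flatten two i128 arguments into four 64-bit slots:
/// [x_lo, x_hi, y_lo, y_hi].
fn flatten(x: i128, y: i128) -> [u64; 4] {
    [x as u64, (x >> 64) as u64, y as u64, (y >> 64) as u64]
}

/// Signed minimum over the slots. `lo_first` is the layout the
/// reader assumes for each pair: (lo, hi) if true, (hi, lo) if false.
fn smin_slots(slots: [u64; 4], lo_first: bool) -> i128 {
    let read = |i: usize| -> (u64, i64) {
        if lo_first {
            (slots[i], slots[i + 1] as i64)
        } else {
            (slots[i + 1], slots[i] as i64)
        }
    };
    let (x_lo, x_hi) = read(0);
    let (y_lo, y_hi) = read(2);
    let pick_x = x_hi < y_hi || (x_hi == y_hi && x_lo < y_lo);
    let base = if pick_x { 0 } else { 2 };
    // The caller always interprets the selected pair as (lo, hi).
    ((slots[base + 1] as i64 as i128) << 64) | (slots[base] as i128)
}

fn main() {
    let (x, y) = (1i128, 1i128 << 64);
    let slots = flatten(x, y);
    assert_eq!(smin_slots(slots, true), 1); // consistent layout
    assert_eq!(smin_slots(slots, false), 1i128 << 64); // swapped reader
}
```

With the swapped reader, the comparison sees x as 2^64 and y as 1, selects y, and the caller then receives y's real value, 2^64, instead of the true minimum of 1.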

5. Re-run interpreter and native tests

Once fixed, validate both semantic and backend behavior:

cargo test -p cranelift-codegen
cargo test -p cranelift-filetests
cargo test smin_i128 --workspace

If your environment supports targeted filetests, run the specific riscv64 test file that contains the reproducer.

6. Add broader regression coverage

Do not stop at smin.i128. Add neighboring tests for operations that use the same split-value machinery:

umax.i128
umin.i128
smax.i128
icmp.i128
select over i128 values

And vary the signature shape:

(v0, i128, i128)
(i128, v0, i128)
(i128, i128, v0)
(v0, v1, i128, i128)

This helps catch any broader calling convention bug instead of only the single visible symptom.
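Writing every shape by hand gets tedious, so a small generator can help. The following Rust sketch prints one CLIF function per parameter order; emit_case is a hypothetical helper, the shapes assume i8x16 as the vector type, and the output still needs the usual test and target header lines, plus any run directives, added by hand.

```rust
/// Emit one CLIF function for a given parameter-type order.
/// Assumes `params` contains at least two "i128" entries; smin is
/// applied to the first two of them.
fn emit_case(name: &str, params: &[&str]) -> String {
    let sig = params.join(", ");
    let block_params: Vec<String> = params
        .iter()
        .enumerate()
        .map(|(i, ty)| format!("v{i}: {ty}"))
        .collect();
    // Indices of the i128 parameters, in declaration order.
    let wide: Vec<usize> = params
        .iter()
        .enumerate()
        .filter(|(_, ty)| **ty == "i128")
        .map(|(i, _)| i)
        .collect();
    let result = params.len(); // next unused value number
    format!(
        "function %{name}({sig}) -> i128 {{\nblock0({bp}):\n    v{result} = smin v{a}, v{b}\n    return v{result}\n}}\n",
        bp = block_params.join(", "),
        a = wide[0],
        b = wide[1],
    )
}

fn main() {
    let shapes: [&[&str]; 4] = [
        &["i8x16", "i128", "i128"],
        &["i128", "i8x16", "i128"],
        &["i128", "i128", "i8x16"],
        &["i8x16", "i8x16", "i128", "i128"],
    ];
    for (n, shape) in shapes.iter().enumerate() {
        println!("{}", emit_case(&format!("smin_shape_{n}"), shape));
    }
}
```

The same generator can be pointed at umin, smax, and the other operations listed above by swapping the opcode in the format string.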

Common Edge Cases

  • Negative 128-bit inputs: signed comparisons fail most visibly when the high 64-bit half carries the sign bit.
  • Equal high halves: if high halves match, the implementation must compare low halves correctly; many split-compare bugs hide here.
  • Stack-passed arguments: once enough arguments are present, some values may spill to the stack, exposing a second ABI bug.
  • Multiple vector arguments: adding more vector params can shift subsequent scalar locations and reveal more incorrect mappings.
  • Return-value reconstruction: even if compare logic is fixed, returning the chosen i128 can still be broken if the output halves are emitted in the wrong order.
  • Interpreter vs native mismatch: if test interpret passes and test run fails on the same input, the problem most likely lies in backend codegen rather than in IR semantics.

FAQ

Why does the bug only happen when v0 is present?

Because v0 changes how the backend assigns machine-level argument locations. That shifts or reclassifies the following i128 arguments, exposing a bug in ABI lowering or split-value reconstruction.

Is smin.i128 itself broken on riscv64?

Usually not in the abstract IR sense. The issue is more likely in how Cranelift lowers i128 values to machine-level pieces on riscv64, especially in mixed signatures with vector parameters.

What is the safest long-term fix?

Add a backend fix plus regression tests covering mixed vector and i128 signatures, multiple parameter orders, and related 128-bit compare/select operations. That prevents future changes from reintroducing the same class of bug.

The practical takeaway is simple: this is a riscv64 backend correctness issue triggered by mixed argument types. Fix the legalization or ABI mapping for split i128 values in the presence of v0, then lock it down with targeted filetests.
