How to Fix: Cranelift: RISC-V wrong result for `fcvt_to_uint_sat.{i8,i16}`
This bug is a classic code generation mismatch: the CLIF operation promises a saturating float-to-unsigned-int conversion, but the generated RISC-V sequence can produce the wrong value for narrow integer targets like i8 and i16. The failure usually appears when converting negative floats, NaNs, or values larger than the destination range, where saturating semantics must clamp to the legal unsigned range instead of leaking sign-extended or incorrectly narrowed results.
Problem overview
The issue centers on Cranelift lowering for fcvt_to_uint_sat.i8 and fcvt_to_uint_sat.i16 on riscv64gc. A CLIF test like the following should clamp results into the unsigned destination range:
test interpret
test run
target riscv64gc
function %a_f32(f32) -> i16 {
block0(v0: f32):
v1 = fcvt_to_uint_sat.i16 v0
return v1
}
Expected semantics for fcvt_to_uint_sat.i16 are effectively:
if input is NaN: return 0
if input <= 0: return 0
if input >= 65535: return 65535
otherwise: return trunc(input)
For i8, the same logic applies with the range 0..255. If the backend uses a wider conversion instruction and then narrows incorrectly, the final result may no longer match those semantics.
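The semantics above can be written down as a small reference model. The sketch below is an illustration in Rust, not Cranelift code, and `sat_u16_reference` and `sat_u8_reference` are hypothetical names. Rust's own float-to-integer `as` cast is saturating and maps NaN to 0, so it can serve as an independent oracle for the same behavior.

```rust
/// Reference model for `fcvt_to_uint_sat.i16` on an f32 input.
/// Hypothetical helper for illustration; not part of Cranelift.
fn sat_u16_reference(x: f32) -> u16 {
    if x.is_nan() || x <= 0.0 {
        return 0; // NaN and everything at or below zero clamp to 0
    }
    if x >= 65535.0 {
        return u16::MAX; // everything at or above the maximum clamps to 65535
    }
    // In range: truncate toward zero. The value is known to fit in 16 bits.
    x.trunc() as u32 as u16
}

/// Same model for `fcvt_to_uint_sat.i8`, with the range 0..=255.
fn sat_u8_reference(x: f32) -> u8 {
    if x.is_nan() || x <= 0.0 {
        return 0;
    }
    if x >= 255.0 {
        return u8::MAX;
    }
    x.trunc() as u32 as u8
}
```

For any input `x`, `sat_u16_reference(x)` should agree with `x as u16`, which makes the cast a convenient cross-check when writing expected values by hand.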
Understanding the Root Cause
The core problem is that RISC-V floating-point conversion instructions do not directly implement CLIF’s saturating unsigned conversion semantics for narrow destination widths. On a 64-bit target, lowering often starts from a wider conversion (for example, into a 32-bit or 64-bit integer register) and then masks, narrows, or extends the result. The narrowing step is where things go wrong.
There are two technical traps:
- Unsigned saturation is not the same as convert-then-truncate. If the backend converts a float to a larger integer and then simply takes the low 8 or 16 bits, out-of-range values wrap instead of clamping. For example, 70000.0 should saturate to 65535 for an i16 destination, but converting it to the wide integer 70000 and keeping the low 16 bits yields 4464.
- Sign and extension behavior can corrupt narrow results. In Cranelift IR, a result typed as i8 or i16 still lives in machine registers that may be 64 bits wide. If lowering uses an instruction path that produces a signed interpretation, or if the narrowing step leaves the register in a sign-extended state, values near the unsigned maximum can appear negative or otherwise incorrect when observed through later instructions.
More concretely, the buggy lowering usually looks like this conceptually:
wide = convert_float_to_int_somehow(input)
narrow = low_bits(wide)
return narrow
That sequence is insufficient because saturation must happen before narrowing semantics are finalized. The backend must explicitly clamp to 0..255 or 0..65535, not rely on hardware behavior that only guarantees part of the required semantics.
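The difference between the two orderings can be seen in a small model. This is a sketch with hypothetical function names, where Rust's `as u64` cast stands in for a wide hardware conversion; it is not the code Cranelift emits.

```rust
/// Buggy pattern: convert to a wide integer, then keep the low 16 bits.
/// The `as u64` cast stands in for a wide hardware conversion.
fn convert_then_truncate_u16(x: f32) -> u16 {
    let wide = x as u64;
    (wide & 0xffff) as u16 // wraps for any wide value above 65535
}

/// Correct pattern: clamp in the float domain first, then convert.
fn clamp_then_convert_u16(x: f32) -> u16 {
    if x.is_nan() || x <= 0.0 {
        return 0;
    }
    if x >= 65535.0 {
        return 0xffff;
    }
    // The value is now known to be in range, so the mask cannot wrap.
    ((x as u64) & 0xffff) as u16
}
```

With these definitions, `convert_then_truncate_u16(70000.0)` yields 4464 and `convert_then_truncate_u16(65536.0)` yields 0, while `clamp_then_convert_u16` returns 65535 for both.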
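Another subtle point is the handling of NaN and infinities. CLIF’s saturating conversion requires deterministic clamping: NaN becomes 0, -inf becomes 0, and +inf becomes the destination maximum. Hardware conversions may set exception flags or return architecture-specific values for these inputs. On RISC-V, for instance, the fcvt instructions saturate out-of-range inputs to the limits of the wide destination but return the maximum value for NaN, whereas CLIF requires 0. If the lowering path assumes the hardware conversion alone is enough, these edge inputs will fail.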
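To make the NaN mismatch concrete, the sketch below models a wide RISC-V conversion in Rust. It assumes the behavior described in the RISC-V F extension for `fcvt.lu.s` (NaN and too-large inputs produce 2^64-1, negative inputs produce 0); `fcvt_lu_s_model` and `naive_narrow_u16` are hypothetical names, and exception flags are ignored.

```rust
/// Simplified model of a RISC-V `fcvt.lu.s` (f32 -> u64, rounding toward zero).
/// Assumption: NaN and too-large inputs yield 2^64-1, negative inputs yield 0.
fn fcvt_lu_s_model(x: f32) -> u64 {
    if x.is_nan() {
        u64::MAX // hardware-style NaN result, unlike CLIF's required 0
    } else {
        x as u64 // Rust's cast saturates: negatives -> 0, too large -> u64::MAX
    }
}

/// Naive lowering: wide conversion followed by a 16-bit mask.
fn naive_narrow_u16(x: f32) -> u16 {
    (fcvt_lu_s_model(x) & 0xffff) as u16
}
```

Under this model, `naive_narrow_u16(f32::NAN)` is 65535 rather than 0, and +inf happens to give 65535 only because the all-ones pattern survives the mask; a finite overflow such as 70000.0 still wraps.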
Step-by-Step Solution
The safest fix is to lower fcvt_to_uint_sat.i8 and fcvt_to_uint_sat.i16 through an explicit clamp sequence in the RISC-V backend instead of depending on a direct narrow conversion path.
1. Confirm the failing behavior with a targeted CLIF test
Add or extend tests that exercise negative values, NaN, upper-bound overflow, and boundary-adjacent values.
test interpret
test run
target riscv64gc
function %u16_cases(f32) -> i16 {
block0(v0: f32):
v1 = fcvt_to_uint_sat.i16 v0
return v1
}
function %u8_cases(f32) -> i8 {
block0(v0: f32):
v1 = fcvt_to_uint_sat.i8 v0
return v1
}
Make sure your run directives cover inputs like:
-1.0
0.0
1.0
254.9
255.0
255.9
65534.9
65535.0
65535.9
nan
inf
-inf
2. Inspect the existing lowering path
In the RISC-V backend, locate the lowering or instruction selection for fcvt_to_uint_sat. Look for logic that:
- reuses a wider u32 or u64 conversion, then narrows;
- uses sign-extending paths for narrow integer results;
- omits explicit min/max clamping for i8 and i16.
If the implementation depends on a generic helper shared with non-saturating conversions, that is a strong sign the backend is skipping required saturation logic.
3. Lower the operation as explicit saturating logic
Implement the semantics as:
if isnan(x): return 0
if x <= 0.0: return 0
if x >= MAX: return MAX
return trunc_toward_zero(x)
Where MAX is:
255.0 for i8
65535.0 for i16
In pseudo-lowering form:
tmp0 = fcmp_unordered(x, x) ; NaN check
if tmp0 goto ret_zero
tmp1 = fcmp_le(x, 0.0)
if tmp1 goto ret_zero
tmp2 = fcmp_ge(x, MAX_FLOAT)
if tmp2 goto ret_max
wide = fcvt_to_uint(x) ; use a safe wider conversion only after range checks
narrow = uextend_or_mask_to_width(wide)
return narrow
ret_zero:
return 0
ret_max:
return MAX_INT
The key idea is simple: perform range checks before integer narrowing. Once the float is known to be within the legal range, a wider integer conversion followed by a clean low-bit extraction is correct.
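The same ordering can also be expressed without branches by clamping in the float domain. This is a sketch of an alternative formulation, not the lowering Cranelift uses; `sat_narrow_branch_free` is a hypothetical name. It relies on Rust's `f32::max` returning the non-NaN operand when one input is NaN; whether a backend's float min/max instructions behave the same way should be checked against the ISA manual.

```rust
/// Branch-free variant: clamp in the float domain, then convert.
/// `f32::max` returns the non-NaN operand when one input is NaN,
/// so `x.max(0.0)` maps NaN to 0.0 without a separate check.
fn sat_narrow_branch_free(x: f32, bits: u32) -> u64 {
    assert!(bits == 8 || bits == 16, "only narrow widths are modeled");
    let max_int: u64 = (1u64 << bits) - 1;
    // 255.0 and 65535.0 are both exactly representable in f32.
    let max_float = max_int as f32;
    let clamped = x.max(0.0).min(max_float);
    // `clamped` is in 0.0..=max_float, so the cast only truncates toward zero.
    clamped as u64
}
```

For example, `sat_narrow_branch_free(300.0, 8)` gives 255 and `sat_narrow_branch_free(f32::NAN, 16)` gives 0. Note that `f32::clamp` is not a substitute here, since it propagates NaN.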
4. Be careful with the final register form
After conversion, ensure the result is represented consistently as an unsigned narrow value in the machine register pipeline. Depending on the backend conventions, that may mean using a mask instead of a sign-sensitive narrowing instruction.
; for i8
result = wide & 0xff
; for i16
result = wide & 0xffff
This avoids accidental sign extension when the value is carried in a larger register.
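A short model shows why the mask matters. The two hypothetical helpers below treat an `i64` as the 64-bit machine register and compare a sign-extending narrow with a masking narrow; they illustrate the register contents only, not any specific RISC-V instruction.

```rust
/// Register contents after keeping the low 8 bits and sign-extending.
fn narrow_sign_extended_u8(wide: u64) -> i64 {
    (wide as u8 as i8) as i64
}

/// Register contents after masking to the low 8 bits (zero-extended).
fn narrow_masked_u8(wide: u64) -> i64 {
    (wide & 0xff) as i64
}
```

For a wide value of 200, the sign-extended form holds -56 while the masked form holds 200. Values below 128 look identical in both forms, which is why the problem tends to show up only near the top of the unsigned range.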
5. Add backend tests for both correctness and regression coverage
Do not stop at one CLIF reproducer. Add tests for both widths and multiple input classes.
; expected examples for u8 semantics
; -1.0 -> 0
; 0.0 -> 0
; 1.9 -> 1
; 255.0 -> 255
; 300.0 -> 255
; NaN -> 0
; expected examples for u16 semantics
; -1.0 -> 0
; 42.7 -> 42
; 65535.0 -> 65535
; 70000.0 -> 65535
; NaN -> 0
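Before encoding these expectations as run directives, it can help to check the table itself against an independent oracle. The sketch below does that in Rust, using the saturating `as` cast as the oracle; the constant and function names are hypothetical.

```rust
/// Expected (input, output) pairs for the u8 cases listed above.
const U8_CASES: [(f32, u8); 6] = [
    (-1.0, 0),
    (0.0, 0),
    (1.9, 1),
    (255.0, 255),
    (300.0, 255),
    (f32::NAN, 0),
];

/// Expected (input, output) pairs for the u16 cases listed above.
const U16_CASES: [(f32, u16); 5] = [
    (-1.0, 0),
    (42.7, 42),
    (65535.0, 65535),
    (70000.0, 65535),
    (f32::NAN, 0),
];

/// Returns true when every table entry matches Rust's saturating cast.
fn tables_match_oracle() -> bool {
    U8_CASES.iter().all(|&(x, want)| x as u8 == want)
        && U16_CASES.iter().all(|&(x, want)| x as u16 == want)
}
```

If `tables_match_oracle()` returns false, the mistake is in the expectation table rather than in the backend.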
6. Validate with Cranelift test commands
Run the relevant Cranelift tests and backend-specific suites after the patch. If your local setup includes filetests and interpreter comparisons, use both so you verify semantics independently of native lowering.
cargo test -p cranelift-codegen
cargo test -p cranelift-filetests
If you have a targeted filetest path, run that repeatedly while iterating on lowering changes.
7. Example patch strategy
A practical patch often follows this pattern:
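// Illustrative dispatch only: in Cranelift the destination width comes from the
// instruction's result type, so these variant names are hypothetical.
match opcode {
    FcvtToUintSatI8 => lower_fcvt_to_uint_sat_narrow(src, 8),
    FcvtToUintSatI16 => lower_fcvt_to_uint_sat_narrow(src, 16),
    _ => ...
}

// Semantic model of the narrow lowering, written over plain values.
fn lower_fcvt_to_uint_sat_narrow(src: f32, bits: u32) -> u64 {
    let max_i: u64 = if bits == 8 { 0xff } else { 0xffff };
    let max_f = max_i as f32; // 255.0 or 65535.0, both exact in f32
    // NaN and everything at or below zero clamp to 0.
    if src.is_nan() || src <= 0.0 {
        return 0;
    }
    // Everything at or above the maximum clamps to the maximum.
    if src >= max_f {
        return max_i;
    }
    // In range: truncate toward zero, then normalize to the narrow width.
    let wide = src as u64;
    wide & max_i
}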
The exact APIs differ across Cranelift internals, but the semantic structure should remain the same.
Common Edge Cases
Even after the main fix, several corner cases are worth checking.
NaN handling
NaN must return 0 for saturating unsigned conversion. If the backend uses only ordered comparisons, NaN can bypass normal branches and fall into an undefined or architecture-specific conversion path.
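The fall-through is easy to reproduce in a model. In the sketch below, `Path` and both functions are hypothetical; they only record which branch of a clamp sequence an input would take, with and without an explicit NaN test.

```rust
/// Which branch of the clamp sequence an input ends up in.
#[derive(Debug, PartialEq)]
enum Path {
    Zero,
    Max,
    Convert,
}

/// Range checks that forget the NaN test. Both comparisons are ordered,
/// so they are false for NaN and control falls through to the conversion.
fn path_without_nan_check(x: f32) -> Path {
    if x <= 0.0 {
        Path::Zero
    } else if x >= 65535.0 {
        Path::Max
    } else {
        Path::Convert
    }
}

/// The same checks with the unordered case handled first.
fn path_with_nan_check(x: f32) -> Path {
    if x.is_nan() {
        Path::Zero
    } else {
        path_without_nan_check(x)
    }
}
```

Here `path_without_nan_check(f32::NAN)` returns `Path::Convert`, which is exactly the case where the hardware conversion result is used unclamped.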
Negative zero
-0.0 should still produce 0. A <= 0.0 check handles this automatically because -0.0 compares equal to 0.0, but it is worth testing explicitly.
Infinity
+inf must saturate to the unsigned maximum, while -inf must become 0. If the implementation relies on finite-range assumptions, infinities can slip through unexpectedly.
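The negative-zero and infinity cases both reduce to ordinary float comparisons, which the following sketch checks. The two predicates are hypothetical names that mirror the zero and max branches of the clamp sequence.

```rust
/// True when the zero branch of the clamp sequence is taken.
fn takes_zero_branch(x: f32) -> bool {
    x.is_nan() || x <= 0.0
}

/// True when the max branch is taken. It is only reached when the
/// zero branch was not taken, so NaN never gets here.
fn takes_max_branch(x: f32, max: f32) -> bool {
    !takes_zero_branch(x) && x >= max
}
```

`takes_zero_branch(-0.0)` holds even though the sign bit of -0.0 is set, and `takes_max_branch(f32::INFINITY, 255.0)` holds without any special-casing of infinities.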
Boundary values near the max
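Inputs like 254.999, 255.0, 255.001, 65534.999, and 65535.001 are crucial. The backend must truncate toward zero only for in-range values and clamp once the threshold is crossed. Keep float precision in mind when choosing these inputs: near 65535 the spacing between adjacent f32 values is 2^-8 (about 0.0039), so the literals 65534.999 and 65535.001 both round to exactly 65535.0 in f32. To exercise the i16 boundary, use f64 inputs or a coarser offset such as 65534.9.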
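The boundary inputs can be evaluated directly to see which side of the threshold each one lands on once it is stored as a float. This is a sketch using Rust's saturating `as` cast as the oracle; the helper names are hypothetical.

```rust
/// Saturating u16 conversion of an f32 test input.
fn sat_u16_from_f32(x: f32) -> u16 {
    x as u16
}

/// Saturating u16 conversion of an f64 test input.
fn sat_u16_from_f64(x: f64) -> u16 {
    x as u16
}

/// Saturating u8 conversion of an f32 test input.
fn sat_u8_from_f32(x: f32) -> u8 {
    x as u8
}
```

As an f32, 65534.999 is stored as 65535.0, so `sat_u16_from_f32(65534.999)` is 65535, whereas `sat_u16_from_f64(65534.999)` is 65534. The u8 boundary is far enough from the f32 precision limit that `sat_u8_from_f32(254.999)` is still 254.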
Sign-extended use sites
Sometimes the conversion itself is correct, but a later instruction interprets the i8 or i16 result as signed because the register was not normalized. If tests still fail after the clamp fix, inspect downstream uses and ensure the value remains properly zero-extended or masked.
Shared lowering helpers
If i8 and i16 use a generic helper also shared with i32 or i64, verify that the helper does not assume wide-register semantics that are invalid for narrow saturated conversions.
FAQ
Why does this bug affect i8 and i16 more than wider integer sizes?
Because the dangerous step is usually the final narrowing. Wider targets like i32 or i64 often map more directly onto hardware conversion instructions, while narrow unsigned results require extra masking, clamping, or extension rules.
Can this be fixed by just masking the converted result?
No. Masking alone is not saturation. It only keeps the low bits, which can turn out-of-range values into wrapped results. You must first clamp the float to the legal destination range, then convert, then normalize the final width.
Should the fix be implemented in legalization or in backend lowering?
Either can work, but for this issue the most direct solution is usually in RISC-V backend lowering, where you can precisely control emitted comparisons, conversion instructions, and final masking behavior. If multiple backends share the same semantic gap, moving the expansion earlier into legalization may be cleaner.
In short, the correct fix is to stop treating fcvt_to_uint_sat.i8 and fcvt_to_uint_sat.i16 as simple narrow versions of a wider conversion. They require explicit saturating range checks, a safe in-range conversion, and a final zero-extended narrow result. Once those steps are enforced, the RISC-V output matches CLIF semantics and the regression tests pass.