How to Fix: Cranelift: `fcvt_to_uint.i16` fails to compile on s390x with `has_vxrs_ext2`
Cranelift on s390x: fixing fcvt_to_uint.i16 compilation when has_vxrs_ext2 is enabled
This failure has the shape of a classic instruction-selection legality bug: enabling VXRS_EXT2 changes the lowering path for fcvt_to_uint.i16 on s390x, and that path appears to depend implicitly on MIE2. When has_vxrs_ext2 is set without has_mie2, Cranelift chooses an implementation that the backend cannot legally encode, so compilation fails.
Understanding the Root Cause
The problematic operation is fcvt_to_uint.i16: converting a floating-point value into an unsigned 16-bit integer. On s390x, this is not always a single native instruction. In practice, the backend may legalize it through a sequence that relies on a specific combination of ISA capabilities.
The bug appears only when has_vxrs_ext2 is enabled because that feature changes backend behavior in one of two common ways:
- it unlocks a vector-aware lowering rule, or
- it changes legality decisions so the compiler prefers a newer conversion sequence.
However, the selected sequence appears to require has_mie2 as well. If the legalization or instruction emission logic checks only has_vxrs_ext2 and not has_mie2, Cranelift can reach a state where:
- the IR operation is considered lowerable,
- the chosen lowering path emits instructions or patterns gated by MIE2,
- the backend later discovers the target does not actually support that exact sequence,
- compilation aborts.
That explains why the same code compiles without has_vxrs_ext2: the backend falls back to an older or more conservative path that does not require the missing extension combination.
So the root cause is not the conversion itself. The root cause is a feature-gating mismatch between:
- legalization/lowering selection, and
- actual instruction requirements.
In short: fcvt_to_uint.i16 is being routed into a path that should be guarded by has_mie2, not just has_vxrs_ext2.
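The mismatch can be modelled in a few lines of Rust. This is a minimal sketch, not Cranelift code: `IsaFlags`, `Path`, `select_path_buggy`, and `path_is_encodable` are hypothetical names, and the assumption that the newer sequence needs MIE2 comes from the hypothesis above rather than from the backend source.

```rust
/// Hypothetical stand-in for the two s390x ISA flags discussed above.
#[derive(Clone, Copy, Debug, PartialEq)]
struct IsaFlags {
    has_mie2: bool,
    has_vxrs_ext2: bool,
}

/// The two lowering paths the text distinguishes.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Path {
    Newer,
    Fallback,
}

/// Selection as described in the bug: only VXRS_EXT2 is checked.
fn select_path_buggy(flags: IsaFlags) -> Path {
    if flags.has_vxrs_ext2 {
        Path::Newer
    } else {
        Path::Fallback
    }
}

/// What each path is assumed to need at emission time.
fn path_is_encodable(path: Path, flags: IsaFlags) -> bool {
    match path {
        Path::Newer => flags.has_vxrs_ext2 && flags.has_mie2,
        Path::Fallback => true,
    }
}

/// Every flag combination for which selection and emission disagree.
fn mismatched_combinations() -> Vec<IsaFlags> {
    let mut bad = Vec::new();
    for has_mie2 in [false, true] {
        for has_vxrs_ext2 in [false, true] {
            let flags = IsaFlags { has_mie2, has_vxrs_ext2 };
            if !path_is_encodable(select_path_buggy(flags), flags) {
                bad.push(flags);
            }
        }
    }
    bad
}
```

Under these assumptions, `mismatched_combinations()` reports exactly one combination: has_vxrs_ext2 set with has_mie2 clear, which is the configuration that triggers the failure.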
Step-by-Step Solution
The fix is to make the lowering rule for fcvt_to_uint.i16 require the correct ISA feature set. In most Cranelift backend bugs of this kind, the safest repair is to tighten the predicate where the conversion is legalized or emitted.
1. Locate the s390x lowering or legalization rule
Search the backend for the conversion and related feature predicates.
git grep "fcvt_to_uint"
git grep "has_vxrs_ext2"
git grep "has_mie2"
Typical places to inspect include:
- s390x ISLE lowering rules
- s390x instruction definitions
- shared legalization code for float-to-int conversions
- predicate helpers that map ISA flags into lowering decisions
2. Identify the path selected when has_vxrs_ext2 is enabled
Look for a rule that matches fcvt_to_uint.i16 or a more generic unsigned float-to-int conversion that eventually narrows to i16. You are looking for logic similar in spirit to this:
; pseudo-logic, not exact Cranelift syntax
if has_vxrs_ext2 {
    lower_fcvt_to_uint_i16_with_newer_sequence()
} else {
    lower_fcvt_to_uint_i16_with_fallback()
}
If the newer sequence uses instructions that require MIE2, the condition is too weak.
3. Tighten the feature gate
Update the rule so the optimized path is only available when both capabilities are present, or otherwise explicitly gate it on the actual required feature.
; preferred pseudo-fix
if has_vxrs_ext2 && has_mie2 {
    lower_fcvt_to_uint_i16_with_newer_sequence()
} else {
    lower_fcvt_to_uint_i16_with_fallback()
}
If the operation does not fundamentally need VXRS_EXT2 but only MIE2, simplify the predicate further:
; alternative pseudo-fix
if has_mie2 {
    lower_fcvt_to_uint_i16_with_supported_sequence()
} else {
    lower_fcvt_to_uint_i16_with_fallback()
}
The exact choice depends on whether the emitted instructions are:
- vector-specific and also need MIE2, or
- primarily MIE2-dependent regardless of VXRS_EXT2.
4. Keep the fallback path legal
Make sure the fallback path remains available for targets that have neither extension combination. A safe legalization sequence often looks like:
- convert to a wider integer type that is supported,
- clamp or trap according to Cranelift semantics,
- narrow to i16.
Conceptually:
; generic fallback strategy
v = fcvt_to_uint.i32 x
v = uextend_or_clamp(v)
r = ireduce.i16 v
The exact sequence must preserve Cranelift semantics for:
- out-of-range floats,
- NaN handling,
- rounding mode assumptions,
- trapping vs saturating behavior, depending on the IR op variant.
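The widen, check, narrow strategy can be sketched in plain Rust. This models the semantics only and is not backend code. It assumes the trapping variant of the operation with round-toward-zero behaviour; `Trap` and `fcvt_to_uint_i16_fallback` are hypothetical names introduced for illustration.

```rust
/// Hypothetical trap reasons, one per failure condition listed above.
#[derive(Debug, PartialEq)]
enum Trap {
    NaN,
    OutOfRange,
}

/// Trapping f64 -> u16 conversion built from a wider conversion.
fn fcvt_to_uint_i16_fallback(x: f64) -> Result<u16, Trap> {
    if x.is_nan() {
        return Err(Trap::NaN);
    }
    // Round toward zero first, so -0.5 becomes -0.0 and is accepted.
    let truncated = x.trunc();
    if truncated < 0.0 || truncated > f64::from(u16::MAX) {
        return Err(Trap::OutOfRange);
    }
    // Convert at a wider width, then narrow. The range check above
    // guarantees that neither step loses information.
    let wide = truncated as u32;
    Ok(wide as u16)
}
```

The range check runs on the truncated value rather than the raw input, which is why inputs such as -0.5 and 65535.9 are accepted while -1.0 and 65536.0 are rejected.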
5. Add a regression test
This bug should be locked down with a backend-specific test covering the failing feature combination. Add one test that reproduces the failure and another that verifies the fallback still works when has_vxrs_ext2 is absent.
; pseudo test idea
; target = s390x
; flags = has_vxrs_ext2
function %test(f64) -> i16 {
block0(v0: f64):
    v1 = fcvt_to_uint.i16 v0
    return v1
}
Then add a variant with both features enabled, and if your test harness supports it, a variant with only has_mie2 to verify legality boundaries.
6. Validate with targeted backend runs
Re-run the exact reproducer and then broaden coverage to ensure no neighboring conversions broke.
cargo test -p cranelift-codegen s390x
cargo test -p cranelift-codegen fcvt
cargo test -p cranelift-codegen isa::s390x
If there is a fuzzer reproducer tied to the issue, include it in the regression suite so future ISA-flag changes cannot reintroduce the bug.
7. Example patch shape
The final change often ends up looking like this at a high level:
// pseudo-Rust / pseudo-lowering example
match op {
    FcvtToUintI16 => {
        if isa.has_mie2() && isa.has_vxrs_ext2() {
            emit_fast_sequence();
        } else {
            emit_legal_fallback();
        }
    }
    _ => { /* existing logic */ }
}
Or, in a declarative lowering rule:
; pseudo-ISLE shape
(rule (lower (fcvt_to_uint_i16 x))
      (if (and (has_vxrs_ext2) (has_mie2)))
      (fast-lower x))
(rule (lower (fcvt_to_uint_i16 x))
      (fallback-lower x))
The important part is consistency: the same feature assumptions must hold from legalization through final emission.
Common Edge Cases
1. Other integer widths may be affected
If i16 is broken due to an incorrect shared conversion rule, i8, i32, or even vector lane conversions may be vulnerable too. Audit sibling operations such as:
- fcvt_to_uint.i8
- fcvt_to_uint.i32
- fcvt_to_sint.i16
- narrowing conversions implemented through shared helper code
2. Signed and unsigned conversions may diverge
Unsigned float-to-int conversion is usually trickier than signed conversion because of range handling near zero and the upper bound. A fix that works for fcvt_to_sint.i16 may still be wrong for fcvt_to_uint.i16.
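To make the difference concrete, here is a small sketch of the valid input ranges for the two 16-bit conversions, assuming round-toward-zero semantics. `fits_u16` and `fits_i16` are hypothetical helper names, not Cranelift functions.

```rust
/// True if `x`, rounded toward zero, fits in an unsigned 16-bit integer.
/// NaN fails both comparisons, so it is rejected.
fn fits_u16(x: f64) -> bool {
    x > -1.0 && x < 65536.0
}

/// True if `x`, rounded toward zero, fits in a signed 16-bit integer.
/// NaN fails both comparisons, so it is rejected.
fn fits_i16(x: f64) -> bool {
    x > -32769.0 && x < 32768.0
}
```

The lower bounds are where the two diverge most visibly: -1.0 is a valid signed input but an invalid unsigned one, while -0.5 is valid for both because it truncates to zero. A range check copied from the signed path would therefore accept inputs that the unsigned conversion must reject.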
3. Fuzzer-only failures can hide real backend bugs
If the issue first appeared while adding ISA extensions to the fuzzer, it is tempting to treat it as a test harness anomaly. It usually is not. Fuzzers expose invalid feature-cross-product assumptions that production code can eventually hit too.
4. Predicate duplication can reintroduce the bug
If one layer checks has_mie2 but another still checks only has_vxrs_ext2, the backend may remain inconsistent. Review:
- instruction selection predicates
- encoding predicates
- legalization helper predicates
- canonicalization rules that rewrite into the problematic form
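One way to avoid duplicated predicates is to route every layer through a single helper. The sketch below illustrates that design in standalone Rust; `IsaFlags`, `Sequence`, `fast_sequence_available`, `select_sequence`, and `emit` are hypothetical names, and the required feature set is the one assumed throughout this article.

```rust
/// Hypothetical stand-in for the two s390x ISA flags discussed above.
#[derive(Clone, Copy, Debug)]
struct IsaFlags {
    has_mie2: bool,
    has_vxrs_ext2: bool,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Sequence {
    Fast,
    Fallback,
}

/// Single source of truth for the fast sequence's requirements.
fn fast_sequence_available(flags: IsaFlags) -> bool {
    flags.has_vxrs_ext2 && flags.has_mie2
}

/// Selection consults the shared predicate.
fn select_sequence(flags: IsaFlags) -> Sequence {
    if fast_sequence_available(flags) {
        Sequence::Fast
    } else {
        Sequence::Fallback
    }
}

/// Emission re-checks the same predicate instead of a private copy,
/// so the two layers cannot drift apart.
fn emit(seq: Sequence, flags: IsaFlags) -> Result<Sequence, String> {
    if seq == Sequence::Fast && !fast_sequence_available(flags) {
        return Err(format!("fast sequence selected without required features: {:?}", flags));
    }
    Ok(seq)
}
```

With this shape, `emit(select_sequence(flags), flags)` succeeds for every flag combination, and a hand-constructed `Sequence::Fast` on a target without has_mie2 is rejected with an explicit error.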
5. Trap semantics must remain correct
Some float-to-int operations trap or have defined behavior on invalid inputs depending on the exact Cranelift opcode and lowering strategy. Do not replace a trapping conversion with a saturating sequence unless the IR semantics permit it.
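The difference is easy to observe in plain Rust, whose `as` cast from float to integer saturates and maps NaN to zero. The sketch below contrasts that with a checked conversion that returns `None` wherever a trapping operation would trap. Both function names are hypothetical, and the code models semantics only.

```rust
/// Saturating conversion: Rust's `as` cast clamps out-of-range values
/// to the nearest bound and maps NaN to 0.
fn to_u16_saturating(x: f64) -> u16 {
    x as u16
}

/// Trapping-style conversion: `None` marks inputs on which a trapping
/// conversion would trap (NaN or out of range after truncation).
fn to_u16_trapping(x: f64) -> Option<u16> {
    if x > -1.0 && x < 65536.0 {
        Some(x as u16)
    } else {
        None
    }
}
```

The two agree on in-range inputs such as 123.7, but on -5.0, 1e9, and NaN the saturating version silently produces 0, 65535, and 0, where the trapping version reports a failure. Substituting one for the other changes observable behaviour on exactly those inputs.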
FAQ
Why does the bug appear only when has_vxrs_ext2 is enabled?
Because that feature changes the chosen lowering or legalization path. The backend starts using a sequence that assumes more ISA support than is actually guaranteed. Without has_vxrs_ext2, Cranelift stays on a conservative fallback path that remains legal.
Is adding has_mie2 to the guard enough?
Usually yes, if the emitted instruction sequence truly depends on MIE2. But verify whether the conversion path also requires other predicates, such as vector support, exact FP conversion capabilities, or a helper instruction family used during narrowing.
Should this be fixed in the fuzzer or in the s390x backend?
The backend. The fuzzer is doing its job by exposing an invalid feature combination assumption. Even if the fuzzer later constrains combinations, the lowering rule should still accurately describe the backend’s real instruction requirements.
The practical fix is straightforward: audit the s390x lowering for fcvt_to_uint.i16, identify the path unlocked by has_vxrs_ext2, and gate it with has_mie2 if that is the actual dependency. Once you add a regression test for that feature combination, this class of bug becomes much harder to reintroduce.