How to Fix: Cranelift: `sqmul_round_sat.{i16,i32}` does not exist for scalars

Updated June 10, 2026 6 min read

Aldawsari

Cranelift `sqmul_round_sat.{i16,i32}` Does Not Exist for Scalars: Why the Docs Mislead and How to Fix Your IR

Cranelift does not provide scalar forms of sqmul_round_sat.i16 and sqmul_round_sat.i32, even though the generated API docs can suggest that those forms are available. If you build scalar IR with this instruction, expect verifier errors, legalization failures, or backend-specific lowering problems, because the instruction is defined for SIMD vector lanes, not standalone scalar integer values.

Table of Contents

Understanding the Root Cause
Step-by-Step Solution
Common Edge Cases
FAQ

This issue is best understood as a documentation mismatch: the Rust method generated on InstBuilder appears generic enough to suggest scalar use, but the underlying instruction constraints in Cranelift only permit specific lane-based vector types.

Understanding the Root Cause

Cranelift instructions are defined in the ISA-independent IR with strict type constraints. sqmul_round_sat is a signed, saturating, rounding fixed-point multiply, the kind of operation that AArch64 exposes as SQRDMULH (signed saturating rounding doubling multiply returning high half). The confusion arises because the generated Rust docs expose a builder method without clearly communicating the full set of legal type forms.

In practice, the relevant forms are vector lane operations such as:

i16xN
i32xN

What does not exist are scalar forms like:

i16
i32

That means this kind of IR is invalid in concept:

// Compiles as Rust, but the resulting IR is not valid: both operands are scalar i16.
let x = builder.ins().iconst(types::I16, 1234);
let y = builder.ins().iconst(types::I16, 5678);
let z = builder.ins().sqmul_round_sat(x, y);

Even if the Rust API lets you write the call, Cranelift still validates the instruction against its declared type system. Since x and y are scalars, there is no matching instruction encoding or legal IR form for scalar sqmul_round_sat.i16.

Technically, this happens because the operation is modeled as a lane-wise vector instruction. Each lane performs the saturating rounded multiply independently. Cranelift does not automatically reinterpret a scalar as a one-lane vector, and there is no scalar fallback form defined for this opcode.
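To make "lane-wise" concrete, here is a minimal plain-Rust sketch of the per-lane arithmetic. It assumes the Q-format definition that Cranelift's instruction documentation gives for this opcode, result = (x * y + (1 << (Q - 1))) >> Q, with Q = 15 for 16-bit lanes. The function names and the use of a fixed [i16; 8] array to stand in for an i16x8 value are illustrative choices, not part of Cranelift.

```rust
/// Reference model of a single 16-bit lane under the assumed Q15 definition:
/// (x * y + (1 << 14)) >> 15, saturated to the i16 range.
fn q15_lane(x: i16, y: i16) -> i16 {
    // Widen to i32 so the product and the rounding add cannot overflow.
    let rounded = (x as i32) * (y as i32) + (1 << 14);
    // `>>` on a signed integer is an arithmetic shift in Rust.
    let shifted = rounded >> 15;
    // Saturate before narrowing back to 16 bits.
    shifted.clamp(i16::MIN as i32, i16::MAX as i32) as i16
}

/// Applies the lane model independently to each lane of a stand-in i16x8 value.
fn q15_x8(a: [i16; 8], b: [i16; 8]) -> [i16; 8] {
    let mut out = [0i16; 8];
    for i in 0..8 {
        out[i] = q15_lane(a[i], b[i]);
    }
    out
}
```

Each output lane depends only on the matching pair of input lanes, which is why a scalar value has no natural place in this instruction: there is no lane structure for it to act on.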

So the root cause is:

The generated docs are broader than the actual legal type set.
The opcode is defined for vector lane types, not scalar integer types.
Cranelift IR verification and legalization enforce the real type constraints.

Step-by-Step Solution

You have two correct paths depending on what you are trying to do: use a vector type if you truly want this instruction, or rewrite the operation manually if your data is scalar.

1. Verify Whether You Are Using Scalars or Vectors

Check the types of the values you pass into sqmul_round_sat. If they are types::I16 or types::I32, that is the bug.

let a: Value = ...;
let b: Value = ...;
let ty = builder.func.dfg.value_type(a);
assert_eq!(ty, builder.func.dfg.value_type(b));

If ty is scalar, do not use sqmul_round_sat.

2. Use a Supported Vector Type

If your algorithm is naturally SIMD-friendly, switch to a vector lane type supported by Cranelift.

use cranelift_codegen::ir::types;

let a = ...; // type: i16x8, i16x16, i32x4, etc.
let b = ...; // same vector type
let result = builder.ins().sqmul_round_sat(a, b);

The exact vector shape depends on your target and how your frontend constructs the IR, but the key point is that both operands must be the same supported SIMD vector type.

3. Replace Scalar Usage with Equivalent Scalar IR

If you need scalar behavior, implement the arithmetic manually using a wider intermediate type, explicit rounding, shifting, and saturation. The exact formula depends on the semantics you need, but the general pattern is:

// Cranelift IR strategy for scalar i32 inputs (Q31 fixed point):
// 1. Sign-extend both operands to i64
// 2. Multiply (cannot overflow i64: |a * b| <= 2^62)
// 3. Add the rounding constant 1 << 30
// 4. Arithmetic shift right by 31
// 5. Clamp to the i32 range
// 6. Reduce back to i32
//
// This equals doubling the product, adding 1 << 31, and shifting by 32,
// but avoids the i64 overflow that doubling causes for i32::MIN * i32::MIN.

let a64 = builder.ins().sextend(types::I64, a32);
let b64 = builder.ins().sextend(types::I64, b32);
let prod = builder.ins().imul(a64, b64);
let round = builder.ins().iconst(types::I64, 1_i64 << 30);
let rounded = builder.ins().iadd(prod, round);
// The value is signed, so the shift must be arithmetic (sshr), not logical (ushr).
let shifted = builder.ins().sshr_imm(rounded, 31);

// Then clamp to i32 bounds before reducing.
let i32_min = builder.ins().iconst(types::I64, i32::MIN as i64);
let i32_max = builder.ins().iconst(types::I64, i32::MAX as i64);
let clamped_hi = builder.ins().smin(shifted, i32_max);
let clamped = builder.ins().smax(clamped_hi, i32_min);
let result = builder.ins().ireduce(types::I32, clamped);

This example illustrates the shape of the solution, not a drop-in mathematical replacement for every use case. You must match the exact rounding and saturation semantics expected by your frontend or source ISA.
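One way to check a manual lowering is to compare it against a plain-Rust reference that follows the same widen, multiply, round, shift, clamp, and narrow sequence. The sketch below assumes the Q31 definition (x * y + (1 << 30)) >> 31 for 32-bit values; the function name is hypothetical, and the comments map each step to the Cranelift opcode it would correspond to.

```rust
/// Scalar reference for a 32-bit rounding, saturating fixed-point multiply,
/// assuming the Q31 definition (x * y + (1 << 30)) >> 31.
fn q31_scalar_reference(a: i32, b: i32) -> i32 {
    let a64 = a as i64; // sextend
    let b64 = b as i64; // sextend
    let prod = a64 * b64; // imul: magnitude is at most 2^62, so no overflow
    let rounded = prod + (1i64 << 30); // iadd with the rounding constant
    let shifted = rounded >> 31; // sshr_imm: arithmetic shift on a signed value
    let clamped = shifted.min(i32::MAX as i64).max(i32::MIN as i64); // smin, smax
    clamped as i32 // ireduce: safe because the value is already in range
}
```

Running the emitted code and this function over the corner cases (both operands at i32::MIN, mixed signs, zero) is a cheap way to catch a wrong constant or shift kind before it reaches production.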

4. Update Documentation or Internal Comments

If you maintain a compiler frontend, DSL lowering layer, or bindings around Cranelift, document this explicitly so future contributors do not assume scalar support exists.

// Important: sqmul_round_sat is only valid for SIMD vector lane types.
// Do not call it with scalar I16 or I32 values.

5. Validate Early

Add an assertion before emitting the instruction so the failure is immediate and descriptive.

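let ty = builder.func.dfg.value_type(lhs);
// Checking is_vector() alone would also accept float or 8-bit lane vectors.
let lane_ok = ty.lane_type() == types::I16 || ty.lane_type() == types::I32;
assert!(ty.is_vector() && lane_ok, "sqmul_round_sat requires an i16xN or i32xN vector, got {ty}");
let out = builder.ins().sqmul_round_sat(lhs, rhs);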

This is especially useful in frontend code generators where type errors otherwise surface much later during verification.

Common Edge Cases

Accidentally Using a Scalar After Extracting a Lane

If you extract a lane from a vector, the result becomes a scalar. You cannot feed that scalar back into sqmul_round_sat.

let lane = builder.ins().extractlane(vec, 0); // scalar now
// Invalid: builder.ins().sqmul_round_sat(lane, lane)

Use scalar arithmetic after extraction, or keep the computation vectorized.

Assuming a One-Lane Vector Is Equivalent to a Scalar

Cranelift does not generally treat a scalar integer as an implicit one-lane SIMD vector. If your frontend relies on that assumption, type legalization will fail.

Backend Differences

Some target architectures may have instructions that look similar for scalar math, but Cranelift IR legality is determined by the IR definition, not by what one backend might theoretically support. Even if a backend could lower a related scalar sequence, the IR opcode still remains vector-only.

Incorrect Manual Saturation

When replacing the instruction with scalar IR, the biggest mistakes are:

Using the wrong rounding constant
Using logical shift where arithmetic behavior is required, or vice versa
Failing to clamp before narrowing
Overflowing in the intermediate multiply because the type was not widened enough
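Two of these mistakes are easy to demonstrate in plain Rust. The sketch below again assumes the Q31 definition (x * y + (1 << 30)) >> 31; all function names are hypothetical and exist only to contrast a correct step with its faulty counterpart.

```rust
/// Rounded Q31 product before the shift, widened to i64 so the multiply cannot overflow.
fn rounded_product(a: i32, b: i32) -> i64 {
    (a as i64) * (b as i64) + (1i64 << 30)
}

/// Correct: an arithmetic shift keeps the sign of the intermediate value.
fn shift_arithmetic(a: i32, b: i32) -> i64 {
    rounded_product(a, b) >> 31
}

/// Mistake: a logical shift fills with zeros, so negative results become large positives.
fn shift_logical(a: i32, b: i32) -> i64 {
    ((rounded_product(a, b) as u64) >> 31) as i64
}

/// Mistake: narrowing without clamping wraps around instead of saturating.
fn narrow_without_clamp(a: i32, b: i32) -> i32 {
    shift_arithmetic(a, b) as i32
}

/// Correct: clamp in the wide type, then narrow.
fn narrow_with_clamp(a: i32, b: i32) -> i32 {
    shift_arithmetic(a, b).clamp(i32::MIN as i64, i32::MAX as i64) as i32
}
```

With a = -(1 << 30) and b = 1 << 30, the arithmetic shift yields -(1 << 29), while the logical shift yields a value above i32::MAX that a later clamp silently turns into i32::MAX. With both operands at i32::MIN, skipping the clamp wraps the result to i32::MIN where saturation should give i32::MAX.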

Confusing Signed and Unsigned Semantics

sqmul_round_sat is a signed operation. If your source operation is unsigned or mixed-sign, a manual replacement must account for that explicitly.

FAQ

Why do the docs make it look like scalar `i16` and `i32` are allowed?

The Rust API docs are generated from instruction builders and do not always expose the full type restriction clearly at the call site. The true source of legality is Cranelift’s instruction definition and verifier, which restrict this opcode to vector lane types.

Can I safely emulate scalar `sqmul_round_sat` myself?

Yes, but only if you reproduce the exact rounding, doubling, high-half, and saturation semantics required by your source operation. Use a wider intermediate integer type and clamp before narrowing.

Should this be fixed in Cranelift code or just in documentation?

For this specific issue, it is primarily a documentation issue. The instruction behavior is consistent; the mismatch is that the generated docs are not explicit enough about scalar forms being unavailable.

The practical fix is simple: do not emit sqmul_round_sat for scalar i16 or i32 values. Use SIMD vectors when appropriate, and for scalar code, lower the operation manually with widening, rounding, shifting, and saturation. That aligns your IR with what Cranelift actually supports and avoids hard-to-diagnose verifier failures later in the pipeline.
