How to Fix: fuzz: different results for `shr_s`
When i32.shr_s returns different results across engines, the bug is usually not in the shift itself but in how the shift count is normalized before execution.
Problem Summary
The issue titled fuzz: different results for shr_s points to a classic WebAssembly implementation inconsistency: two runtimes evaluate the same signed right shift instruction and produce different outputs. In the provided test case, the core operation is i32.shr_s, which performs an arithmetic right shift on a 32-bit integer.
For WebAssembly, this instruction has precise semantics. The left operand is interpreted as a signed 32-bit value, and the right operand is the shift amount. Critically, the shift amount is not used directly. It must be reduced modulo the bit width. For i32, that means the engine must use only the lower 5 bits of the shift count, equivalent to shift & 31.
If one implementation masks the shift count correctly and another does not, fuzzing will expose mismatched behavior immediately.
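To make the masking rule concrete, here is a minimal sketch of the count normalization on its own. The helper name effective_i32_shift is hypothetical and not taken from any engine.

```c
#include <stdint.h>

/* Hypothetical helper: the count an engine should actually shift by
   when executing an i32 shift instruction. */
uint32_t effective_i32_shift(uint32_t raw_count) {
    /* Keep only the low 5 bits; for unsigned values this equals raw_count % 32. */
    return raw_count & 31u;
}
```

With this rule, a raw count of 32 becomes 0, 33 becomes 1, and 0xFFFFFFFF becomes 31.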
Understanding the Root Cause
The root cause is almost always one of these:
- the runtime fails to apply the correct WebAssembly shift-count masking rule,
- the runtime delegates the shift to a host-language operator whose behavior for out-of-range counts or negative operands does not match WebAssembly,
- or sign extension is handled incorrectly during lowering, optimization, or interpretation.
According to WebAssembly semantics:
- i32.shr_s is a signed arithmetic right shift,
- the value being shifted must preserve its sign bit,
- the shift amount must be normalized as rhs & 31.
That means these two are equivalent:
i32.shr_s(x, y)
((int32_t)x) >> (y & 31)
If an engine instead does something like this:
((int32_t)x) >> y
then behavior may diverge whenever y >= 32, which fuzzed input produces routinely. In C and C++, shifting by an amount greater than or equal to the width of the promoted left operand is undefined behavior, so the compiler is free to produce any result. Right-shifting a negative signed value is a separate concern: it is implementation-defined in C (and in C++ before C++20), although mainstream compilers implement it as an arithmetic shift. In JITs and interpreters, the mismatch can also appear if one path masks and another path does not.
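If the host compiler's handling of negative operands is a concern, one option is to shift only non-negative values. The following is a sketch under that assumption, not any engine's actual code; wasm_i32_shr_s_portable is a hypothetical name. It relies on the identity that, for negative x, an arithmetic shift equals the complement of shifting the complement.

```c
#include <stdint.h>

/* Hypothetical helper: i32.shr_s without relying on how the host
   compiler right-shifts negative values. */
int32_t wasm_i32_shr_s_portable(int32_t lhs, uint32_t rhs) {
    uint32_t shift = rhs & 31u;      /* Wasm rule: count modulo 32 */
    if (lhs < 0) {
        /* ~lhs is non-negative, so its right shift is fully defined.
           Complementing again restores the sign-filled high bits. */
        return ~(~lhs >> shift);
    }
    return lhs >> shift;             /* non-negative: plain shift is defined */
}
```

For example, with lhs = -8 and rhs = 35 the count becomes 3, ~(-8) is 7, 7 >> 3 is 0, and ~0 is -1, which matches the arithmetic shift of -8 by 3.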
Another subtle source of inconsistency is using a logical shift instead of an arithmetic shift. For negative numbers, these are very different:
- Arithmetic shift replicates the sign bit.
- Logical shift inserts zeros.
So if the input is negative and the engine uses unsigned lowering by mistake, the output will differ even when the shift count is valid.
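The difference is easy to see side by side. This sketch uses two hypothetical functions, shr_arithmetic and shr_logical, and assumes the host compiler sign-fills >> on signed values, as GCC, Clang, and MSVC do.

```c
#include <stdint.h>

/* Arithmetic shift: copies of the sign bit enter from the left.
   Assumes the host compiler sign-fills `>>` on signed operands. */
int32_t shr_arithmetic(int32_t x, uint32_t count) {
    return x >> (count & 31u);
}

/* Logical shift: same masking, but zeros enter from the left. */
uint32_t shr_logical(uint32_t x, uint32_t count) {
    return x >> (count & 31u);
}
```

For the bit pattern of -8 (0xFFFFFFF8) and a count of 1, the arithmetic version yields -4 while the logical version yields 0x7FFFFFFC. For non-negative inputs the two agree, which is why this class of bug hides until a negative operand appears.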
Step-by-Step Solution
The fix is to enforce WebAssembly semantics explicitly in every execution path: interpreter, baseline compiler, optimizing compiler, and any constant-folding pass.
1. Normalize the shift count
For i32.shr_s, always mask with 31.
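/* The shift count is the second operand, so it is on top of the stack. */
uint32_t raw_shift = pop_i32_operand();
int32_t value = pop_i32_operand();
uint32_t shift = raw_shift & 31; /* i32 counts are taken modulo 32 */
int32_t result = value >> shift;
push_i32_result(result);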
For i64.shr_s, mask with 63 instead.
uint64_t shift = raw_shift & 63;
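A complete 64-bit counterpart might look like the sketch below. The name wasm_i64_shr_s is hypothetical, and the code assumes the host compiler sign-fills >> on signed values. Note that for i64 instructions the shift count is itself a 64-bit operand.

```c
#include <stdint.h>

/* Hypothetical helper for i64.shr_s: count is taken modulo 64.
   Assumes the host compiler sign-fills `>>` on signed operands. */
int64_t wasm_i64_shr_s(int64_t lhs, uint64_t rhs) {
    return lhs >> (rhs & 63u);
}
```

A count of 32 is a useful probe here: it is a real shift for i64, so an implementation that accidentally masks with 31 returns the input unchanged instead of shifting it.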
2. Preserve signedness correctly
Make sure the value operand is treated as a signed 32-bit integer before shifting.
int32_t value = (int32_t)input;
uint32_t shift = count & 31;
int32_t result = value >> shift;
Avoid patterns that silently convert to unsigned:
uint32_t value = input;
uint32_t result = value >> (count & 31);
The example above implements logical right shift, not signed right shift.
3. Align interpreter and JIT behavior
If your project has multiple execution tiers, verify that they all use the same rule. A common failure mode is:
- the interpreter masks the shift count,
- the optimizing compiler assumes the backend instruction already does it,
- the backend on one architecture behaves differently.
Use a shared helper where possible:
static inline int32_t wasm_i32_shr_s(int32_t lhs, uint32_t rhs) {
    return lhs >> (rhs & 31);
}
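To illustrate why a compiler tier cannot assume the backend masks for it, the sketch below models two simplified hardware behaviors in plain C. The function names are hypothetical and the models are deliberately reduced: one masks a 32-bit shift count to 5 bits (as x86 SAR does), the other reads the low 8 bits of the count and saturates at 32 or more (as a 32-bit ARM register-specified ASR does).

```c
#include <stdint.h>

/* Simplified model: hardware masks the count to 5 bits. */
int32_t model_masking_backend(int32_t x, uint32_t count) {
    return x >> (count & 31u);
}

/* Simplified model: hardware reads the low 8 bits of the count and
   fills the whole result with the sign bit once the count reaches 32. */
int32_t model_saturating_backend(int32_t x, uint32_t count) {
    uint32_t c = count & 0xFFu;
    if (c >= 32u) {
        return x < 0 ? -1 : 0;
    }
    return x >> c;
}

/* What the code generator should emit for the second backend:
   an explicit mask before the shift instruction. */
int32_t lowered_for_saturating_backend(int32_t x, uint32_t count) {
    return model_saturating_backend(x, count & 31u);
}
```

With inputs (1, 32), the masking model returns 1 and the saturating model returns 0. Only the explicitly masked lowering brings the second backend back in line with WebAssembly.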
4. Fix constant folding and IR transforms
If the engine folds constants during validation or optimization, the same masking rule must apply there too.
int32_t fold_i32_shr_s(int32_t lhs, uint32_t rhs) {
    return lhs >> (rhs & 31);
}
Without this, compiled code and interpreted code may disagree even after the runtime fix.
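As a rough illustration of where the mask belongs in a folding pass, here is a sketch over a made-up IR node. The BinOp struct, the Opcode enum, and try_fold are all hypothetical and stand in for whatever the engine's IR provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical IR: a binary operation whose operands may be constants. */
typedef enum { OP_I32_SHR_S, OP_I32_SHR_U } Opcode;

typedef struct {
    Opcode op;
    bool lhs_const;
    bool rhs_const;
    int32_t lhs;
    int32_t rhs;
} BinOp;

/* Folds the node when both operands are constant. Returns true and
   writes the folded value to *out on success. */
bool try_fold(const BinOp *node, int32_t *out) {
    if (!node->lhs_const || !node->rhs_const) {
        return false;
    }
    /* Mask once, before dispatch, so no opcode can skip it. */
    uint32_t shift = (uint32_t)node->rhs & 31u;
    switch (node->op) {
    case OP_I32_SHR_S:
        *out = node->lhs >> shift;   /* assumes sign-filling host `>>` */
        return true;
    case OP_I32_SHR_U:
        *out = (int32_t)((uint32_t)node->lhs >> shift);
        return true;
    }
    return false;
}
```

Applying the mask ahead of the switch is a design choice: it keeps every shift opcode on the same normalized count, so adding a new case later cannot reintroduce the bug.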
5. Add a regression test from the fuzz case
Turn the minimized fuzzing repro into a deterministic test. For example:
(module
  (func (export "shr") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.shr_s))
Then validate edge inputs:
shr(-1, 0) == -1
shr(-1, 1) == -1
shr(-2, 1) == -1
shr(1, 32) == 1
shr(1, 33) == 0
shr(-8, 35) == -1
Why these matter:
- 32 should behave like 0 for i32 shift counts.
- 33 should behave like 1.
- negative values confirm sign extension is preserved.
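The edge inputs above can be kept as a small table-driven check. This is a sketch, assuming the implementation under test can be called through a plain function pointer; all names here (ShrCase, count_shr_s_failures, masked_shr_s, wrong_logical_shr) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*shr_s_fn)(int32_t, uint32_t);

typedef struct {
    int32_t value;
    uint32_t count;
    int32_t expected;
} ShrCase;

/* The edge inputs from the regression list. */
static const ShrCase kCases[] = {
    { -1,  0, -1 },
    { -1,  1, -1 },
    { -2,  1, -1 },
    {  1, 32,  1 },
    {  1, 33,  0 },
    { -8, 35, -1 },
};

/* Returns how many table entries the given implementation gets wrong. */
int count_shr_s_failures(shr_s_fn fn) {
    int failures = 0;
    for (size_t i = 0; i < sizeof kCases / sizeof kCases[0]; i++) {
        if (fn(kCases[i].value, kCases[i].count) != kCases[i].expected) {
            failures++;
        }
    }
    return failures;
}

/* Candidate that follows the Wasm rule (assumes sign-filling `>>`). */
int32_t masked_shr_s(int32_t x, uint32_t count) {
    return x >> (count & 31u);
}

/* Deliberately wrong candidate: masks the count but shifts logically. */
int32_t wrong_logical_shr(int32_t x, uint32_t count) {
    return (int32_t)((uint32_t)x >> (count & 31u));
}
```

Running the table against wrong_logical_shr flags the three negative-input cases with a non-zero count, which confirms that the table catches signedness bugs and not only masking bugs.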
6. Compare against the spec behavior in a reference implementation
If the bug appears only in one backend, compare outputs against a trusted engine or a small reference evaluator. This helps isolate whether the issue is in parsing, execution, optimization, or code generation.
assert(exec_shr_s(1, 32) == 1);
assert(exec_shr_s(1, 33) == 0);
assert(exec_shr_s(-1, 5) == -1);
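When no trusted engine is at hand, a small reference evaluator can serve as the oracle. The sketch below defines the arithmetic shift as floor(x / 2^k), computed with 64-bit division, so it shares no shift logic with the code under test. The names ref_i32_shr_s, candidate_i32_shr_s, and count_mismatches are hypothetical, and the candidate stands in for the engine's real entry point.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference: arithmetic shift as floor division by 2^k, where k is
   the masked count. Uses no right shift at all. */
int32_t ref_i32_shr_s(int32_t x, uint32_t count) {
    int64_t divisor = (int64_t)1 << (count & 31u);
    int64_t q = x / divisor;          /* C division truncates toward zero */
    if (x < 0 && x % divisor != 0) {
        q -= 1;                       /* adjust truncation to floor */
    }
    return (int32_t)q;
}

/* Stand-in for the engine under test (assumes sign-filling `>>`). */
int32_t candidate_i32_shr_s(int32_t x, uint32_t count) {
    return x >> (count & 31u);
}

/* Sweeps boundary values and counts; returns the number of disagreements. */
int count_mismatches(int32_t (*engine)(int32_t, uint32_t)) {
    static const int32_t values[] = { 0, 1, -1, 2, -2, -7, -8, INT32_MAX, INT32_MIN };
    static const uint32_t counts[] = { 0, 1, 31, 32, 33, 63, 64, UINT32_MAX };
    int mismatches = 0;
    for (size_t i = 0; i < sizeof values / sizeof values[0]; i++) {
        for (size_t j = 0; j < sizeof counts / sizeof counts[0]; j++) {
            if (engine(values[i], counts[j]) != ref_i32_shr_s(values[i], counts[j])) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

A result of zero from count_mismatches means the engine agrees with the reference on every pair in the sweep. It does not prove correctness for all inputs, so the value and count lists are worth extending with any inputs the fuzzer reports.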
Common Edge Cases
- Shift counts greater than 31: These must wrap via masking, not trap and not shift literally.
- Negative source values: If sign extension is broken, results differ immediately.
- Unsigned lowering in host code: Using uint32_t for the left operand changes semantics from arithmetic to logical shift.
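- Different behavior across architectures: Some CPUs mask the count implicitly (x86 uses only the low 5 bits for 32-bit shifts), while others do not (32-bit ARM register-specified shifts read the low 8 bits, so a count of 32 does not behave like 0). Generated code or IR must mask explicitly wherever the hardware rule differs from WebAssembly.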
- Constant folding bugs: The interpreter may be correct while optimized code is wrong because folded expressions skip masking.
- i64 copy-paste mistakes: Reusing the i32 implementation for i64 without changing 31 to 63 causes another class of miscompilations.
FAQ
Why does fuzzing find this bug so often?
Because fuzzers generate extreme shift counts like 32, 33, or large random integers that normal application code may rarely use. Those values quickly reveal whether the engine correctly applies rhs & 31.
Is this a parsing bug or an execution bug?
Usually it is an execution or lowering bug, not a parsing issue. The WAT or Wasm module is typically valid. The mismatch happens when the runtime evaluates i32.shr_s incorrectly.
How do I confirm the fix is complete?
Test all execution paths: interpreter, baseline JIT, optimizing JIT, and constant-folding passes. Then add regression tests for values around 0, 1, 31, 32, and 33, using both positive and negative inputs.
The reliable fix for this GitHub issue is simple but non-negotiable: implement i32.shr_s exactly as a signed arithmetic right shift with the count masked by 31 everywhere in the engine. Once that rule is enforced consistently, the fuzzing discrepancy disappears.