How to Fix: Execution results of given wasm file different from other runtime tools
A mutated WebAssembly binary producing different results across runtimes is usually not a random execution glitch. It is a signal that the module has crossed into an area where validation rules, undefined behavior, implementation bugs, or engine-specific recovery paths diverge. When one runtime accepts and executes a malformed or edge-case .wasm file while others trap, reject, or compute different values, the real problem is almost always in how the binary is parsed, validated, or how out-of-spec instructions are handled after mutation.
Problem Overview
The issue describes a wasm test case created by mutating a previously valid binary. That detail matters. Once a valid module is mutated at the binary level, several things can happen:
- The binary becomes formally invalid but is still partially accepted by a permissive runtime.
- The mutation preserves binary structure but changes which instructions or immediates are encoded, so the module still validates yet traps or computes different values.
- The mutated module triggers a runtime bug in one engine but not in others.
- The binary relies on behavior that is implementation-defined, unsupported, or proposal-specific.
If your runtime produces results different from tools such as Wasmtime, Wasmer, WAMR, Node.js, or browser engines, do not assume the other tools are wrong. The first task is to determine whether the wasm file is still valid according to the spec and whether all compared runtimes support the same feature set.
Understanding the Root Cause
The technical root cause usually falls into one of four buckets.
1. The mutated binary violates WebAssembly validation rules
WebAssembly runtimes are expected to reject invalid binaries during decode or validation. However, different runtimes may:
- Reject immediately with a validation error
- Accept due to a parser bug
- Accept because a feature flag changes interpretation
- Mis-handle malformed sections and continue execution
This is especially common when mutation changes:
- Section lengths
- Type signatures
- Block stack discipline
- Memory alignment immediates
- Branch targets
- Function body size metadata
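Some of these fields can be sanity-checked without a full decoder. The following is a minimal sketch, not a validator: the helper names `read_u32_leb` and `walk_sections` are hypothetical, and the code only walks the top-level section headers to catch a declared section length that runs past the end of the file.

```python
WASM_MAGIC = b"\x00asm"
WASM_VERSION = b"\x01\x00\x00\x00"


def read_u32_leb(data, pos):
    """Decode an unsigned LEB128 u32 at pos; return (value, next_pos)."""
    result = 0
    for i in range(5):  # a u32 occupies at most 5 bytes
        if pos >= len(data):
            raise ValueError("unexpected end of data inside LEB128")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            if result > 0xFFFFFFFF:
                raise ValueError("LEB128 value does not fit in u32")
            return result, pos
    raise ValueError("LEB128 u32 longer than 5 bytes")


def walk_sections(data):
    """Return a list of (section_id, payload_offset, payload_size)."""
    if data[:4] != WASM_MAGIC or data[4:8] != WASM_VERSION:
        raise ValueError("bad magic or version")
    sections = []
    pos = 8
    while pos < len(data):
        section_id = data[pos]
        size, payload = read_u32_leb(data, pos + 1)
        if payload + size > len(data):
            raise ValueError(f"section {section_id} overruns the file")
        sections.append((section_id, payload, size))
        pos = payload + size
    return sections
```

If this walk fails on a file that your runtime executes, the runtime is recovering from a malformed header instead of rejecting it. The sketch does not check section ordering or section contents.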
2. The wasm file triggers undefined or trap-prone execution paths
Even if the binary validates, mutation may produce code that:
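- Reads locals or linear memory that the original program would have written first (WebAssembly zero-initializes both, so the read is well defined but the value is no longer meaningful)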
- Causes integer division traps
- Performs out-of-bounds memory access
- Uses NaN payloads whose exact bit patterns can differ in some environments
- Depends on host imports with mismatched signatures
In these cases, one runtime may trap while another appears to continue if it contains a bug in bounds checking or stack handling.
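Trap conditions are fixed by the specification, so conforming runtimes should agree on them. As an illustration, here is a small reference model of two of the cases listed above: signed 32-bit division and the bounds check on a memory load. The names `Trap`, `i32_div_s`, and `check_load` are hypothetical and exist only for this sketch.

```python
INT32_MIN = -(2**31)


class Trap(Exception):
    """Raised where a conforming runtime is required to trap."""


def i32_div_s(lhs, rhs):
    """Signed 32-bit division: truncates toward zero and traps on the
    two inputs for which the result is undefined."""
    if rhs == 0:
        raise Trap("integer divide by zero")
    if lhs == INT32_MIN and rhs == -1:
        raise Trap("integer overflow")
    quotient = abs(lhs) // abs(rhs)
    return quotient if (lhs < 0) == (rhs < 0) else -quotient


def check_load(base, offset, width, memory_size):
    """Return the effective address of a load, or trap if any accessed
    byte lies outside linear memory. base and offset are unsigned."""
    effective = base + offset  # computed without 32-bit wraparound
    if effective + width > memory_size:
        raise Trap("out of bounds memory access")
    return effective
```

A runtime that wraps `base + offset` to 32 bits before the comparison, or that returns a value for `INT32_MIN / -1`, will continue where others trap.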
3. Feature mismatches across runtimes
Many runtime discrepancies are caused by comparing engines with different support for:
- Reference types
- SIMD
- Multi-value
- Bulk memory
- Sign extension ops
- Non-trapping float-to-int conversions
If the mutated file accidentally introduces opcodes from a proposal, one runtime may decode them correctly while another treats them as invalid.
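One way to make such a mismatch visible is to map each decoded opcode to the proposal it belongs to and compare the result against the enabled feature set. The sketch below is deliberately partial and rests on assumptions: the feature labels are illustrative strings, the function names are hypothetical, and the input must come from a real decoder as `(opcode, sub_opcode)` pairs. Scanning raw bytes would misread immediates as opcodes.

```python
# Partial table: single-byte opcodes added by post-MVP proposals.
SINGLE_BYTE_FEATURES = {
    0xC0: "sign-extension",   # i32.extend8_s
    0xC1: "sign-extension",   # i32.extend16_s
    0xC2: "sign-extension",   # i64.extend8_s
    0xC3: "sign-extension",   # i64.extend16_s
    0xC4: "sign-extension",   # i64.extend32_s
    0xD0: "reference-types",  # ref.null
    0xD1: "reference-types",  # ref.is_null
    0xD2: "reference-types",  # ref.func
}


def required_feature(opcode, sub_opcode=None):
    """Return the proposal an already-decoded opcode belongs to, or None
    if this partial table has no entry for it."""
    if opcode == 0xFD:
        return "simd"
    if opcode == 0xFC:
        if sub_opcode is None:
            raise ValueError("0xFC prefix requires a sub-opcode")
        if 0 <= sub_opcode <= 7:
            return "nontrapping-float-to-int"  # *.trunc_sat_*
        if 8 <= sub_opcode <= 14:
            return "bulk-memory"  # memory.init through table.copy
        if 15 <= sub_opcode <= 17:
            return "reference-types"  # table.grow, table.size, table.fill
        raise ValueError(f"unknown 0xFC sub-opcode {sub_opcode}")
    return SINGLE_BYTE_FEATURES.get(opcode)


def disabled_features_used(opcodes, enabled):
    """Return the sorted features the opcodes need but `enabled` lacks."""
    needed = {required_feature(op, sub) for op, sub in opcodes}
    needed.discard(None)
    return sorted(needed - set(enabled))
```

A return value of `None` only means the table has no entry; it does not mean the opcode is valid. Multi-value is not covered because it shows up in block types rather than in opcodes.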
4. A genuine runtime bug
If the module is valid, uses only supported features, and still behaves differently, you may have found a bug in:
- Decoder logic
- Validator implementation
- Interpreter execution
- JIT compilation
- Memory model enforcement
This is particularly likely when a fuzzed or mutated module passes validation in multiple independent tools but only one runtime computes the wrong result.
How to Reproduce and Compare Runtimes
Before fixing anything, build a reproducible comparison matrix. The goal is to classify the module as invalid, unsupported, or incorrectly executed.
- Validate the binary with at least two independent validators.
- Convert the binary to text to inspect suspicious instructions.
- Run the same exported function with identical inputs across runtimes.
- Record whether each runtime rejects, traps, or returns a value.
- If possible, disable JIT and run in interpreter mode to separate compiler bugs from semantic bugs.
Useful comparison targets include tools from the WebAssembly Binary Toolkit and mainstream runtimes. Always compare versions explicitly because validation behavior changes over time.
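Collecting the matrix by hand is error-prone, so a small harness can help. This is a sketch under assumptions: `run_case` is a hypothetical helper, the argument list is whatever command line you would type for each runtime, and the raw exit code and output are recorded without interpretation because error messages differ between tools and versions.

```python
import subprocess


def run_case(name, argv, timeout=10.0):
    """Run one runtime command and record its raw outcome."""
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        return {"runtime": name, "status": "not-installed"}
    except subprocess.TimeoutExpired:
        return {"runtime": name, "status": "timeout"}
    return {
        "runtime": name,
        "status": "exit",
        "code": proc.returncode,
        "stdout": proc.stdout.strip(),
        "stderr": proc.stderr.strip(),
    }
```

For example, `run_case("wasmtime", ["wasmtime", "mutated.wasm"])` returns one row of the matrix. Classifying a row as reject, trap, or value is left to the reader, since it depends on each tool's diagnostics.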
Step-by-Step Solution
The safest fix is to make your runtime follow a strict pipeline of decode, validate, instantiate, and execute, with hard failure on malformed binaries. Do not continue executing a module that fails validation, even if parts of it appear structurally readable.
Step 1: Validate the wasm binary before execution
Use external tooling first. If the module fails here, your runtime should also reject it.
wasm-validate mutated.wasm
wasm2wat mutated.wasm > mutated.wat
If wasm-validate reports an error, the discrepancy is likely caused by your runtime being too permissive.
Step 2: Compare behavior across runtimes
wasmtime mutated.wasm
wasmer run mutated.wasm
node run_wasm.js
iwasm mutated.wasm
Record the outcome per runtime in a table like this:
- Runtime A: validation error
- Runtime B: trap at function index N
- Runtime C: returns a value
If your runtime is the only one returning a value for an invalid module, the bug is likely in your binary parser or validator.
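Once each runtime's outcome has been reduced to a short label, spotting the odd one out can be automated. The following sketch assumes you have already assigned labels such as `"reject"`, `"trap"`, or `"value:42"` by hand; the function name `find_outliers` and the label strings are hypothetical.

```python
from collections import Counter


def find_outliers(outcomes):
    """outcomes maps runtime name -> outcome label.

    Returns (majority_label, sorted_outlier_names). If no single label
    has a strict plurality, returns (None, all runtime names)."""
    if not outcomes:
        raise ValueError("no outcomes to compare")
    ranked = Counter(outcomes.values()).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None, sorted(outcomes)
    majority = ranked[0][0]
    outliers = sorted(r for r, o in outcomes.items() if o != majority)
    return majority, outliers
```

A majority is evidence, not proof: several runtimes can share a parser or a bug, so treat the result as a pointer to where to look first.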
Step 3: Inspect the disassembly for malformed structure
After converting to WAT, inspect:
- Function signatures
- Block nesting
- Memory load/store offsets
- Branch labels
- Call indices
- Locals declarations
wasm-objdump -x mutated.wasm
wasm-objdump -d mutated.wasm
Look for impossible or suspicious patterns, such as mismatched result arity, unreachable fallthrough with invalid stack state, or opcodes not enabled by your runtime configuration.
Step 4: Enforce strict validation in your runtime
If you maintain the runtime, the fix usually looks like this at a high level:
module = decode_wasm(bytes)
if module.decode_error:
    return ERR_MALFORMED_BINARY
validation_result = validate_module(module, enabled_features)
if !validation_result.ok:
    return ERR_VALIDATION_FAILED
instance = instantiate(module, imports)
if !instance.ok:
    return ERR_INSTANTIATION_FAILED
return execute(instance, entry, args)
Important enforcement rules:
- Reject unknown or disabled opcodes.
- Verify all section lengths and function body boundaries.
- Check type stack consistency for every instruction path.
- Reject invalid branch depths and call targets.
- Enforce memory/table limits before instantiation.
- Do not silently coerce malformed immediates.
Step 5: Test with spec tests and fuzz regressions
Once patched, add both the reported binary and reduced reproductions to your regression suite.
# Example regression layout
/tests/regressions/mutated-invalid-01.wasm
/tests/regressions/mutated-invalid-01.expect
/tests/regressions/mutated-valid-diff-runtime-01.wasm
Also run:
- Official WebAssembly spec tests
- Negative validation tests
- Fuzz-generated corpus
This prevents future decoder or validator changes from reintroducing the bug.
Step 6: Reduce the failing wasm file
If the original archive contains a large mutated file, minimize it until only the problematic instruction sequence remains. This makes it far easier to verify whether the issue is in validation or execution.
# Conceptual workflow
1. Remove unused exports
2. Remove unrelated functions
3. Keep one memory/table if needed
4. Re-run validation and execution after each reduction
5. Stop when the smallest reproducer still differs across runtimes
A minimized reproducer is the most useful artifact for maintainers and often reveals the exact opcode or section causing divergence.
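The reduction loop itself is simple to sketch. The version below is a greedy one-at-a-time reducer, not a full delta-debugging implementation. It assumes the module can be described as a list of removable items (for example function or export indices) and that you supply a hypothetical `still_fails` callback that rebuilds the module from the remaining items, re-validates it, and reports whether the runtimes still disagree.

```python
def reduce_items(items, still_fails):
    """Repeatedly drop single items while the failure still reproduces.

    Returns a list from which no single item can be removed without
    losing the failure."""
    current = list(items)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the smaller list
    return current
```

For instance, `reduce_items(range(10), lambda kept: 3 in kept and 7 in kept)` returns `[3, 7]`. The result is minimal only with respect to single removals, and each callback invocation may be expensive, so this suits modules with a modest number of items.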
Common Edge Cases
Malformed but partially decodable binaries
Some runtimes incorrectly keep parsing after a bad section size or malformed LEB128 field. This can produce wildly different control flow from other tools.
NaN behavior in floating-point operations
If the mutated module heavily uses floating-point math, do not compare only decimal output. Compare trap behavior and, if necessary, raw bit patterns. The specification leaves the sign and payload bits of NaN results partly non-deterministic, so two conforming runtimes can return different NaN bit patterns for the same operation without either one being wrong.
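A comparison that works on raw 32-bit patterns avoids being misled by how NaN values are printed. This is a sketch with hypothetical helper names; it deliberately treats any two NaN results as agreeing, which is looser than the specification's rules for canonical and arithmetic NaNs, so tighten it if your test needs exact payload checks.

```python
F32_EXP_MASK = 0x7F800000
F32_FRAC_MASK = 0x007FFFFF
F32_CANONICAL_NAN = 0x7FC00000  # positive canonical NaN


def is_f32_nan(bits):
    """True if the 32-bit pattern encodes any NaN."""
    return (bits & F32_EXP_MASK) == F32_EXP_MASK and (bits & F32_FRAC_MASK) != 0


def is_canonical_f32_nan(bits):
    """True for the canonical NaN of either sign."""
    return (bits & 0x7FFFFFFF) == F32_CANONICAL_NAN


def f32_results_agree(a_bits, b_bits):
    """Bitwise equal, or both NaN (sign and payload may differ)."""
    if is_f32_nan(a_bits) and is_f32_nan(b_bits):
        return True
    return a_bits == b_bits
```

Note that `+0.0` and `-0.0` have different bit patterns and are reported as a disagreement, which is the intended behavior for non-NaN results.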
Different feature flags in different tools
A module using SIMD or reference types may be valid in one runtime and invalid in another if flags differ. Always align enabled features before concluding there is a semantic bug.
Import mismatches
If the wasm file expects host functions, memories, or globals, a runtime may instantiate it with subtly different host definitions. That can look like an execution bug when the real issue is an import signature mismatch.
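To rule this out, compare what the module imports with what each host actually provides. The sketch below covers function imports only and assumes you have extracted both sides into dictionaries keyed by `(module, name)` with `(params, results)` tuples as values; the function name and the data layout are hypothetical.

```python
def import_mismatches(expected, provided):
    """Return human-readable differences between the function imports a
    module expects and the ones a host provides."""
    problems = []
    for key in sorted(expected):
        module, name = key
        if key not in provided:
            problems.append(f"{module}.{name}: missing from host")
        elif provided[key] != expected[key]:
            problems.append(
                f"{module}.{name}: expected {expected[key]}, "
                f"host provides {provided[key]}"
            )
    return problems
```

An empty list for every runtime under comparison means import signatures are not the cause. Memories, tables, and globals need the same treatment with their own type descriptions.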
Interpreter versus JIT-only failures
If a module succeeds in interpreter mode but fails under JIT, the root cause is usually code generation, register allocation, or optimization around traps and stack values.
Corrupted custom sections confusing tooling
Custom sections should be ignorable, but some tools surface odd diagnostics when they are malformed. Make sure the discrepancy is in executable semantics, not just debug metadata parsing.
FAQ
Why does one runtime execute the mutated wasm file while others reject it?
That usually means the runtime is accepting a binary that should fail WebAssembly validation. The parser or validator may be too permissive, or a feature gate may be incorrectly enabled.
How can I tell whether this is an invalid wasm file or a real runtime bug?
Run the file through independent validators such as tools from WABT and compare multiple mature runtimes. If the module validates everywhere and only one runtime produces the wrong result, it is likely a real runtime bug.
Should I fix this in the executor or the validator?
Start with the validator. Invalid modules should never reach execution. If the module is valid and still misbehaves, then inspect the interpreter or JIT backend.
Practical Resolution Summary
To solve this class of issue, treat the mutated wasm file as potentially malformed until proven otherwise. Validate it with independent tools, align runtime feature flags, reduce the reproducer, and make your runtime reject invalid binaries before instantiation. If the file is valid and execution still differs, the bug is most likely in your runtime’s execution semantics, trap handling, or code generation. That is the path maintainers should focus on when addressing execution results that differ from other WebAssembly runtime tools.