How to Fix: Execution results of given wasm file different from other runtime tools

A mutated WebAssembly binary that produces different results across runtimes is rarely a random execution glitch. It signals that the module has crossed into territory where validation rules, spec-permitted non-determinism, implementation bugs, or engine-specific recovery paths diverge. When one runtime accepts and executes a malformed or edge-case .wasm file while others trap, reject it, or compute different values, the real problem is almost always in how the binary is parsed and validated, or in how out-of-spec instructions are handled after mutation.

Problem Overview

The issue describes a wasm test case created by mutating a previously valid binary. That detail matters. Once a valid module is mutated at the binary level, several things can happen:

  • The binary becomes formally invalid but is still partially accepted by a permissive runtime.
  • The mutation preserves binary structure but changes instruction semantics, causing traps or different computed values.
  • The mutated module triggers a runtime bug in one engine but not in others.
  • The binary relies on behavior that is implementation-defined, unsupported, or proposal-specific.

If your runtime produces results different from tools such as Wasmtime, Wasmer, WAMR, Node.js, or browser engines, do not assume the other tools are wrong. The first task is to determine whether the wasm file is still valid according to the spec and whether all compared runtimes support the same feature set.

Understanding the Root Cause

The technical root cause usually falls into one of four buckets.

1. The mutated binary violates WebAssembly validation rules

WebAssembly runtimes are expected to reject invalid binaries during decode or validation. However, different runtimes may:

  • Reject immediately with a validation error
  • Accept due to a parser bug
  • Accept because a feature flag changes interpretation
  • Mis-handle malformed sections and continue execution

This is especially common when mutation changes:

  • Section lengths
  • Type signatures
  • Block stack discipline
  • Memory alignment immediates
  • Branch targets
  • Function body size metadata
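To see whether a mutation damaged the section framing, it can help to walk the section headers by hand before involving any runtime. The following is a minimal sketch, assuming the standard layout of an 8-byte header followed by sections that each carry a one-byte id and a LEB128 size; the function names are illustrative and not taken from any particular runtime.

```python
WASM_MAGIC = b"\x00asm"
WASM_VERSION = b"\x01\x00\x00\x00"


def read_u32_leb128(data, pos):
    """Decode an unsigned 32-bit LEB128 value; return (value, next_pos)."""
    result = 0
    for i in range(5):  # a u32 occupies at most 5 bytes
        if pos >= len(data):
            raise ValueError("LEB128 value runs past end of file")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            if result > 0xFFFFFFFF:
                raise ValueError("LEB128 value does not fit in 32 bits")
            return result, pos
    raise ValueError("LEB128 value longer than 5 bytes")


def walk_sections(data):
    """Return (section_id, payload_start, payload_size) for each section."""
    if data[:4] != WASM_MAGIC or data[4:8] != WASM_VERSION:
        raise ValueError("bad magic or version")
    sections = []
    pos = 8
    while pos < len(data):
        section_id = data[pos]
        size, payload_start = read_u32_leb128(data, pos + 1)
        payload_end = payload_start + size
        if payload_end > len(data):
            raise ValueError(f"section {section_id} overruns the file")
        sections.append((section_id, payload_start, size))
        pos = payload_end
    return sections
```

This only checks framing. It does not verify section ordering, section ids, or anything inside the payloads, so a module that passes here can still be invalid.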

2. The wasm file triggers trap-prone or non-deterministic execution paths

Even if the binary validates, mutation may produce code that:

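  • Reads locals or memory that the original program would have written first (Wasm zero-initializes both, so the read is well defined but the value is no longer meaningful)
  • Causes integer division traps
  • Performs out-of-bounds memory access
  • Produces NaN values whose sign and payload bits the spec allows to vary between runtimes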
  • Depends on host imports with mismatched signatures

In these cases, one runtime may trap while another appears to continue if it contains a bug in bounds checking or stack handling.
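Integer division is a good place to check a runtime against the spec, because the trapping cases are few and exact. The sketch below is a small reference model of signed 32-bit division and remainder; the Trap class and function names are illustrative, and the inputs are assumed to be already in signed 32-bit range.

```python
INT32_MIN = -(2 ** 31)


class Trap(Exception):
    """Models a WebAssembly trap."""


def i32_div_s(lhs, rhs):
    """Signed 32-bit division: traps on zero divisor and on overflow."""
    if rhs == 0:
        raise Trap("integer divide by zero")
    if lhs == INT32_MIN and rhs == -1:
        raise Trap("integer overflow")
    # Wasm truncates toward zero; Python's // floors, so divide magnitudes.
    quotient = abs(lhs) // abs(rhs)
    return -quotient if (lhs < 0) != (rhs < 0) else quotient


def i32_rem_s(lhs, rhs):
    """Signed 32-bit remainder: traps only on zero divisor."""
    if rhs == 0:
        raise Trap("integer divide by zero")
    # The result takes the sign of the dividend; INT32_MIN rem -1 is 0, not a trap.
    magnitude = abs(lhs) % abs(rhs)
    return -magnitude if lhs < 0 else magnitude
```

The INT32_MIN and -1 pair is worth including in any comparison run: division must trap, while remainder must return 0, and a backend that maps both onto one hardware instruction can get the second case wrong.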

3. Feature mismatches across runtimes

Many runtime discrepancies are caused by comparing engines with different support for:

  • Reference types
  • SIMD
  • Multi-value
  • Bulk memory
  • Sign extension ops
  • Non-trapping float-to-int conversions

If the mutated file accidentally introduces opcodes from a proposal, one runtime may decode them correctly while another treats them as invalid.
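One way to spot this is to map the opcodes that appear in the disassembly back to the proposal that introduced them. The sketch below assumes the opcodes have already been decoded into (opcode, sub_opcode) pairs; the opcode numbers reflect the binary encoding as commonly documented and should be checked against the spec version your runtimes target. Multi-value is not covered, because it shows up in block and function types rather than in opcodes.

```python
# Single-byte opcodes introduced by post-MVP proposals.
SINGLE_BYTE_PROPOSALS = {
    0x1C: "reference-types",  # select with explicit type
    0x25: "reference-types",  # table.get
    0x26: "reference-types",  # table.set
    0xD0: "reference-types",  # ref.null
    0xD1: "reference-types",  # ref.is_null
    0xD2: "reference-types",  # ref.func
}
# 0xC0..0xC4 are the sign-extension operators (i32.extend8_s and friends).
SINGLE_BYTE_PROPOSALS.update({op: "sign-extension" for op in range(0xC0, 0xC5)})


def proposal_for_opcode(opcode, sub_opcode=None):
    """Return the proposal an opcode belongs to, or None for MVP or unknown."""
    if opcode == 0xFD:
        # Relaxed-SIMD shares this prefix and is not distinguished here.
        return "simd"
    if opcode == 0xFC and sub_opcode is not None:
        if 0 <= sub_opcode <= 7:
            return "nontrapping-float-to-int"
        if 8 <= sub_opcode <= 14:
            return "bulk-memory"
        if 15 <= sub_opcode <= 17:
            return "reference-types"  # table.grow, table.size, table.fill
        return None
    return SINGLE_BYTE_PROPOSALS.get(opcode)


def missing_features(opcodes, enabled):
    """List proposals used by (opcode, sub_opcode) pairs but not in `enabled`."""
    needed = {proposal_for_opcode(op, sub) for op, sub in opcodes}
    needed.discard(None)
    return sorted(needed - set(enabled))
```

Running missing_features once per runtime, with that runtime's enabled feature set, shows which engines are expected to reject the module for feature reasons alone.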

4. A genuine runtime bug

If the module is valid, uses only supported features, and still behaves differently, you may have found a bug in:

  • Decoder logic
  • Validator implementation
  • Interpreter execution
  • JIT compilation
  • Memory model enforcement

This is particularly likely when a fuzzed or mutated module passes validation in multiple independent tools but only one runtime computes the wrong result.

How to Reproduce and Compare Runtimes

Before fixing anything, build a reproducible comparison matrix. The goal is to classify the module as invalid, unsupported, or incorrectly executed.

  1. Validate the binary with at least two independent validators.
  2. Convert the binary to text to inspect suspicious instructions.
  3. Run the same exported function with identical inputs across runtimes.
  4. Record whether each runtime rejects, traps, or returns a value.
  5. If possible, disable JIT and run in interpreter mode to separate compiler bugs from semantic bugs.

Useful comparison targets include tools from the WebAssembly Binary Toolkit (WABT) and mainstream runtimes. Always compare versions explicitly because validation behavior changes over time.

Step-by-Step Solution

The safest fix is to make your runtime follow a strict pipeline of decode, validate, instantiate, and execute, with a hard failure on malformed binaries. Do not continue executing a module that fails validation, even if parts of it appear structurally readable.

Step 1: Validate the wasm binary before execution

Use external tooling first. If the module fails here, your runtime should also reject it.

wasm-validate mutated.wasm
wasm2wat mutated.wasm > mutated.wat

If wasm-validate reports an error, the discrepancy is likely caused by your runtime being too permissive.

Step 2: Compare behavior across runtimes

wasmtime mutated.wasm
wasmer run mutated.wasm
node run_wasm.js
iwasm mutated.wasm

Record the outcomes in a table like this:

  • Runtime A: validation error
  • Runtime B: trap at function index N
  • Runtime C: returns incorrect value

If your runtime is the only one returning a value for an invalid module, the bug is likely in your binary parser or validator.
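Once the outcomes are recorded, the classification itself is mechanical. The sketch below assumes each runtime's outcome has been reduced to a (kind, detail) pair, where kind is "reject", "trap", or "value"; that representation and the category names are illustrative choices, not output from any of the tools above.

```python
def classify(outcomes):
    """Classify a {runtime_name: (kind, detail)} mapping.

    kind is "reject", "trap", or "value"; detail is a message or the
    returned value. Messages are not compared, because wording differs
    between runtimes even when they agree.
    """
    kinds = {kind for kind, _ in outcomes.values()}
    if "reject" in kinds and len(kinds) > 1:
        return "validation-divergence"  # some accept what others reject
    if kinds == {"trap", "value"}:
        return "trap-divergence"        # some trap where others return
    if kinds == {"value"}:
        values = {detail for _, detail in outcomes.values()}
        if len(values) > 1:
            return "value-divergence"   # all return, but results differ
    return "consistent"


# Example: one runtime returns a value for a module the others reject.
report = classify({
    "runtime_a": ("reject", "type mismatch"),
    "runtime_b": ("reject", "invalid module"),
    "runtime_c": ("value", 42),
})
```

A validation-divergence result points at the decoder or validator of the accepting runtime, while trap-divergence and value-divergence point at execution, which matches the order in which the remaining steps investigate.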

Step 3: Inspect the disassembly for malformed structure

After converting to WAT, inspect:

  • Function signatures
  • Block nesting
  • Memory load/store offsets
  • Branch labels
  • Call indices
  • Locals declarations

wasm-objdump -x mutated.wasm
wasm-objdump -d mutated.wasm

Look for impossible or suspicious patterns, such as mismatched result arity, unreachable fallthrough with invalid stack state, or opcodes not enabled by your runtime configuration.

Step 4: Enforce strict validation in your runtime

If you maintain the runtime, the fix usually looks like this at a high level:

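# High-level sketch: decode_wasm, validate_module, instantiate, and execute
# stand in for your runtime's own entry points.
module = decode_wasm(wasm_bytes)
if module.decode_error:
    return ERR_MALFORMED_BINARY      # stop: never validate a half-decoded module

validation_result = validate_module(module, enabled_features)
if not validation_result.ok:
    return ERR_VALIDATION_FAILED     # stop: invalid modules never reach instantiation

instance = instantiate(module, imports)
if not instance.ok:
    return ERR_INSTANTIATION_FAILED  # for example, an import mismatch

return execute(instance, entry, args)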

Important enforcement rules:

  • Reject unknown or disabled opcodes.
  • Verify all section lengths and function body boundaries.
  • Check type stack consistency for every instruction path.
  • Reject invalid branch depths and call targets.
  • Enforce memory/table limits before instantiation.
  • Do not silently coerce malformed immediates.
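The type stack rule is the one most often weakened by accident, so a concrete picture helps. The following is a toy sketch of stack checking for straight-line code over a handful of instructions; the signature table and function name are illustrative, and a real validator also has to handle control flow, locals, and the stack-polymorphic behavior after unreachable code.

```python
# Operand and result types for a small, illustrative instruction subset.
SIGNATURES = {
    "i32.const": ([], ["i32"]),
    "i64.const": ([], ["i64"]),
    "i32.add": (["i32", "i32"], ["i32"]),
    "i64.add": (["i64", "i64"], ["i64"]),
    "i32.wrap_i64": (["i64"], ["i32"]),
    "i64.extend_i32_s": (["i32"], ["i64"]),
}


class ValidationError(Exception):
    pass


def validate_straight_line(instrs, results):
    """Check operand types for a branch-free instruction sequence."""
    stack = []
    for index, name in enumerate(instrs):
        if name == "drop":  # drop accepts a value of any type
            if not stack:
                raise ValidationError(f"{index}: drop on empty stack")
            stack.pop()
            continue
        if name not in SIGNATURES:
            raise ValidationError(f"{index}: unknown or disabled opcode {name}")
        params, outputs = SIGNATURES[name]
        for expected in reversed(params):  # operands pop in reverse order
            if not stack:
                raise ValidationError(f"{index}: stack underflow in {name}")
            actual = stack.pop()
            if actual != expected:
                raise ValidationError(
                    f"{index}: {name} expected {expected}, found {actual}")
        stack.extend(outputs)
    if stack != list(results):
        raise ValidationError(f"end: stack {stack} does not match {list(results)}")
```

Note that the final check is an exact match: leftover values on the stack are as much a validation error as missing ones, and a runtime that tolerates them will execute modules that others reject.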

Step 5: Test with spec tests and fuzz regressions

Once patched, add both the reported binary and reduced reproductions to your regression suite.

# Example regression layout
/tests/regressions/mutated-invalid-01.wasm
/tests/regressions/mutated-invalid-01.expect
/tests/regressions/mutated-valid-diff-runtime-01.wasm

Also run:

  • Official WebAssembly spec tests
  • Negative validation tests
  • Fuzz-generated corpus

This prevents future decoder or validator changes from reintroducing the bug.

Step 6: Reduce the failing wasm file

If the original archive contains a large mutated file, minimize it until only the problematic instruction sequence remains. This makes it far easier to verify whether the issue is in validation or execution.

# Conceptual workflow
1. Remove unused exports
2. Remove unrelated functions
3. Keep one memory/table if needed
4. Re-run validation and execution after each reduction
5. Stop when no further removal preserves the difference across runtimes

A minimized reproducer is the most useful artifact for maintainers and often reveals the exact opcode or section causing divergence.
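The reduction loop can be automated once there is a yes-or-no check for "the runtimes still disagree". The sketch below is a simple greedy reducer over a list of removable items such as function or export names; the still_fails callback is an assumption standing in for whatever rebuilds the module and reruns the comparison. Dedicated reducers such as Binaryen's wasm-reduce use more elaborate strategies.

```python
def reduce_items(items, still_fails):
    """Greedily drop items while `still_fails(remaining)` stays true.

    `still_fails` is supplied by the caller; it should rebuild the module
    from the remaining items and report whether runtimes still disagree.
    """
    current = list(items)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate  # removal kept the failure: accept it
                changed = True
                break                # restart the scan on the smaller list
    return current


# Example with a stand-in predicate: the divergence needs both f1 and f3.
kept = reduce_items(
    ["f0", "f1", "f2", "f3"],
    lambda remaining: "f1" in remaining and "f3" in remaining,
)
```

The result is minimal only in the sense that no single remaining item can be removed; that is usually enough to make the offending instruction sequence easy to find by eye.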

Common Edge Cases

Malformed but partially decodable binaries

Some runtimes incorrectly keep parsing after a bad section size or malformed LEB128 field. This can produce wildly different control flow from other tools.

NaN behavior in floating-point operations

If the mutated module makes heavy use of floating-point math, do not compare only decimal output. Compare trap behavior and, if necessary, raw bit patterns. The spec leaves the sign and payload bits of many NaN results non-deterministic, so two correct runtimes can print different NaNs; a difference confined to NaN bits is not, by itself, evidence of a bug.
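A comparison helper along these lines keeps NaN noise out of the results while still catching real differences such as a flipped sign on zero. This is a minimal sketch for 64-bit floats; the function names are illustrative, and treating all NaNs as equal is a policy choice that should be tightened if the module is expected to produce canonical NaNs.

```python
import math
import struct


def f64_bits(value):
    """Return the raw IEEE 754 bit pattern of a float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def same_f64_result(a, b):
    """Compare two f64 results from different runtimes.

    Any NaN matches any other NaN (sign and payload ignored); everything
    else must be bit-identical, so 0.0 and -0.0 count as different.
    """
    if math.isnan(a) and math.isnan(b):
        return True
    return f64_bits(a) == f64_bits(b)
```

Comparing bits rather than using == matters here because 0.0 == -0.0 is true in most languages, which would hide a genuine sign error.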

Different feature flags in different tools

A module using SIMD or reference types may be valid in one runtime and invalid in another if flags differ. Always align enabled features before concluding there is a semantic bug.

Import mismatches

If the wasm file expects host functions, memories, or globals, a runtime may instantiate it with subtly different host definitions. That can look like an execution bug when the real issue is an import signature mismatch.
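Before comparing results, it is worth confirming that every harness supplies the same imports. The sketch below assumes the module's expected imports and the host's provided ones have each been written down as a mapping from (module, name) to a (params, results) pair; that representation is illustrative and would need to be filled in from the disassembly and from each harness.

```python
def import_mismatches(expected, provided):
    """Report imports that are missing or have a different signature.

    Both arguments map (module_name, field_name) to (params, results),
    where params and results are tuples of value-type names.
    """
    problems = []
    for key, signature in sorted(expected.items()):
        label = f"{key[0]}.{key[1]}"
        if key not in provided:
            problems.append(f"{label}: missing from host")
        elif provided[key] != signature:
            problems.append(
                f"{label}: module expects {signature}, host provides {provided[key]}")
    return problems


# Example: the host's log function takes i64 where the module expects i32.
issues = import_mismatches(
    {("env", "log"): (("i32",), ()), ("env", "now"): ((), ("i64",))},
    {("env", "log"): (("i64",), ())},
)
```

An empty list for every runtime's harness rules out the import explanation and lets the investigation move on to execution.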

Interpreter versus JIT-only failures

If a module succeeds in interpreter mode but fails under JIT, the root cause is usually code generation, register allocation, or optimization around traps and stack values.

Corrupted custom sections confusing tooling

Custom sections should be ignorable, but some tools surface odd diagnostics when they are malformed. Make sure the discrepancy is in executable semantics, not just debug metadata parsing.

FAQ

Why does one runtime execute the mutated wasm file while others reject it?

That usually means the runtime is accepting a binary that should fail WebAssembly validation. The parser or validator may be too permissive, or a feature gate may be incorrectly enabled.

How can I tell whether this is an invalid wasm file or a real runtime bug?

Run the file through independent validators such as tools from WABT and compare multiple mature runtimes. If the module validates everywhere and only one runtime produces the wrong result, it is likely a real runtime bug.

Should I fix this in the executor or the validator?

Start with the validator. Invalid modules should never reach execution. If the module is valid and still misbehaves, then inspect the interpreter or JIT backend.

Practical Resolution Summary

To solve this class of issue, treat the mutated wasm file as potentially malformed until proven otherwise. Validate it with independent tools, align runtime feature flags, reduce the reproducer, and make your runtime reject invalid binaries before instantiation. If the file is valid and execution still differs, the bug is most likely in your runtime’s execution semantics, trap handling, or code generation, and that is where maintainers should focus.
