How to Fix: fd_renumber difference


Why fd_renumber behaves differently in WASI: root cause, reproduction, and the fix

A failing fd_renumber test usually means your runtime and your expectations disagree about one subtle detail: whether renumbering a file descriptor behaves like a pure move, a dup-like replacement, or a close-and-rebind operation. The disagreement tends to surface when preopened descriptors, stdio, or an already-open target are involved, which is why the same C test can pass in one WASI implementation and fail in another.

The problem in this test case

The issue titled fd_renumber difference points to a WASI behavior mismatch around descriptor reassignment. In practical terms, the test opens one or more files, calls fd_renumber, and then observes that the resulting descriptor table does not match expected Unix-like semantics.

This commonly shows up when code assumes the target descriptor becomes an alias of the source descriptor while preserving expected read/write behavior, offset state, rights, or close semantics. In WASI, however, fd_renumber is defined at the API layer, and runtimes may differ in how faithfully they map that behavior onto the host operating system.

If your test uses headers such as wasi/api.h, fcntl.h, and unistd.h, the most likely scenario is:

  • A file is opened and assigned a descriptor.
  • Another descriptor already exists at the target number.
  • fd_renumber(from, to) is invoked.
  • The program then checks whether from is invalid, whether to now refers to the original file, and whether reads, writes, or closes behave consistently afterward.

That final step is where implementation differences become visible.

Understanding the Root Cause

The root cause is that WASI file descriptor tables are not just thin wrappers over POSIX descriptors. A WASI runtime typically maintains its own internal descriptor table containing:

  • Rights and capability metadata
  • File type information
  • Preopen state
  • Host descriptor mappings
  • Runtime-managed ownership and lifecycle rules

Because of that, fd_renumber is not always equivalent to calling dup2 on the host.
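To make the contrast concrete, here is a minimal host-side sketch in plain POSIX C (not WASI code) that checks what dup2 does to the source number. The helper names fd_is_open and source_survives_dup2 are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Illustrative helper: 1 if fd names an open descriptor, 0 otherwise. */
int fd_is_open(int fd) {
    return fcntl(fd, F_GETFL) != -1;
}

/*
 * Illustrative experiment: dup2() one end of a pipe onto the other end's
 * number, then report whether the source number is still open.
 * Returns 1 if it is, 0 if it is not, -1 if the setup failed.
 */
int source_survives_dup2(void) {
    int ends[2];
    if (pipe(ends) != 0) {
        return -1;
    }
    int from = ends[0];
    int to = ends[1];
    if (dup2(from, to) == -1) {
        close(from);
        close(to);
        return -1;
    }
    int survived = fd_is_open(from);
    close(from);
    close(to);
    return survived;
}
```

On a POSIX host this is expected to return 1, because dup2 copies the descriptor and keeps the source open. A renumber has to end with the source number invalid, so a runtime that leans on dup2 still owes an explicit removal of the source entry from its own table.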

Technically, the difference happens for a few reasons:

  1. Source invalidation semantics: After a successful fd_renumber, the source descriptor is expected to be removed from the WASI table. Some buggy implementations accidentally leave the source usable or partially alive.
  2. Target replacement semantics: If the target descriptor already exists, it must be atomically replaced. Some runtimes close the target too early, fail to transfer metadata correctly, or mishandle internal references.
  3. Preopened descriptors: WASI preopens are special. Renumbering onto or from a preopen can expose bugs because preopens carry directory mapping metadata that ordinary files do not.
  4. Rights propagation: The descriptor at the new number must preserve the original source descriptor’s rights and capabilities. If the runtime reconstructs the target from the host descriptor instead of moving the WASI entry, it can accidentally alter rights.
  5. Shared offset and state: If the implementation models renumbering incorrectly, file offsets or append state may diverge from what callers expect after the move.

In short, this bug happens when the runtime treats fd_renumber as a host-level descriptor trick instead of a WASI table mutation with strict capability-preserving semantics.

Step-by-Step Solution

The fix is to implement fd_renumber as an atomic move inside the runtime’s descriptor table, not as an ad hoc sequence of host syscalls.

Expected behavior for fd_renumber(from, to) should be:

  1. Validate that from exists and is usable. If from and to are the same number, return success here without touching the table.
  2. If to exists, close and remove the existing WASI entry for to.
  3. Move the full WASI descriptor object from from to to.
  4. Invalidate from.
  5. Preserve all rights, file type, flags, offset-sharing semantics, and attached metadata.
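The steps above can be sketched against a small in-memory table. This is a model under stated assumptions, not any runtime's actual code: the model_entry fields, the MODEL_* codes, and the function names are hypothetical, and installing the entry when to is not currently open is a choice made for this sketch that the steps above do not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SLOTS 16

/* Illustrative result codes, not WASI errno values. */
enum { MODEL_OK = 0, MODEL_BADF = 1 };

typedef struct {
    bool     in_use;
    int      host_fd;            /* host handle backing the entry */
    uint64_t rights_base;        /* rights on the descriptor itself */
    uint64_t rights_inheriting;  /* rights for descriptors opened through it */
    bool     is_preopen;
} model_entry;

typedef struct {
    model_entry slots[MODEL_SLOTS];
    int         released;        /* how many entries have been released */
} model_table;

/* Stand-in for closing the host resource behind an entry. */
void model_release(model_table *t, uint32_t fd) {
    t->slots[fd] = (model_entry){0};
    t->released++;
}

int model_renumber(model_table *t, uint32_t from, uint32_t to) {
    /* Step 1: the source must exist. */
    if (from >= MODEL_SLOTS || to >= MODEL_SLOTS || !t->slots[from].in_use) {
        return MODEL_BADF;
    }
    /* Same number: nothing to move, and nothing may be closed. */
    if (from == to) {
        return MODEL_OK;
    }
    /* Step 2: release whatever currently occupies the target. */
    if (t->slots[to].in_use) {
        model_release(t, to);
    }
    /* Steps 3 and 5: move the whole entry so no field is rebuilt. */
    t->slots[to] = t->slots[from];
    /* Step 4: clear the source without releasing the moved resource. */
    t->slots[from] = (model_entry){0};
    return MODEL_OK;
}
```

Because the entry is copied as a whole and the source slot is cleared rather than released, the rights travel with the descriptor and the underlying resource is never closed as a side effect of the move.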

Use the following workflow to diagnose and fix the issue.

1. Reproduce the problem with a minimal test

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <wasi/api.h>

/* Report the failing step together with errno, then abort the test. */
static void die(const char *msg) {
    perror(msg);
    exit(1);
}

int main(void) {
    int fd1 = open("file1.txt", O_CREAT | O_RDWR, 0644);
    if (fd1 < 0) die("open file1");

    int fd2 = open("file2.txt", O_CREAT | O_RDWR, 0644);
    if (fd2 < 0) die("open file2");

    if (write(fd1, "A", 1) != 1) die("write fd1");
    if (write(fd2, "B", 1) != 1) die("write fd2");

    /* Move fd1 onto fd2; the runtime should release the old fd2 entry. */
    __wasi_errno_t err = __wasi_fd_renumber((__wasi_fd_t)fd1, (__wasi_fd_t)fd2);
    if (err != __WASI_ERRNO_SUCCESS) {
        fprintf(stderr, "fd_renumber failed: %d\n", (int)err);
        return 1;
    }

    /* fd2 should now name file1.txt, so this write goes to that file. */
    if (write(fd2, "C", 1) != 1) die("write fd2 after renumber");

    /* The source number must be gone: expect -1 with errno set to EBADF. */
    errno = 0;
    if (write(fd1, "X", 1) != -1 || errno != EBADF) {
        fprintf(stderr, "BUG: source fd still usable after renumber\n");
        return 1;
    }

    if (close(fd2) != 0) die("close fd2");
    return 0;
}

This test verifies the two most important properties:

  • fd2 should now represent what fd1 used to represent.
  • fd1 should no longer be valid.

2. Audit the runtime implementation

If you maintain the WASI runtime, inspect the fd_renumber implementation and look for code shaped like this:

// Problematic pattern
close(target_host_fd);
dup2(source_host_fd, target_host_fd);
// ...but source remains in the runtime table
// ...or metadata is rebuilt instead of moved

This pattern is usually the source of the mismatch. It updates host descriptors but fails to update the runtime’s own descriptor table correctly.

The correct approach is closer to:

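// Better conceptual model
validate(from);
if (from == to) {
    return success;               // nothing to move, nothing to close
}
if (exists(to)) {
    close_wasi_entry(to);         // release the displaced entry exactly once
}
move_descriptor_entry(from, to);  // transfer the whole entry, metadata included
remove_entry(from);               // leave no live entry at the old number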

3. Preserve the full descriptor object

Do not copy only the host file handle. Move the full internal descriptor structure, including:

  • Access rights
  • Inheriting rights
  • File descriptor flags
  • Preopen metadata
  • Directory state
  • Socket or file type markers

If your implementation reconstructs a new entry for to from a host handle, the bug may survive in more subtle forms even after simple tests pass.

4. Ensure atomic table replacement

If the runtime is multithreaded or async-aware, protect the renumber operation with the same lock used for descriptor table mutations.

lock(fd_table);
validate(from);
if (from != to) {                 // same number: leave the entry untouched
    if (exists(to)) {
        close_entry(to);
    }
    fd_table[to] = fd_table[from];
    erase(fd_table, from);
}
unlock(fd_table);

This prevents transient states where:

  • Both descriptors appear valid
  • Neither descriptor appears valid
  • A concurrent operation closes the wrong underlying resource
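A minimal sketch of that locking discipline with a POSIX mutex follows. The locked_table type and function names are hypothetical, resource_id stands in for the full descriptor object, and releasing a displaced target is left as a comment. The sketch only helps if every other table access (lookup, open, close) takes the same lock, which is an assumption about the surrounding runtime.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define LOCKED_SLOTS 8

typedef struct {
    bool in_use;
    int  resource_id;   /* stands in for the full descriptor object */
} locked_entry;

typedef struct {
    pthread_mutex_t lock;
    locked_entry    slots[LOCKED_SLOTS];
} locked_table;

void locked_table_init(locked_table *t) {
    pthread_mutex_init(&t->lock, NULL);
    for (int i = 0; i < LOCKED_SLOTS; i++) {
        t->slots[i] = (locked_entry){ false, 0 };
    }
}

/* Returns 0 on success, -1 if `from` is not live or a number is out of range. */
int locked_renumber(locked_table *t, unsigned from, unsigned to) {
    int rc = -1;
    pthread_mutex_lock(&t->lock);
    if (from < LOCKED_SLOTS && to < LOCKED_SLOTS && t->slots[from].in_use) {
        if (from != to) {
            /* A real runtime would release the displaced entry here. */
            t->slots[to] = t->slots[from];
            t->slots[from] = (locked_entry){ false, 0 };
        }
        rc = 0;
    }
    /* Single exit path, so the lock is dropped on success and on failure. */
    pthread_mutex_unlock(&t->lock);
    return rc;
}
```

Both slot updates happen inside one critical section, so another thread holding the same lock discipline sees either the state before the move or the state after it, never the halfway point.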

5. Add conformance tests

To avoid regressions, add tests for these exact behaviors:

// Assertions to add in your test suite
// 1. source becomes invalid
// 2. target now refers to the old source resource
// 3. rights are preserved
// 4. existing target is replaced
// 5. closing target closes the moved resource only once

Also test renumbering across:

  • Regular files
  • Directories
  • Preopened directories
  • stdio descriptors when allowed by the runtime
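Assertions 1 and 5 can share two small helpers. The sketch below uses only standard libc calls; the helper names are made up for this example, and it assumes the libc forwards fcntl(fd, F_GETFL) to the runtime so that an invalid number comes back as EBADF.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* 1 if the descriptor number is rejected with EBADF, 0 otherwise. */
int fd_is_invalid(int fd) {
    errno = 0;
    return fcntl(fd, F_GETFL) == -1 && errno == EBADF;
}

/*
 * 1 if the first close() succeeds and a second close() of the same number
 * fails with EBADF. Only meaningful in a single-threaded test, where nothing
 * can reuse the number between the two calls.
 */
int closes_exactly_once(int fd) {
    if (close(fd) != 0) {
        return 0;
    }
    errno = 0;
    return close(fd) == -1 && errno == EBADF;
}
```

In a renumber test, fd_is_invalid(from) covers the first assertion right after the call, and closes_exactly_once(to) covers the fifth at the end of the test.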

6. Application-level workaround

If you are consuming a runtime you do not control and need a temporary workaround, avoid depending on fd_renumber for critical logic. Instead:

  • Open resources in the exact descriptor order you need when possible
  • Use runtime-specific configuration for stdio mapping
  • Avoid renumbering preopened descriptors
  • Test explicitly on each target runtime

Example defensive wrapper:

__wasi_errno_t safe_renumber(__wasi_fd_t from, __wasi_fd_t to) {
    if (from == to) {
        return __WASI_ERRNO_SUCCESS;
    }
    return __wasi_fd_renumber(from, to);
}

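This does not fix a broken runtime, but it avoids one common edge condition and makes failures easier to isolate. Note that the shortcut also reports success when from is not an open descriptor, so validate from separately if that case matters to your code.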

Common Edge Cases

Even after fixing the main bug, several edge cases can still break fd_renumber behavior.

Renumbering onto the same descriptor

If from == to, the runtime should typically treat the operation as a no-op success. Incorrect implementations may close the descriptor or accidentally invalidate it.

Renumbering a preopened directory

Preopen descriptors often include path mapping metadata used for sandboxed filesystem access. If that metadata is lost during renumbering, later path-based calls may fail even though the descriptor itself seems valid.

Renumbering over stdio

Moving a file descriptor onto 0, 1, or 2 can expose assumptions in the runtime about standard input, output, or error streams. Some runtimes special-case these descriptors and may not fully support replacement.

Double-close bugs

If the old target entry is closed and the moved source entry still points to the same underlying host object incorrectly, later cleanup can trigger a second close. This may surface as intermittent I/O failures or unrelated descriptor corruption.

Rights mismatch after renumber

If the runtime reconstructs the target entry instead of moving it, capability rights may change unexpectedly. A descriptor that previously supported writes may start failing with permission-related errors.

Offset inconsistencies

Some tests reveal that the file offset after renumbering is wrong. This usually means the implementation duplicated the underlying handle in a way that changed state sharing rather than performing a true table move.
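The underlying host behavior is easy to observe with a plain POSIX sketch (not WASI code). The function name is hypothetical and the example assumes a writable /tmp. It writes through one descriptor and then reads the offset through a second one that is either a dup of the first or a fresh open of the same path.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Writes 3 bytes through one descriptor, then reports the offset seen
 * through a second one. With use_dup != 0 the second descriptor is a dup()
 * of the first; otherwise the file is opened again by path.
 * Returns the observed offset, or -1 if the setup failed.
 */
long offset_seen_by_second_fd(int use_dup) {
    char path[] = "/tmp/fd_offset_XXXXXX";   /* assumes a writable /tmp */
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }

    long seen = -1;
    int other = use_dup ? dup(fd) : open(path, O_RDWR);
    if (other >= 0) {
        if (write(fd, "abc", 3) == 3) {
            seen = (long)lseek(other, 0, SEEK_CUR);
        }
        close(other);
    }
    close(fd);
    unlink(path);
    return seen;
}
```

On a POSIX host the dup case is expected to report offset 3 and the reopen case offset 0. A runtime that rebuilds the target entry by reopening the file, rather than moving the existing entry, would show the second pattern after a renumber.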

FAQ

Why does fd_renumber differ from POSIX dup2?

Because WASI descriptors are capability-based runtime objects, not just raw OS file descriptors. The runtime must preserve rights, metadata, and sandboxing semantics in addition to replacing the descriptor number.

Should the source descriptor still work after fd_renumber?

No. After a successful fd_renumber(from, to), the source descriptor should be invalidated. If reads or writes still succeed on from, that usually indicates a runtime bug.

Can this bug affect only some WASI runtimes?

Yes. This is often an implementation-specific conformance issue. One runtime may correctly move the WASI table entry, while another may emulate the operation with host syscalls and accidentally preserve incorrect state.

Final takeaway

The reliable fix for fd_renumber difference is to treat renumbering as a WASI descriptor table move, not a best-effort host-level duplication. Once the runtime atomically replaces the target entry, preserves the full source metadata, and invalidates the source descriptor, the test behavior becomes consistent and matches caller expectations.

If you are debugging this in a runtime codebase, start by auditing how fd_renumber updates internal descriptor ownership. That is almost always where the real bug lives.
