How to Fix: [wasi-sockets] Fresh UDP sockets occasionally return `ErrorCode::WouldBlock` on first send

A newly created UDP socket should feel ready immediately, but under WASI Preview 2 sockets that assumption can be wrong. The intermittent first-send failure with ErrorCode::WouldBlock is a readiness race: the socket exists, yet the host networking layer has not fully transitioned it into a writable state when the first datagram is attempted.

This matters for projects like mio, where tests often assume that a fresh datagram socket can send immediately after creation or connection. On traditional operating systems, that usually works. In WASIp2, however, the socket API is more explicit about asynchronous state transitions, so a first send can legitimately report WouldBlock until the runtime signals write readiness.

Understanding the Root Cause

The core issue is that socket creation, binding, and even connecting do not always imply immediate send readiness in the WASI sockets model. A host implementation may need an extra scheduling turn to finish internal setup before outbound datagrams are accepted. If application code calls send or send_to too early, the host can correctly respond with ErrorCode::WouldBlock.

Why does it look flaky instead of failing every time?

  • Timing-sensitive initialization: some runs allow the host to complete socket setup before the first send, others do not.
  • Event loop differences: test runners, CI machines, and local environments can produce different scheduling behavior.
  • Mismatch in assumptions: code ported from Unix-like networking often assumes “fresh socket means writable now,” but WASI readiness is more explicit.

In practice, this means the bug is not usually that UDP is broken. The bug is that the caller treats an initial WouldBlock as an unexpected hard failure instead of a normal non-blocking readiness signal.

For mio and similar libraries, the fix is to model a fresh WASI UDP socket like any other non-blocking I/O resource: wait for write readiness, then retry the send.
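The race can be reproduced without a WASI runtime. The following is a minimal sketch using only the standard library: `FreshSocket` and `send_when_ready` are hypothetical names for illustration, not part of mio or any WASI binding, and the `wait` closure stands in for whatever readiness wait the real poller provides.

```rust
use std::io::{self, ErrorKind};

/// Hypothetical stand-in for a fresh WASI UDP socket: the first
/// `not_ready_calls` sends report WouldBlock, later sends succeed.
struct FreshSocket {
    not_ready_calls: u32,
}

impl FreshSocket {
    fn send(&mut self, buf: &[u8]) -> io::Result<usize> {
        if self.not_ready_calls > 0 {
            self.not_ready_calls -= 1;
            return Err(ErrorKind::WouldBlock.into());
        }
        Ok(buf.len())
    }
}

/// Retries the send after each WouldBlock, calling `wait` in between.
/// Any other error is returned to the caller unchanged.
fn send_when_ready(
    socket: &mut FreshSocket,
    buf: &[u8],
    mut wait: impl FnMut(),
) -> io::Result<usize> {
    loop {
        match socket.send(buf) {
            Ok(n) => return Ok(n),
            Err(e) if e.kind() == ErrorKind::WouldBlock => wait(),
            Err(e) => return Err(e),
        }
    }
}
```

With `not_ready_calls: 1`, the first send reports WouldBlock, `wait` runs once, and the second attempt sends the whole buffer. A caller that unwraps the first result would fail on exactly the run where the socket was not ready yet.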

Step-by-Step Solution

The most reliable fix is to make the first UDP send resilient to an initial WouldBlock. The steps below cover three layers: test logic, the socket abstraction, and polling integration.

1. Treat WouldBlock as a retryable readiness condition

If your current code fails the test immediately on the first send error, change it so that ErrorCode::WouldBlock triggers a wait-for-writable path instead.

loop {
    match udp_socket.send(&buf) {
        Ok(n) => {
            assert!(n > 0);
            break;
        }
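        Err(ErrorCode::WouldBlock) => {
            // Not ready yet: wait for write readiness, then retry.
            // The helper takes a mutable reference to the socket.
            wait_until_writable(&mut udp_socket)?;
            continue;
        }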
        Err(e) => return Err(e.into()),
    }
}

This is the single most important behavioral fix. The first send is no longer assumed to succeed synchronously.

2. Wait for write readiness through the poller

When integrating with mio-style polling, register the socket and wait for a writable event before retrying.

use mio::{Events, Interest, Poll, Token};

const UDP: Token = Token(0);

fn wait_until_writable(socket: &mut impl mio::event::Source) -> std::io::Result<()> {
    let mut poll = Poll::new()?;
    let mut events = Events::with_capacity(8);

    poll.registry().register(socket, UDP, Interest::WRITABLE)?;

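    loop {
        // Block until the poller reports at least one event.
        poll.poll(&mut events, None)?;
        for event in events.iter() {
            if event.token() == UDP && event.is_writable() {
                // Deregister before returning so the socket can be
                // registered again on the next retry.
                poll.registry().deregister(socket)?;
                return Ok(());
            }
        }
    }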
}

In a real abstraction, you would usually reuse an existing poller instead of creating one per retry. The important detail is that the socket must be given a chance to become writable before retrying the datagram send.

3. Adjust UDP tests so they do not assume immediate first-send success

Many test failures happen because test code is stricter than the API contract. Instead of this:

socket.send(&payload).unwrap();

use a retry-aware helper:

fn send_with_retry(socket: &mut UdpSocket, payload: &[u8]) -> std::io::Result<usize> {
    loop {
        match socket.send(payload) {
            Ok(n) => return Ok(n),
            Err(ref e) if e.kind() == std::io::ErrorKind::WouldBlock => {
                wait_until_writable(socket)?;
            }
            Err(e) => return Err(e),
        }
    }
}

This keeps the test aligned with non-blocking socket semantics and removes timing-based flakes.

4. If wrapping WASI directly, preserve the non-blocking contract

If you are building a portability layer between WASIp2 and Rust networking traits, avoid converting initial WouldBlock into an unexpected fatal error. Instead, map it consistently to std::io::ErrorKind::WouldBlock.

fn map_wasi_error(err: wasi::sockets::network::ErrorCode) -> std::io::Error {
    match err {
        wasi::sockets::network::ErrorCode::WouldBlock => {
            std::io::Error::from(std::io::ErrorKind::WouldBlock)
        }
        other => std::io::Error::new(std::io::ErrorKind::Other, format!("{:?}", other)),
    }
}

This ensures upper layers such as mio, test harnesses, or async runtimes can apply normal retry behavior.
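The same idea extends to other error codes, so that callers can match on specific `std::io::ErrorKind` values instead of a catch-all. The sketch below uses a local stand-in enum, `WasiErrorCode`, with a handful of variants named after `wasi:sockets` error codes. It is not the real binding type, and the choice of which `ErrorKind` each code maps to is an assumption made for illustration.

```rust
use std::io::{self, ErrorKind};

/// Local stand-in for a few `wasi:sockets` error codes.
/// This is NOT the real binding type; it only keeps the sketch self-contained.
#[derive(Debug, Clone, Copy, PartialEq)]
enum WasiErrorCode {
    WouldBlock,
    Timeout,
    ConnectionRefused,
    ConnectionReset,
    AddressInUse,
    InvalidArgument,
    Unknown,
}

/// Maps each code to the closest `ErrorKind`, keeping WouldBlock intact
/// so retry logic in upper layers still recognizes it.
fn to_io_error(code: WasiErrorCode) -> io::Error {
    let kind = match code {
        WasiErrorCode::WouldBlock => ErrorKind::WouldBlock,
        WasiErrorCode::Timeout => ErrorKind::TimedOut,
        WasiErrorCode::ConnectionRefused => ErrorKind::ConnectionRefused,
        WasiErrorCode::ConnectionReset => ErrorKind::ConnectionReset,
        WasiErrorCode::AddressInUse => ErrorKind::AddrInUse,
        WasiErrorCode::InvalidArgument => ErrorKind::InvalidInput,
        WasiErrorCode::Unknown => ErrorKind::Other,
    };
    io::Error::new(kind, format!("wasi socket error: {:?}", code))
}
```

The key property is that `to_io_error(WasiErrorCode::WouldBlock).kind()` is `ErrorKind::WouldBlock`, which is what retry helpers check for.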

5. Prefer readiness-driven connection flow for connected UDP sockets

For connected UDP sockets, do not assume that immediately after connect the socket is writable on every host implementation. If a first send can race, gate it behind readiness just like any other non-blocking operation.

socket.connect(remote_addr)?;

loop {
    match socket.send(&payload) {
        Ok(_) => break,
        Err(ref e) if e.kind() == std::io::ErrorKind::WouldBlock => {
            wait_until_writable(&mut socket)?;
        }
        Err(e) => return Err(e),
    }
}

This makes the behavior robust across runtimes that schedule socket state transitions differently.

6. Keep retries bounded in tests if needed

If you want tests to fail fast when something is truly broken, use a timeout instead of infinite retries.

use std::time::{Duration, Instant};

fn send_with_deadline(socket: &mut UdpSocket, payload: &[u8]) -> std::io::Result<usize> {
    let deadline = Instant::now() + Duration::from_secs(2);

    loop {
        match socket.send(payload) {
            Ok(n) => return Ok(n),
            Err(ref e) if e.kind() == std::io::ErrorKind::WouldBlock => {
                if Instant::now() >= deadline {
                    return Err(std::io::Error::new(
                        std::io::ErrorKind::TimedOut,
                        "UDP socket never became writable"
                    ));
                }
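                // Caveat: the wait_until_writable helper shown earlier polls
                // with no timeout, so this call can block past the deadline.
                // A real helper should pass the remaining time to poll.poll(..).
                wait_until_writable(socket)?;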
            }
            Err(e) => return Err(e),
        }
    }
}

This distinguishes a normal transient readiness delay from a real regression.
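A deadline only helps if the wait itself is bounded. One way to arrange that, sketched here with the standard library only, is to hand the remaining time budget to the wait step. `send_before_deadline` and its two closure parameters are hypothetical names: `send` stands in for the socket send, and `wait_writable` for a poller wait that accepts a timeout (with mio, that could be `poll.poll(&mut events, Some(remaining))`).

```rust
use std::io::{self, ErrorKind};
use std::time::{Duration, Instant};

/// Retries `send` on WouldBlock until it succeeds or `budget` elapses.
/// `wait_writable` receives the time left, so the wait cannot outlive the deadline.
fn send_before_deadline<S, W>(
    mut send: S,
    mut wait_writable: W,
    budget: Duration,
) -> io::Result<usize>
where
    S: FnMut() -> io::Result<usize>,
    W: FnMut(Duration) -> io::Result<()>,
{
    let deadline = Instant::now() + budget;
    loop {
        match send() {
            Ok(n) => return Ok(n),
            Err(e) if e.kind() == ErrorKind::WouldBlock => {
                // Zero once the deadline has passed.
                let remaining = deadline.saturating_duration_since(Instant::now());
                if remaining.is_zero() {
                    return Err(io::Error::new(
                        ErrorKind::TimedOut,
                        "UDP socket never became writable",
                    ));
                }
                wait_writable(remaining)?;
            }
            Err(e) => return Err(e),
        }
    }
}
```

If the socket never becomes writable, the function returns `ErrorKind::TimedOut` once the budget is spent, provided `wait_writable` honors the duration it is given.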

Common Edge Cases

  • Socket registered too late: if the send fails with WouldBlock but the socket is not yet registered for writable interest, your event loop may never wake correctly.
  • Assuming bind implies readiness: a successful bind only means the local address is assigned. It does not guarantee immediate send readiness under every host implementation.
  • Connected vs unconnected UDP confusion: send on a connected socket and send_to on an unconnected socket can still share the same readiness problem.
  • Incorrect error mapping: if WASI ErrorCode::WouldBlock is mapped to a generic I/O error, higher layers cannot perform the expected retry behavior.
  • Busy-loop retries: retrying immediately without waiting for a writable event can waste CPU and still produce flaky tests.
  • Dropping first readiness event: some wrappers accidentally consume or ignore the writable notification, causing the retry loop to stall.
  • CI-only failures: slower or more contended environments often expose the race more frequently than local development machines.
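The "dropped readiness event" case is commonly avoided by caching readiness next to the socket: a writable event sets a flag, and only a WouldBlock result clears it. The sketch below shows that pattern with the standard library only. `WriteReadiness` is a hypothetical helper, not a mio type, and it assumes the poller delivers a writable event after registration.

```rust
use std::io;

/// Caches the last known write readiness so an early writable
/// notification is remembered until a send actually reports WouldBlock.
#[derive(Debug, Default)]
struct WriteReadiness {
    writable: bool,
}

impl WriteReadiness {
    /// Call this when the poller reports a writable event for the socket.
    fn on_writable_event(&mut self) {
        self.writable = true;
    }

    fn is_writable(&self) -> bool {
        self.writable
    }

    /// Attempts `send` only when the socket is believed writable.
    /// Returns Ok(None) when the caller should wait for the next event.
    fn try_send<F>(&mut self, mut send: F) -> io::Result<Option<usize>>
    where
        F: FnMut() -> io::Result<usize>,
    {
        if !self.writable {
            return Ok(None);
        }
        match send() {
            Ok(n) => Ok(Some(n)),
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => {
                // Readiness was stale: clear it and wait for a new event.
                self.writable = false;
                Ok(None)
            }
            Err(e) => Err(e),
        }
    }
}
```

Because the flag survives until a send proves it stale, a writable event that arrives before the application is ready to send is not lost, and the retry loop has a clear signal for when to go back to the poller.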

FAQ

Is returning WouldBlock on the first UDP send actually valid?

Yes. In a non-blocking model, WouldBlock means the operation cannot complete yet without waiting. Under WASI sockets, a fresh socket may still need to become writable before the first datagram send succeeds.

Why does this happen more often in tests than in production?

Tests often perform socket creation and first send back-to-back with no natural delay. That makes readiness races easier to trigger. Production systems may incidentally include enough scheduling gaps that the socket becomes writable before the first send attempt.

Should I add a sleep before the first send?

No. A fixed sleep can hide the bug temporarily but does not model the API correctly. The proper fix is to wait for writable readiness and retry when WouldBlock occurs.

The durable solution for this GitHub issue is simple: treat a fresh WASI UDP socket as potentially not yet writable, handle ErrorCode::WouldBlock as a normal readiness signal, and retry the first send only after the socket reports WRITABLE. That turns a flaky timing bug into correct non-blocking behavior.
