How to Fix: File fd_pwrite bug
File fd_pwrite bug: why positioned writes fail and how to fix them safely
A broken fd_pwrite path usually shows up when a write that should land at a specific offset either writes to the wrong location, advances the file cursor unexpectedly, or behaves differently from native pwrite(). That is a correctness bug, not just a performance issue, because pwrite is defined to write at an explicit offset without modifying the file descriptor’s current file position.
Symptoms of the bug
If your C test opens a file and performs offset-based writes, the failure usually falls into one of these categories:
- Data is written at the current seek position instead of the requested offset.
- The file offset changes after the write, which should not happen with pwrite().
- Concurrent writes become corrupted because the implementation internally uses lseek() + write().
- Append mode behaves unexpectedly when O_APPEND and explicit offsets interact.
- Partial writes or EINTR handling are ignored, leaving the file only partly updated.
Understanding the Root Cause
The root cause is usually an incorrect implementation of fd_pwrite that treats it like a normal write(). A correct pwrite() implementation must satisfy two important guarantees:
- It writes to the exact offset provided by the caller.
- It does not modify the shared file position associated with the file descriptor.
The most common bug is implementing fd_pwrite with this pattern:
lseek(fd, offset, SEEK_SET);
write(fd, buf, count);
That looks reasonable at first, but it is wrong for several reasons:
- Race condition: if multiple threads or processes share the same file descriptor, another operation can run between lseek() and write().
- File position corruption: lseek() changes the descriptor state, violating pwrite() semantics.
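- Error handling mismatch: the two-step sequence can fail halfway. If lseek() succeeds and write() then fails, the file position has already moved, whereas a failed pwrite() leaves the file position untouched.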
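The file-position problem can be observed directly. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the helper names emulated_pwrite and cursor_drift and the scratch path are hypothetical and exist only for this illustration.

```c
#define _XOPEN_SOURCE 700 /* exposes pwrite() and mkstemp() under strict -std=c11 */
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* The buggy pattern: seek, then write sequentially. */
static ssize_t emulated_pwrite(int fd, const void *buf, size_t n, off_t off) {
    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, buf, n); /* cursor ends at off + bytes written */
}

/* Returns how many bytes the shared cursor moved across one positioned
 * write of "BB" at offset 2, or -1 if the scratch file could not be set up.
 * use_native != 0 selects pwrite(); 0 selects the emulation above. */
long cursor_drift(int use_native) {
    char path[] = "/tmp/pwrite_drift_XXXXXX"; /* hypothetical scratch path */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    long drift = -1;
    if (write(fd, "AAAAAA", 6) == 6) {
        off_t before = lseek(fd, 0, SEEK_CUR);
        assert(before == 6); /* the sequential write left the cursor at 6 */
        ssize_t n = use_native ? pwrite(fd, "BB", 2, 2)
                               : emulated_pwrite(fd, "BB", 2, 2);
        off_t after = lseek(fd, 0, SEEK_CUR);
        if (n == 2 && after != (off_t)-1)
            drift = labs((long)(after - before));
    }
    close(fd);
    return drift;
}
```

With a native pwrite(), cursor_drift(1) reports 0. The emulation leaves the cursor at 4 instead of 6, so cursor_drift(0) reports 2, even in a single-threaded program.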
Another frequent cause is an abstraction-layer bug where the internal file object ignores the explicit offset parameter and forwards the operation into the normal sequential write path. In that case, the test case fails even in single-threaded use because the wrong underlying API is called.
If this bug appears in a runtime, userspace filesystem, compatibility layer, or virtual file descriptor wrapper, the implementation may also be missing checks for:
- Negative offsets
- Non-seekable descriptors such as pipes, sockets, or some devices
- 64-bit offset truncation on large files
- Short writes that must be returned accurately
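To see what a wrapper should reproduce for the first two checks, it can help to ask the native call directly. This is a small probe, assuming a POSIX system (the expected values below match Linux behavior); both function names and the scratch path are hypothetical.

```c
#define _XOPEN_SOURCE 700 /* exposes pwrite() and mkstemp() under strict -std=c11 */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* errno from a positioned write to a pipe; 0 if the write unexpectedly
 * succeeded, -1 if the pipe could not be created. */
int pwrite_errno_on_pipe(void) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    ssize_t n = pwrite(fds[1], "x", 1, 0);
    int err = (n < 0) ? errno : 0;
    close(fds[0]);
    close(fds[1]);
    return err;
}

/* errno from a positioned write at offset -1 on a regular file; 0 if the
 * write unexpectedly succeeded, -1 if the scratch file could not be made. */
int pwrite_errno_on_negative_offset(void) {
    char path[] = "/tmp/pwrite_neg_XXXXXX"; /* hypothetical scratch path */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    ssize_t n = pwrite(fd, "x", 1, (off_t)-1);
    int err = (n < 0) ? errno : 0;
    close(fd);
    return err;
}
```

A check such as assert(pwrite_errno_on_pipe() == ESPIPE) documents the native result, and the same expectations (ESPIPE for the pipe, EINVAL for the negative offset) can then be asserted against the wrapper under test.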
Step-by-Step Solution
The fix is to ensure that fd_pwrite uses a true positioned-write path all the way down, never a simulated seek + write fallback unless you can guarantee exclusive access and identical semantics. In most systems, the correct answer is to call the platform’s native pwrite() or equivalent backend operation.
1. Verify the broken behavior with a minimal test
Use a test that proves both the write offset and the current file position behavior:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    const char *file = "test.bin";
    int fd = open(file, O_CREAT | O_TRUNC | O_RDWR, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Sequential write: leaves the file position at 6. */
    if (write(fd, "AAAAAA", 6) != 6) {
        perror("write");
        close(fd);
        return 1;
    }

    off_t before = lseek(fd, 0, SEEK_CUR);
    if (before == (off_t)-1) {
        perror("lseek before");
        close(fd);
        return 1;
    }

    /* Positioned write: must overwrite bytes 2..3 and leave the position alone. */
    ssize_t n = pwrite(fd, "BB", 2, 2);
    if (n != 2) {
        perror("pwrite");
        close(fd);
        return 1;
    }

    off_t after = lseek(fd, 0, SEEK_CUR);
    if (after == (off_t)-1) {
        perror("lseek after");
        close(fd);
        return 1;
    }

    /* Read the contents back without touching the file position. */
    char buf[7] = {0};
    if (pread(fd, buf, 6, 0) != 6) {
        perror("pread");
        close(fd);
        return 1;
    }

    printf("before=%lld after=%lld data=%s\n",
           (long long)before,
           (long long)after,
           buf);
    close(fd);

    /* Exit non-zero when either guarantee is violated. */
    return (before == after && strcmp(buf, "AABBAA") == 0) ? 0 : 1;
}
Expected result:
- File contents become AABBAA
- before and after remain equal
If after changes, your fd_pwrite implementation is not preserving file position correctly.
2. Replace any seek-based emulation
If your current implementation looks like this, it is the bug:
/* Buggy: moves the shared file position and is not atomic. */
ssize_t fd_pwrite(struct fd *f, const void *buf, size_t count, off_t offset) {
    if (fd_lseek(f, offset, SEEK_SET) < 0)
        return -1;
    return fd_write(f, buf, count);
}
Replace it with a backend call that accepts an explicit offset:
ssize_t fd_pwrite(struct fd *f, const void *buf, size_t count, off_t offset) {
    if (!f) {
        errno = EBADF;
        return -1;
    }
    if (offset < 0) {
        errno = EINVAL;
        return -1;
    }
    /* No positioned-write backend: report ESPIPE instead of falling back. */
    if (!f->ops || !f->ops->pwrite) {
        errno = ESPIPE;
        return -1;
    }
    return f->ops->pwrite(f, buf, count, offset);
}
This design keeps positioned I/O separate from sequential I/O and avoids mutating shared descriptor state.
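The snippets above refer to struct fd and f->ops without defining them. One possible shape is sketched below, purely as an assumption: the struct layouts, the cursor field, the two ops tables, and the name fd_pwrite_dispatch are all hypothetical and are not taken from any particular runtime.

```c
#define _XOPEN_SOURCE 700 /* exposes pwrite(), pread(), mkstemp() under strict -std=c11 */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

struct fd; /* forward declaration for the ops table */

/* Hypothetical backend table: sequential and positioned writes are separate entries. */
struct fd_ops {
    ssize_t (*write)(struct fd *f, const void *buf, size_t count);
    ssize_t (*pwrite)(struct fd *f, const void *buf, size_t count, off_t offset);
};

/* Hypothetical descriptor object; the field names are illustrative. */
struct fd {
    int osfd;                 /* underlying OS descriptor */
    off_t cursor;             /* cached sequential position */
    const struct fd_ops *ops;
};

static ssize_t file_write(struct fd *f, const void *buf, size_t count) {
    ssize_t n = write(f->osfd, buf, count);
    if (n > 0)
        f->cursor += n; /* only the sequential path advances the cursor */
    return n;
}

static ssize_t file_pwrite(struct fd *f, const void *buf, size_t count, off_t offset) {
    return pwrite(f->osfd, buf, count, offset); /* cursor left untouched */
}

static const struct fd_ops regular_file_ops = { file_write, file_pwrite };
static const struct fd_ops stream_ops = { file_write, NULL }; /* no positioned path */

ssize_t fd_pwrite_dispatch(struct fd *f, const void *buf, size_t count, off_t offset) {
    if (!f) { errno = EBADF; return -1; }
    if (offset < 0) { errno = EINVAL; return -1; }
    if (!f->ops || !f->ops->pwrite) { errno = ESPIPE; return -1; }
    return f->ops->pwrite(f, buf, count, offset);
}

/* Usage sketch: returns 0 when data placement and cursor both behave. */
int fd_ops_selfcheck(void) {
    char path[] = "/tmp/fd_ops_XXXXXX"; /* hypothetical scratch path */
    char buf[6];
    int osfd = mkstemp(path);
    if (osfd < 0)
        return -1;
    unlink(path);

    struct fd f = { osfd, 0, &regular_file_ops };
    struct fd s = { osfd, 0, &stream_ops };
    int ok = f.ops->write(&f, "AAAAAA", 6) == 6
          && fd_pwrite_dispatch(&f, "BB", 2, 2) == 2
          && f.cursor == 6
          && pread(osfd, buf, 6, 0) == 6
          && memcmp(buf, "AABBAA", 6) == 0
          && fd_pwrite_dispatch(&s, "x", 1, 0) == -1
          && errno == ESPIPE;
    close(osfd);
    return ok ? 0 : 1;
}
```

Leaving the pwrite slot empty for stream-like backends means the dispatcher can reject them without ever reaching the sequential write path.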
3. Implement the backend correctly
If you maintain the file backend layer, wire fd_pwrite to the native system primitive:
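ssize_t os_file_pwrite(struct fd *f, const void *buf, size_t count, off_t offset) {
    int osfd = f->osfd;
    ssize_t ret;

    /* Retry only when the call was interrupted before writing any data. */
    do {
        ret = pwrite(osfd, buf, count, offset);
    } while (ret < 0 && errno == EINTR);

    /* ret may be a short write; the caller decides whether to continue. */
    return ret;
}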
Key details:
- Retry on EINTR if your project’s conventions require it.
- Return short writes as-is.
- Do not update any cached seek pointer for the file descriptor.
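Because the backend returns short writes as-is, a caller that needs the whole buffer on disk has to loop. The helper below is one possible caller-side sketch, not part of the fix itself; the names pwrite_all and pwrite_all_selfcheck and the scratch path are hypothetical.

```c
#define _XOPEN_SOURCE 700 /* exposes pwrite(), pread(), mkstemp() under strict -std=c11 */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Keeps issuing pwrite() until count bytes are written. Returns the number
 * of bytes written, or -1 (errno set by pwrite) if an error occurred before
 * anything was written. */
ssize_t pwrite_all(int fd, const void *buf, size_t count, off_t offset) {
    const char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = pwrite(fd, p + done, count - done, offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before any byte landed: retry */
            return done > 0 ? (ssize_t)done : -1;
        }
        if (n == 0)
            break; /* no progress; avoid spinning forever */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Usage sketch: returns 0 if "hello" lands at offset 3 and the cursor stays at 0. */
int pwrite_all_selfcheck(void) {
    char path[] = "/tmp/pwrite_all_XXXXXX"; /* hypothetical scratch path */
    char buf[8];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    int ok = pwrite_all(fd, "hello", 5, 3) == 5
          && pread(fd, buf, 8, 0) == 8
          && memcmp(buf, "\0\0\0hello", 8) == 0
          && lseek(fd, 0, SEEK_CUR) == 0;
    close(fd);
    return ok ? 0 : 1;
}
```

The loop advances its own local offset rather than the descriptor's position, so it keeps the pwrite() guarantee. It does give up atomicity for the buffer as a whole, which is why it belongs in the caller and not in fd_pwrite.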
4. Reject unsupported descriptor types explicitly
pwrite() is not valid for every kind of file descriptor. If your abstraction supports pipes, sockets, terminals, or special pseudo-files, reject unsupported cases cleanly:
ssize_t fd_pwrite(struct fd *f, const void *buf, size_t count, off_t offset) {
    if (!f) {
        errno = EBADF;
        return -1;
    }
    if (offset < 0) {
        errno = EINVAL;
        return -1;
    }
    if (!(f->flags & FD_SEEKABLE)) {
        errno = ESPIPE;
        return -1;
    }
    if (!f->ops || !f->ops->pwrite) {
        errno = ESPIPE;
        return -1;
    }
    return f->ops->pwrite(f, buf, count, offset);
}
5. Protect against 32-bit and large-file bugs
If the issue appears only on larger files, your implementation may be truncating the offset. Make sure:
- off_t is used consistently
- No cast to int or long loses high bits
- Large file support flags are enabled where required
ssize_t fd_pwrite64(struct fd *f, const void *buf, size_t count, off_t offset) {
    if (offset < 0) {
        errno = EINVAL;
        return -1;
    }
    /* Pass offset through as off_t; never narrow it to int or long. */
    return f->ops->pwrite(f, buf, count, offset);
}
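A build-time guard can catch a narrow off_t before it turns into a runtime bug. The sketch below assumes glibc-style large-file support through _FILE_OFFSET_BITS and a C11 compiler; the helper offset_survives_int_cast is hypothetical and only illustrates the kind of narrowing cast to look for.

```c
#define _FILE_OFFSET_BITS 64 /* must come before any system header to take effect */
#include <assert.h>
#include <sys/types.h>

/* Fail the build instead of silently truncating offsets at run time. */
_Static_assert(sizeof(off_t) >= 8, "off_t must be at least 64 bits wide");

/* Returns 1 if offset survives a round trip through int, 0 if the cast
 * loses high bits (the narrowing result is implementation-defined). */
int offset_survives_int_cast(off_t offset) {
    int narrowed = (int)offset; /* the kind of cast to hunt for in the wrapper */
    return (off_t)narrowed == offset;
}
```

On common platforms with a 32-bit int, a small offset such as 4096 survives the round trip, while an offset of 4 GiB does not, which is how a write intended for a large offset can end up near the start of the file.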
6. Add regression tests
A solid fix is not complete until the test suite proves the semantics. Add tests for:
- Write at offset without changing current position
- Multiple pwrite calls to different offsets
- Interleaved write and pwrite calls
- Large offsets
- Unsupported descriptors returning ESPIPE
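/* Expects fd to be positioned at offset 0 of an empty regular file opened read/write. */
void test_pwrite_does_not_move_offset(int fd) {
    char buf[6] = {0};

    if (write(fd, "hello", 5) != 5) abort();
    off_t pos1 = lseek(fd, 0, SEEK_CUR);
    if (pwrite(fd, "X", 1, 1) != 1) abort();
    off_t pos2 = lseek(fd, 0, SEEK_CUR);
    if (pread(fd, buf, 5, 0) != 5) abort();

    if (pos1 == (off_t)-1 || pos1 != pos2) abort();
    if (memcmp(buf, "hXllo", 5) != 0) abort();
}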
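Two more items from the list can be covered in the same style. These are sketches against the native calls, assuming a POSIX system with a writable /tmp; the test names and scratch paths are hypothetical, and in a real suite the calls would go through the wrapper under test.

```c
#define _XOPEN_SOURCE 700 /* exposes pwrite(), pread(), mkstemp() under strict -std=c11 */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Interleaved write and pwrite: sequential writes must continue from the
 * cursor as if the positioned writes never happened. Returns 0 on success. */
int test_interleaved_write_and_pwrite(void) {
    char path[] = "/tmp/pwrite_mix_XXXXXX"; /* hypothetical scratch path */
    char buf[6];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    int ok = write(fd, "abc", 3) == 3        /* cursor: 3 */
          && pwrite(fd, "Z", 1, 0) == 1      /* cursor must stay at 3 */
          && write(fd, "def", 3) == 3        /* must land at offsets 3..5 */
          && pwrite(fd, "Y", 1, 5) == 1
          && lseek(fd, 0, SEEK_CUR) == 6
          && pread(fd, buf, 6, 0) == 6
          && memcmp(buf, "ZbcdeY", 6) == 0;
    close(fd);
    return ok ? 0 : 1;
}

/* Multiple pwrite calls to different offsets, issued out of order. */
int test_pwrite_out_of_order(void) {
    char path[] = "/tmp/pwrite_ooo_XXXXXX"; /* hypothetical scratch path */
    char buf[6];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    int ok = pwrite(fd, "cc", 2, 4) == 2
          && pwrite(fd, "aa", 2, 0) == 2
          && pwrite(fd, "bb", 2, 2) == 2
          && lseek(fd, 0, SEEK_CUR) == 0     /* never moved */
          && pread(fd, buf, 6, 0) == 6
          && memcmp(buf, "aabbcc", 6) == 0;
    close(fd);
    return ok ? 0 : 1;
}
```

Both functions return 0 on success, so a check such as assert(test_interleaved_write_and_pwrite() == 0) drops into most C test harnesses.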
7. If native pwrite is unavailable, document the limitation
On some constrained platforms, a true atomic positioned write may not exist. In that case, the safest approach is usually to:
- Fail explicitly for fd_pwrite rather than fake correct behavior
- Or implement strict synchronization around the emulation if your runtime fully owns the descriptor and can preserve semantics
Be careful: a fallback based on lseek + write + restore is still not equivalent in a shared or concurrent environment.
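If an emulation is unavoidable, the synchronized variant might look like the sketch below. It assumes the runtime fully owns the descriptor and routes every read, write, and seek on it through the same lock; struct owned_fd and both function names are hypothetical.

```c
#define _XOPEN_SOURCE 700 /* exposes pread(), mkstemp() and pthreads under strict -std=c11 */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: ALL I/O on osfd must take this lock. */
struct owned_fd {
    int osfd;
    pthread_mutex_t lock;
};

ssize_t owned_fd_pwrite_emulated(struct owned_fd *f, const void *buf,
                                 size_t count, off_t offset) {
    ssize_t ret = -1;
    int err = 0;

    pthread_mutex_lock(&f->lock);
    off_t saved = lseek(f->osfd, 0, SEEK_CUR);
    if (saved == (off_t)-1 || lseek(f->osfd, offset, SEEK_SET) == (off_t)-1) {
        err = errno; /* a failed lseek() leaves the cursor where it was */
    } else {
        ret = write(f->osfd, buf, count);
        if (ret < 0)
            err = errno;
        /* Restore the cursor even when the write failed. */
        if (lseek(f->osfd, saved, SEEK_SET) == (off_t)-1 && ret >= 0) {
            err = errno;
            ret = -1;
        }
    }
    pthread_mutex_unlock(&f->lock);

    if (ret < 0)
        errno = err;
    return ret;
}

/* Usage sketch: returns 0 if the data lands at offset 2 and the cursor stays at 6. */
int owned_fd_selfcheck(void) {
    char path[] = "/tmp/owned_fd_XXXXXX"; /* hypothetical scratch path */
    char buf[6];
    struct owned_fd f;
    f.osfd = mkstemp(path);
    if (f.osfd < 0)
        return -1;
    unlink(path);
    pthread_mutex_init(&f.lock, NULL);

    int ok = write(f.osfd, "AAAAAA", 6) == 6
          && owned_fd_pwrite_emulated(&f, "BB", 2, 2) == 2
          && lseek(f.osfd, 0, SEEK_CUR) == 6
          && pread(f.osfd, buf, 6, 0) == 6
          && memcmp(buf, "AABBAA", 6) == 0;
    pthread_mutex_destroy(&f.lock);
    close(f.osfd);
    return ok ? 0 : 1;
}
```

The lock only protects callers inside the same process that go through the wrapper. A descriptor inherited across fork() or duplicated with dup() shares its file position with code that never takes the lock, so the limitation still has to be documented.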
Common Edge Cases
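- O_APPEND descriptors: POSIX specifies that O_APPEND has no effect on where pwrite() writes, but on Linux a pwrite() on a descriptor opened with O_APPEND appends to the end of the file regardless of the offset argument. Validate behavior on your target platform and do not assume all wrappers preserve native rules automatically.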
- Short writes: a successful pwrite() can return fewer bytes than requested. Your caller must handle this, especially on unusual filesystems or under resource pressure.
- Signals: if a signal interrupts the call before any data is written, it fails with EINTR. Decide whether your abstraction retries automatically or exposes that to callers.
- Sparse files: writing at a large offset may create holes. That is valid, but your tests should expect sparse-file behavior.
- Non-seekable files: pipes and sockets should not silently fall back to sequential writes. Return the correct error.
- Shared descriptors across threads: this is where broken emulation causes the most corruption. Regression tests should include concurrent access.
- Buffered wrapper layers: if a higher-level cache or buffer sits above the OS file descriptor, it must honor explicit offsets independently of its sequential cursor.
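The sparse-file case is easy to pin down in a test. The sketch below, assuming a POSIX system with a writable /tmp, checks only the logical file size and the zero-filled read; whether the gap is physically unallocated depends on the filesystem. The function name and scratch path are hypothetical.

```c
#define _XOPEN_SOURCE 700 /* exposes pwrite(), pread(), mkstemp() under strict -std=c11 */
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes one byte 1 MiB into an empty file. Returns 0 if the file grows to
 * 1 MiB + 1, the skipped range reads back as zeros, and the cursor stays at 0. */
int sparse_pwrite_check(void) {
    char path[] = "/tmp/pwrite_sparse_XXXXXX"; /* hypothetical scratch path */
    const off_t far = (off_t)1 << 20;           /* 1 MiB */
    char head[16];
    struct stat st;
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    int ok = pwrite(fd, "Z", 1, far) == 1
          && fstat(fd, &st) == 0
          && st.st_size == far + 1
          && pread(fd, head, sizeof head, 0) == (ssize_t)sizeof head
          && lseek(fd, 0, SEEK_CUR) == 0;
    for (size_t i = 0; ok && i < sizeof head; i++)
        ok = (head[i] == 0); /* bytes inside the hole must read as zero */
    close(fd);
    return ok ? 0 : 1;
}
```

A wrapper that zero-fills the gap with real writes still passes this check, so it validates semantics, not storage layout.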
FAQ
Why is lseek() + write() not a valid replacement for pwrite()?
Because it changes the file descriptor’s current offset and introduces a race window between the seek and the write. pwrite() must target a specific offset without disturbing other operations that use the same descriptor.
Should fd_pwrite update the internal file position after writing?
No. A correct positioned write does not change the current file offset. If your abstraction stores a cursor, leave it unchanged for fd_pwrite.
What error should be returned for pipes or sockets?
Typically ESPIPE for non-seekable descriptors. The exact behavior should follow your platform or compatibility layer, but it should never silently behave like a normal sequential write().
The safest long-term fix is simple: treat fd_pwrite as a first-class positioned-I/O operation, route it to a backend that supports explicit offsets, and add regression tests that verify both data placement and file-position immutability. That resolves the bug at the semantic level, which is what the failing C test is really checking.