How to Fix: “File size do not extend”

A file that refuses to grow after a write past its current end usually means the filesystem or runtime layer mishandles file extension: the size update in the write path, truncate semantics, or sparse-write behavior. If your C test writes beyond EOF and the reported size does not increase, the bug is typically in how lseek, write, or the metadata update is implemented.

The issue titled “File size do not extend” is usually reproduced with a C program that opens a file, seeks beyond the current end, writes data, and then checks the resulting size. On a correct implementation, the file size must expand to include the last written byte. Any gap between the old EOF and the new write offset should behave like a zero-filled hole in a sparse file, or at minimum the inode size must still be updated to the new logical end of file.

Understanding the Root Cause

At the operating system and filesystem level, a file has both content storage and metadata. One critical metadata field is st_size, which represents the logical size of the file. When code performs an lseek(fd, offset, SEEK_SET) to move beyond the current EOF and then calls write(), the expected behavior is:

  • The write occurs at the requested offset.
  • If the offset is beyond the current EOF, the file is logically extended.
  • The inode or file metadata updates st_size to offset + bytes_written if that value exceeds the previous size.
  • The unwritten gap is treated as zeros when later read.

If the file size does not extend, one of these implementation bugs is usually present:

  • The write path stores bytes but never updates inode size.
  • The seek position is updated in memory, but the filesystem ignores writes past current allocation limits.
  • The code updates file blocks but only changes size when writing exactly at EOF, not beyond it.
  • The metadata is updated in memory but not persisted or returned correctly through stat or fstat.
  • The implementation confuses allocated size with logical file size.

In short, this bug happens because the system is not honoring the standard POSIX expectation that a successful write beyond EOF must extend the file’s logical size.

Step-by-Step Solution

The fix belongs in the file write implementation, not in the test case. The test is correctly validating standard file behavior.

1. Reproduce the bug with a minimal test

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

static void print_file_size(int fd) {
    struct stat st;
    if (fstat(fd, &st) == -1) {
        perror("fstat");
        exit(1);
    }
    printf("file size = %lld\n", (long long)st.st_size);
}

int main(void) {
    int fd = open("testfile.bin", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    print_file_size(fd);

    if (lseek(fd, 4096, SEEK_SET) == -1) {
        perror("lseek");
        return 1;
    }

    if (write(fd, "X", 1) != 1) {
        perror("write");
        return 1;
    }

    print_file_size(fd);
    close(fd);
    return 0;
}

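Expected result: the first line prints file size = 0 (the file was just truncated) and the second prints file size = 4097, which is the offset 4096 plus the 1 byte written.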

2. Inspect the write path in your filesystem or runtime

Look for the function responsible for handling writes to regular files. The implementation should calculate the final written position:

new_end = file_offset + bytes_written;

Then it must compare that value against the current inode size:

if (new_end > inode->size) {
    inode->size = new_end;
}

This sounds simple, but many broken implementations only do something like this instead:

if (file_offset == inode->size) {
    inode->size += bytes_written;
}

That logic fails whenever the write starts beyond EOF rather than exactly at EOF.

3. Ensure sparse extension is supported correctly

If the write begins beyond current EOF, the gap must still count toward logical size. You do not need to physically write zeros into every skipped byte unless your design requires it. But reads from the gap must behave as zero-filled bytes.

off_t end_pos = offset + written;
if (end_pos > inode->size) {
    inode->size = end_pos;
    mark_inode_dirty(inode);
}

If your system tracks blocks explicitly, make sure reads from unallocated ranges inside st_size return zeros rather than garbage or EOF.

4. Update metadata before returning success

A common bug is writing the data buffer successfully but delaying or skipping inode persistence. Ensure that after a successful write, the file size visible through stat and fstat reflects the new value.

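ssize_t fs_write(struct file *f, const void *buf, size_t count) {
    off_t start = f->pos;
    /* May write fewer bytes than requested (short write). */
    ssize_t written = write_data_to_inode(f->inode, start, buf, count);
    if (written < 0) {
        return written;
    }

    f->pos += written;

    /* Use the bytes actually written, not count, to find the new end. */
    off_t new_size = start + written;
    if (new_size > f->inode->size) {
        f->inode->size = new_size;
        /* Persist the size before reporting success to the caller. */
        persist_inode(f->inode);
    }

    return written;
}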

5. Verify your stat implementation

If writing is correct but the size still appears unchanged, inspect the stat/fstat path. It may be returning stale cached metadata or reading the wrong inode field.

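/* Requires <string.h> for memset and <sys/stat.h> for struct stat. */
int fs_fstat(struct file *f, struct stat *st) {
    memset(st, 0, sizeof(*st));
    /* Report the logical size from the live inode, not a cached copy. */
    st->st_size = f->inode->size;
    st->st_mode = f->inode->mode;
    return 0;
}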

6. Add regression tests

This issue should be covered with multiple scenarios:

/* write exactly at EOF */
/* write beyond EOF with lseek */
/* write multiple bytes beyond EOF */
/* reopen file and verify persisted size */
/* read from gap and confirm zero-fill behavior */

Example regression expectations:

offset = 1000, write 1 byte  => size becomes 1001
offset = 1000, write 20 bytes => size becomes 1020
existing size = 50, write at 10 => size stays 50 unless write crosses EOF

Common Edge Cases

  • Partial writes: If only part of the buffer is written, update the size using the actual number of bytes written, not the requested count.
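  • Integer overflow: Large offsets can overflow 32-bit arithmetic. Use a 64-bit off_t for all offset and size math, and check that offset + count cannot overflow before adding. POSIX write() reports EFBIG when a write would exceed the maximum file size.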
  • O_APPEND semantics: If the file is opened with append mode, the actual write offset should be EOF, regardless of a prior seek.
  • Read-after-write consistency: After extending the file, reading from the hole between old EOF and new data should return zeros.
  • Metadata caching bugs: If your VFS, inode cache, or page cache is stale, stat may show the old size even though the write path updated it.
  • Persistence issues: The size may appear correct until close, then revert after reopen if the inode update is never flushed.
  • Block allocation failures: If extending the file requires allocation and allocation fails, write() must either fail with an error such as ENOSPC or return a short count, and the size must reflect only the bytes actually written.

FAQ

Why should a file grow even if I write far beyond the current end?

Because standard file semantics place the logical end of file one byte past the highest byte written. The bytes in between form a hole and must read back as zeros.

Does extending a file require physically filling the gap with zero bytes?

No. Many filesystems support sparse files, where the gap is logically zero-filled without consuming physical storage for every skipped byte.

What if write() succeeds but stat() still shows the old size?

Then the bug is likely in metadata synchronization, inode persistence, or the stat/fstat implementation rather than the actual data write path.

The core fix is straightforward: after every successful write, compute the end offset and update the inode’s logical file size whenever that end offset exceeds the current size. Once that metadata path is correct, the test case for “File size do not extend” should pass reliably.
