How to Fix: bindgen: Generates names which may have Rust keywords


Rust keyword collisions in generated bindings are a classic codegen trap: bindgen mirrors identifiers from C or C++ headers, but Rust reserves words like type, match, crate, and self. When generated item names land on those reserved tokens, the output may fail to compile or become awkward to use unless the generator escapes, renames, or transforms them correctly.

Understanding the Root Cause

The issue appears because bindgen converts foreign declarations into Rust syntax, and the two languages reserve different words. In C headers, names such as type, match, fn, or mod are perfectly legal identifiers. In Rust, all four are strict keywords and cannot be used as plain names.

When bindgen emits code like a struct field, function, type alias, enum variant, or module using one of those identifiers directly, the generated file can become invalid Rust. For example, generated code such as pub type: ::std::os::raw::c_int is not valid because type is parsed as syntax, not as a normal name.

There are three technical dimensions to this bug:

  • Identifier translation: bindgen must map foreign names into Rust-safe names.
  • Keyword awareness: the code generator must recognize both strict and edition-specific Rust keywords.
  • Stable usability: even when code compiles, generated APIs should remain predictable for downstream crates.

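In Rust, the standard escape hatch is the raw identifier syntax, such as r#type or r#match. This allows generated code to preserve the original foreign spelling while still compiling. The exceptions are crate, self, Self, and super, which cannot be written as raw identifiers and must be renamed instead, for example by appending an underscore. If bindgen does not escape or rename colliding names consistently, the generated bindings break.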
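To make the raw identifier syntax concrete, here is a small hand-written sketch, not actual bindgen output. The function r#match and the enum Kind are hypothetical names chosen for illustration. It shows that raw identifiers work for function names and enum variants as well as for fields.

```rust
// A function whose foreign name happens to be a Rust keyword.
pub fn r#match(a: i32, b: i32) -> bool {
    a == b
}

// Enum variants can be escaped the same way. The lint is silenced because
// the variant names keep their lowercase foreign spelling.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind {
    r#fn = 1,
    r#mod = 2,
}

// Call sites must use the raw spelling too.
pub fn kind_value(kind: Kind) -> i32 {
    kind as i32
}
```

Callers write r#match(1, 1) and Kind::r#fn, so the escape is visible at every use site, not only in the generated file.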

Step-by-Step Solution

The cleanest fix is to ensure generated identifiers are sanitized before emission. In practice, you can solve this in one of three ways depending on whether you are using bindgen, patching bindgen, or working around the bug locally.

1. Reproduce the failure with a minimal header

Create a small C header that includes names colliding with Rust keywords.

/* wrapper.h */
struct demo {
    int type;
    int match;
};

enum kind {
    fn = 1,
    mod = 2,
};

Then generate bindings in a build script.

// build.rs
fn main() {
    let bindings = bindgen::Builder::default()
        .header("wrapper.h")
        .generate()
        .expect("Unable to generate bindings");

    bindings
        .write_to_file("src/bindings.rs")
        .expect("Couldn't write bindings");
}

If bindgen emits unescaped identifiers, compilation will fail in src/bindings.rs.

2. Verify whether your bindgen version already includes a fix

Before patching anything, update to the latest release of bindgen. Keyword handling has improved over time, and the issue may already be resolved in newer versions.

[build-dependencies]
bindgen = "0.69"

Then rebuild:

cargo clean
cargo build

If the generated output now contains an escaped or renamed form, such as r#type or type_, the problem is effectively solved.

3. Patch the generated names using raw identifiers

If you are fixing the generator itself, the core solution is to detect reserved names and emit them as raw identifiers.

Conceptually, the logic should look like this:

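fn rust_safe_ident(name: &str) -> String {
    match name {
        // These path keywords cannot be raw identifiers (r#self is rejected),
        // so rename them; a trailing underscore is one possible convention.
        "crate" | "self" | "Self" | "super" => format!("{}_", name),
        // Strict keywords; async, await, and dyn are keywords from edition 2018.
        "as" | "break" | "const" | "continue" | "else" |
        "enum" | "extern" | "false" | "fn" | "for" |
        "if" | "impl" | "in" | "let" | "loop" |
        "match" | "mod" | "move" | "mut" | "pub" |
        "ref" | "return" | "static" | "struct" | "trait" |
        "true" | "type" | "unsafe" | "use" | "where" |
        "while" | "async" | "await" | "dyn" |
        // Reserved for future use.
        "abstract" | "become" | "box" | "do" | "final" |
        "macro" | "override" | "priv" | "try" | "typeof" |
        "unsized" | "virtual" | "yield" => format!("r#{}", name),
        _ => name.to_string(),
    }
}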

The exact location of this logic depends on bindgen internals, but the design principle is always the same: sanitize every emitted identifier at the final Rust code generation boundary.
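As a rough illustration of sanitizing at the emission boundary, the sketch below routes every name through one function before it is written. Emitter, ident, and field are hypothetical names, not bindgen internals, and the keyword list is deliberately abbreviated. The sketch also keeps a set of names already used, so a rename cannot silently collide with a name the header already declares.

```rust
use std::collections::HashSet;

/// Hypothetical emitter state: remembers every identifier already written.
/// A real generator would track names per scope; this sketch uses one scope.
pub struct Emitter {
    used: HashSet<String>,
}

impl Emitter {
    pub fn new() -> Self {
        Emitter { used: HashSet::new() }
    }

    /// Returns a keyword-safe identifier that is unique within this emitter.
    pub fn ident(&mut self, name: &str) -> String {
        let mut candidate = match name {
            // Cannot be raw identifiers, so they are renamed.
            "crate" | "self" | "Self" | "super" => format!("{}_", name),
            // Abbreviated keyword list, for illustration only.
            "type" | "match" | "fn" | "mod" => format!("r#{}", name),
            _ => name.to_string(),
        };
        // Append underscores until the name has not been used before.
        while !self.used.insert(candidate.clone()) {
            candidate.push('_');
        }
        candidate
    }

    /// Emits one struct field line using the sanitized name.
    pub fn field(&mut self, name: &str, ty: &str) -> String {
        format!("pub {}: {},", self.ident(name), ty)
    }
}
```

Because field never formats a name directly, there is no code path that can emit an unsanitized identifier, which is the point of placing the check at the boundary.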

4. Apply a local workaround if you cannot patch bindgen

If you are blocked on an upstream release, you can post-process the generated file in your build pipeline. This is less ideal than fixing bindgen, but it is practical.

// build.rs
use std::fs;

fn main() {
    let out_file = "src/bindings.rs";

    let bindings = bindgen::Builder::default()
        .header("wrapper.h")
        .generate()
        .expect("Unable to generate bindings");

    bindings
        .write_to_file(out_file)
        .expect("Couldn't write bindings");

    let content = fs::read_to_string(out_file).expect("read bindings");
    let content = content
        .replace(" pub type: ", " pub r#type: ")
        .replace(" pub match: ", " pub r#match: ");

    fs::write(out_file, content).expect("write patched bindings");
}

This workaround is acceptable for urgent builds, but it is fragile because it depends on exact text patterns.
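If you do post-process, matching on line structure is somewhat less brittle than matching exact substrings. The helper below is a sketch under the assumption that each field sits on its own line in the form pub name: Type. The name escape_keyword_fields is hypothetical, and the approach is still a stopgap, not a parser.

```rust
/// Rewrites `pub <keyword>: T` field lines to `pub r#<keyword>: T`.
/// Lines that do not look like a field declaration pass through unchanged.
pub fn escape_keyword_fields(source: &str, keywords: &[&str]) -> String {
    let mut out = String::with_capacity(source.len());
    for line in source.lines() {
        // Split the line into its leading indentation and the rest.
        let indent_len = line.len() - line.trim_start().len();
        let (indent, body) = line.split_at(indent_len);

        // Only lines shaped like `pub NAME: TYPE` with NAME in the list match.
        let escaped = body
            .strip_prefix("pub ")
            .and_then(|rest| rest.split_once(':'))
            .filter(|(name, _)| keywords.contains(name))
            .map(|(name, ty)| format!("{}pub r#{}:{}", indent, name, ty));

        match escaped {
            Some(fixed) => out.push_str(&fixed),
            None => out.push_str(line),
        }
        out.push('\n');
    }
    out
}
```

In build.rs this would replace the chained replace calls, for example escape_keyword_fields(&content, &["type", "match"]). Type aliases such as pub type foo = ...; are left alone because the text before the first colon is not a bare keyword.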

5. Prefer allowlists and renaming strategies when appropriate

If only a small subset of declarations is needed, reducing generated surface area can avoid problematic identifiers entirely.

let bindings = bindgen::Builder::default()
    .header("wrapper.h")
    .allowlist_type("demo")
    .allowlist_var("kind_.*")
    .generate()
    .expect("Unable to generate bindings");

This does not fix the root bug, but it limits exposure and can simplify temporary mitigation.

6. Validate the generated API

After the fix, inspect the generated Rust and confirm that collisions are escaped correctly.

#[repr(C)]
pub struct demo {
    pub r#type: ::std::os::raw::c_int,
    pub r#match: ::std::os::raw::c_int,
}

Then confirm it compiles and remains usable from Rust:

mod bindings;

fn main() {
    let item = bindings::demo {
        r#type: 1,
        r#match: 2,
    };

    let _ = item.r#type + item.r#match;
}

If this succeeds, the issue is resolved correctly.
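Beyond compiling, it can be worth checking that escaping did not change the struct layout. The sketch below redeclares the struct by hand, as a stand-in for the generated bindings module, and compares its size and alignment against c_int. The function demo_layout_matches_c is a hypothetical helper.

```rust
use std::os::raw::c_int;

// Hand-written stand-in for the generated struct.
#[allow(non_camel_case_types)]
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
pub struct demo {
    pub r#type: c_int,
    pub r#match: c_int,
}

/// True when the struct is exactly two packed c_int fields, as in the header.
pub fn demo_layout_matches_c() -> bool {
    std::mem::size_of::<demo>() == 2 * std::mem::size_of::<c_int>()
        && std::mem::align_of::<demo>() == std::mem::align_of::<c_int>()
}

/// Reads both escaped fields, mirroring what downstream code would do.
pub fn demo_sum(item: &demo) -> c_int {
    item.r#type + item.r#match
}
```

Raw identifiers only affect how the name is spelled in source, so the layout check should pass exactly as it would for ordinary field names.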

Common Edge Cases

  • Edition-specific keywords: Some words are only reserved from a given Rust edition onward. async, await, and dyn are keywords from the 2018 edition, and the 2024 edition additionally reserves gen. A robust fix must account for the edition the bindings are compiled under.
  • Enum variants and constants: The problem is not limited to struct fields. Enum variants, type aliases, functions, and global constants can all collide with keywords.
  • Name collisions after sanitization: Two foreign names may map to the same Rust-safe identifier. For example, if self is renamed to self_ and the header also declares self_, the output contains a duplicate.
  • Generated modules: If bindgen groups declarations into modules, names like mod or super can break module generation in subtler ways than field emission.
  • Macro-generated headers: In complex C/C++ builds, preprocessor expansion may produce unexpected symbols that are not obvious from the original source header.
  • Downstream ergonomics: Raw identifiers compile, but they can be awkward for users. If you control the foreign API surface, a semantic rename may be more maintainable than exposing many r#name fields.
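The edition-specific case can be sketched as a keyword check that takes the target edition as a parameter. Edition and is_reserved are hypothetical names for this example, and the lists follow the Rust Reference as I understand it, so verify them against the edition you target.

```rust
/// Rust editions, ordered so that later editions compare greater.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Edition {
    E2015,
    E2018,
    E2021,
    E2024,
}

/// Returns true when `name` cannot be used as a plain identifier
/// in code compiled under `edition`.
pub fn is_reserved(name: &str, edition: Edition) -> bool {
    // Strict and reserved keywords shared by every edition.
    const ALWAYS: &[&str] = &[
        "as", "break", "const", "continue", "crate", "else", "enum",
        "extern", "false", "fn", "for", "if", "impl", "in", "let", "loop",
        "match", "mod", "move", "mut", "pub", "ref", "return", "self",
        "Self", "static", "struct", "super", "trait", "true", "type",
        "unsafe", "use", "where", "while", "abstract", "become", "box",
        "do", "final", "macro", "override", "priv", "typeof", "unsized",
        "virtual", "yield",
    ];
    match name {
        // Added in the 2018 edition.
        "async" | "await" | "dyn" | "try" => edition >= Edition::E2018,
        // Reserved starting with the 2024 edition.
        "gen" => edition >= Edition::E2024,
        _ => ALWAYS.contains(&name),
    }
}
```

Weak keywords such as union are intentionally absent, because Rust accepts them as ordinary identifiers.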

FAQ

Why does C or C++ allow these names if Rust rejects them?

Each language defines its own grammar and reserved words. A token that is valid as an identifier in a header may be reserved syntax in Rust, so FFI code generation must translate it safely.

Is using r#type or r#match the correct long-term fix?

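Yes, for most names. Raw identifiers are the idiomatic Rust mechanism for preserving external names that collide with keywords. They are usually better than inventing inconsistent renames unless your project explicitly wants a Rust-native API layer. The exceptions are crate, self, Self, and super, which Rust does not accept as raw identifiers, so those still need a rename.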

Can I solve this without modifying bindgen?

Yes. You can upgrade bindgen, reduce generated declarations with allowlists, or post-process the output in build.rs. However, the most reliable fix is still in the generator itself so every emitted identifier is validated before code is written.

For teams maintaining Rust bindings at scale, the best resolution is to treat keyword sanitization as a mandatory part of code generation. That keeps generated bindings valid across Rust versions, avoids brittle manual patches, and makes future header changes much safer to absorb.
