typetag::serde trait impls leak Postgres symbols into cargo test / coverage binaries (pgrx 0.18+) #2281

@philippemnoel

Summary

This is a follow-up to (and replacement for) #2229, which was filed against pgrx 0.16 and framed around pgrx_embed. pgrx 0.18 removed the pgrx_embed helper binary, so the original symptom is gone — but the underlying problem with #[typetag::serde] and Postgres symbols resurfaces one layer down, in any cargo test binary built from a pgrx extension crate.

Filing fresh so the bug report reflects current pgrx reality.

Environment

  • pgrx 0.18+ (no pgrx_embed)
  • Reproduced on macOS (dyld flat-namespace) and Linux (glibc loader)
  • Reproduced under cargo test, cargo pgrx test, and cargo llvm-cov

Problem

An extension crate (crate-type = ["cdylib"]) that uses #[typetag::serde] on a trait impl — where the impl's methods call into pgrx::pg_sys — fails to load as a unit-test binary:

  • macOS: dyld: symbol not found in flat namespace '_BufferBlocks'
  • Linux: undefined symbol: CurrentMemoryContext / ErrorContext from the loader (even after -Wl,--unresolved-symbols=ignore-all lets ld finish)
  • cargo llvm-cov: coverage instrumentation disables DCE, so the same failure reproduces even on targets where release builds happen to elide the unreachable paths

The failure occurs regardless of whether any #[test] actually references the typetag-annotated type. A bare #[test] fn smoke() {} elsewhere in the crate is enough to trigger it.

Why it happens

#[typetag::serde] expands to inventory::submit!-style "life before main" registration that unconditionally references the trait impl methods. Those methods (in our case) transitively call into pgrx::pg_sys, which references extern globals like CurrentMemoryContext, ErrorContext, and BufferBlocks. Those globals are satisfied at runtime by the Postgres process image when the .so / .dylib is dlopened — but a cargo test binary is a standalone executable that is not loaded into Postgres, so those symbols are unresolved.
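The reachability argument above can be modelled without typetag or pgrx. The sketch below is not typetag's actual expansion. It is a minimal std-only stand-in in which a static table plays the role of the ctor-time registry, and `build_my_type`, `Registration`, `REGISTRY`, and `lookup` are hypothetical names invented for illustration. The point it shows is that the table holds a function pointer to the implementation, so the implementation (and anything it references) stays in the link closure even if no test ever calls it.

```rust
// Hypothetical stand-in for a trait-impl method whose real body would
// call into pgrx::pg_sys (and so reference globals like BufferBlocks).
fn build_my_type() -> &'static str {
    "MyType"
}

// Hypothetical registry entry: a tag plus a pointer to the implementation.
struct Registration {
    tag: &'static str,
    build: fn() -> &'static str,
}

// Stand-in for the registry that is populated before main. Because this
// static stores a pointer to `build_my_type`, the function is referenced
// unconditionally, whether or not any caller looks it up.
static REGISTRY: &[Registration] = &[Registration {
    tag: "MyType",
    build: build_my_type,
}];

// Look up a registration by tag, as a deserializer would at runtime.
fn lookup(tag: &str) -> Option<&'static Registration> {
    REGISTRY.iter().find(|r| r.tag == tag)
}
```

In this model, deleting every call to `lookup` does not remove the reference from `REGISTRY` to `build_my_type`; that is the property that pulls the pg_sys symbols into a test binary in the real case.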

Since cargo test compiles the whole library into the test binary, and the typetag ctor registration is unconditional, the pg_sys symbols are pulled into the link/load closure whether or not any test exercises the code path. Gating the impl with #[cfg(not(test))] mostly works, but it then requires a separate stub impl under cfg(test) to satisfy the upstream typetag-enabled trait, which is awkward.
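For reference, the cfg-gated variant looks roughly like the sketch below. It is std-only and leaves out the #[typetag::serde] attribute and the pgrx calls; the `Query` trait shape, `MyType`, and the return types are assumptions made for illustration, not the actual trait from our codebase.

```rust
// Hypothetical trait standing in for the upstream typetag-enabled trait.
trait Query {
    fn build(&self, n: u32) -> Result<String, String>;
}

struct MyType;

// Real implementation: compiled only outside of unit-test builds.
// In the real crate this body would call into pgrx::pg_sys.
#[cfg(not(test))]
impl Query for MyType {
    fn build(&self, n: u32) -> Result<String, String> {
        Ok(format!("real build {n}"))
    }
}

// Stub implementation: exists only so the trait is still satisfied
// when the crate is compiled with cfg(test).
#[cfg(test)]
impl Query for MyType {
    fn build(&self, _n: u32) -> Result<String, String> {
        Err("stub: not available in unit-test binaries".to_string())
    }
}
```

The awkward part is visible here: every gated impl needs a second, behaviour-free copy, and the two can drift apart silently.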

Current workaround

We route the call through a function pointer that is installed in _PG_init. The typetag ctor then references only a thin trampoline, and the real pg_sys-touching implementation is reachable only at runtime through the installed pointer. Test binaries never call _PG_init, so DCE elides the real implementation and the pg_sys references with it.

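use std::sync::OnceLock;

// Signature of the real, pg_sys-touching builder.
type BuildFn = fn(serde_json::Value, u32) -> anyhow::Result<Box<dyn Query>>;
static BUILD_FN: OnceLock<BuildFn> = OnceLock::new();

// Called from _PG_init; a cargo test binary never calls it.
pub fn init_builder() { BUILD_FN.get_or_init(|| real_impl); }

#[typetag::serde]
impl SomeTrait for MyType {
    fn build(&self, ...) -> ... {
        // Thin trampoline: the only code the typetag ctor references.
        BUILD_FN.get().expect("call init_builder() in _PG_init")(...)
    }
}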
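A self-contained version of the same pattern, with the pgrx and typetag parts stripped out so it compiles against std alone, might look like the following. The signature `fn(u32) -> Result<String, String>`, the body of `real_impl`, and the free function `build` are simplifications assumed for illustration; the sketch also returns an error instead of panicking when the pointer has not been installed, which is a design choice and not what the snippet above does.

```rust
use std::sync::OnceLock;

// Simplified stand-in for the real builder signature.
type BuildFn = fn(u32) -> Result<String, String>;

// Holds the installed implementation; empty until init_builder() runs.
static BUILD_FN: OnceLock<BuildFn> = OnceLock::new();

// Hypothetical stand-in for the implementation that would touch pg_sys.
fn real_impl(n: u32) -> Result<String, String> {
    Ok(format!("built query {n}"))
}

// Intended to be called once from _PG_init. Idempotent: later calls
// leave the already-installed pointer in place.
pub fn init_builder() {
    BUILD_FN.get_or_init(|| real_impl as BuildFn);
}

// Trampoline: the only function the registration would reference.
pub fn build(n: u32) -> Result<String, String> {
    match BUILD_FN.get() {
        Some(f) => f(n),
        None => Err("builder not installed; call init_builder() from _PG_init".to_string()),
    }
}
```

Calling `build` before `init_builder` yields the error branch, and calling it afterwards dispatches to `real_impl`. Whether the linker actually drops `real_impl` from a test binary still depends on nothing else referencing `init_builder`.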

This works, but it forces every author of a typetag-serialized extension type to understand link closures, ctor registration, and pgrx's dlopen model.

What would help from pgrx

This is open-ended. A few options that have come up:

  1. A pgrx-sanctioned pattern for "this code path should not be reachable from a cargo test binary," documented in the README.
  2. A way for extension authors to mark an item as "library-only, never link into the unit-test binary" that pgrx's build harness respects.
  3. Guidance / tooling that makes pg_sys extern globals weak-linked (or stubbed) under cfg(test) so cargo test binaries can load even if they never exercise those paths.
