fix(codegen): length-prefix composite entity_id; NULL-safe; no collision #77
Merged
…ollision

Closes #44.

`build_entity_id_expr` joined composite-PK columns with `'::'`. Any PK value containing `::` collapsed to a different PK's encoded form (the documented collision). Worse, on PostgreSQL `||` returns NULL for the whole expression if any operand is NULL, so any nullable composite-PK column silently broke entity_id generation.

Switch to **length-prefix encoding**: each column emits `LENGTH(CAST(col AS TEXT))::text || ':' || CAST(col AS TEXT)` and the parts are concatenated with no inter-column separator. Explicit lengths disambiguate column boundaries, so distinct PK values across rows can never produce the same encoding, regardless of what characters the values contain.

NULL handling: each part is wrapped in `COALESCE(..., 'N')` so NULL encodes as the literal `'N'`. Distinguishable from empty string (encodes as `'0:'`) and from values starting with `N` (those carry a length prefix). Side effect: also fixes the Postgres NULL-propagation bug.

The issue recommended SHA-256 hashing. Length-prefix achieves the same "no collision risk" property using only plain SQL, no extensions (pgcrypto / SQLite hash extension) and no Postgres-version dependency. The "uniform length" property is sacrificed but isn't needed for correctness, only as an indexing hint, which isn't covered by the acceptance criteria.

Tests:
- `test_entity_id_expr_composite_pk`: asserts the new shape (LENGTH, COALESCE, no '::').
- `test_entity_id_expr_composite_no_separator_collision`: distinct column counts produce distinct shapes; each column gets exactly one length-prefix block; no '::' anywhere.
- `test_entity_id_expr_composite_mongodb_uses_plus_concat`: MongoDB branch uses `+` (not `||`) per the existing convention.

`cargo clippy --all-targets -- -D warnings` clean; 34 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
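To make the collision argument concrete, here is a minimal in-memory model of the two encodings in Rust. It is only a sketch: `encode_joined` and `encode_length_prefixed` are hypothetical names that do not appear in this PR, the real code emits SQL rather than encoding values itself, and the model assumes `LENGTH` counts characters, as it does for text in PostgreSQL.

```rust
/// Old scheme (illustrative model): join the column values with "::".
fn encode_joined(parts: &[&str]) -> String {
    parts.join("::")
}

/// New scheme (illustrative model): each non-NULL value becomes
/// "<character count>:<value>", NULL becomes the literal "N", and the
/// parts are concatenated with no separator between columns.
fn encode_length_prefixed(parts: &[Option<&str>]) -> String {
    let mut out = String::new();
    for part in parts {
        match part {
            Some(value) => {
                // Assumption: the SQL LENGTH() counts characters, not bytes.
                out.push_str(&value.chars().count().to_string());
                out.push(':');
                out.push_str(value);
            }
            // NULL carries no length prefix, so it cannot be confused with
            // the empty string ("0:") or with the value "N" ("1:N").
            None => out.push('N'),
        }
    }
    out
}
```

Under this model the keys `("a::b", "c")` and `("a", "b::c")` both join to `a::b::c`, while their length-prefixed forms are `4:a::b1:c` and `1:a4:b::c`.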
Summary
Per V-L2-B2: composite-PK entity_id was joining columns with `::`, which collides whenever a PK value contains `::`, and (on Postgres) returned NULL whenever any operand was NULL.

Switch to length-prefix encoding: each column emits `LENGTH(CAST(col AS TEXT))::text || ':' || CAST(col AS TEXT)` and parts are concatenated with no inter-column separator. Explicit lengths disambiguate column boundaries, so no separator string can collide. NULL gets the literal `'N'` (distinguishable from empty string `'0:'` and from values starting with `N`, which carry a length prefix).

The issue recommended SHA-256 hashing. Length-prefix achieves the same "no collision risk" property without needing pgcrypto / SQLite hash extensions / Postgres-version dependencies. Tradeoff (no uniform output length) isn't covered by the acceptance criteria.
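For orientation, the expression shape described above could be assembled roughly as follows. This is a sketch under stated assumptions, not the PR's implementation: `length_prefixed_part` and `sketch_entity_id_expr` are hypothetical names, the real `build_entity_id_expr` signature is not shown on this page, column identifiers are assumed to be already safe to interpolate, and only the `||` (SQL) lane is modeled, not the MongoDB `+` lane.

```rust
/// Hypothetical helper: one length-prefix block for a single PK column.
/// If the column is NULL, the inner concatenation is NULL and COALESCE
/// substitutes the literal 'N'.
fn length_prefixed_part(col: &str) -> String {
    format!(
        "COALESCE(LENGTH(CAST({col} AS TEXT))::text || ':' || CAST({col} AS TEXT), 'N')"
    )
}

/// Hypothetical stand-in for the composite branch of the expression builder:
/// one block per column, concatenated with `||` and no separator literal.
fn sketch_entity_id_expr(pk_columns: &[&str]) -> String {
    pk_columns
        .iter()
        .map(|col| length_prefixed_part(col))
        .collect::<Vec<_>>()
        .join(" || ")
}
```

Calling `sketch_entity_id_expr(&["tenant_id", "order_no"])` (illustrative column names) yields two `COALESCE(...)` blocks joined by `||`, with no `'::'` literal anywhere in the output.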
Closes
Test plan
- `cargo clippy --all-targets -- -D warnings` clean
- `cargo test --lib --bins` 34/34 pass (2 new)
- `test_entity_id_expr_composite_no_separator_collision` checks distinct column counts produce distinct shapes
- `test_entity_id_expr_composite_mongodb_uses_plus_concat` keeps MongoDB lane's `+` operator