feat(retrieval): Phase 1 — typed schema and FT.CREATE translation#236
Conversation
3c8b814 to
6ce7dde
Compare
c14395b to
93458ff
Compare
|
Minor DX note: fresh-checkout tests fail until the kit is built Running Vitest resolves the workspace symlink through the kit's Totally fine to leave as-is if this is already planned for one of the next PRs in the stack — just flagging it so it doesn't get lost. |
…nslation - Add RetrievalSchema, FieldSpec, VectorSpec, FtCapabilities types in schema.ts - Implement pure buildFtCreateArgs in ft-create.ts: HNSW (6 pairs/12 params with defaults M=16 EF_CONSTRUCTION=200 EF_RUNTIME=10), FLAT (3 pairs/6 params), all three field types (text/tag/separator/numeric/sortable), metric mapping, textFields capability gate, dims/fieldName/algorithm-param validation - 24 table-driven tests, TDD red→green - Export all public types + builder from index.ts - Remove --passWithNoTests from test script
- Discriminated-union VectorSpec: HnswVectorSpec / FlatVectorSpec split so FLAT cannot carry HNSW params at the type level - Replace validateDims void + as-cast with requireDims narrowing guard - Add indexName/keyPrefix helpers with empty-name validation; export both - resolveVectorFieldName helper eliminates duplicated ?? 'embedding' - validateFlatHnswParams uses 'in' guards, accepts VectorSpec (no any) - METRIC_MAP typed as Record<VectorMetric, string> - Harmonize error messages to include offending value - Replace no-interpolation template literals with single-quoted strings - Prettier pass over all src files - 32 tests (up from 24): FLAT dims missing/invalid, empty/whitespace index name, indexName/keyPrefix unit cases; FLAT+HNSW param throw tests construct invalid objects via property mutation to avoid casts in production code
…sIndexNotFoundError
93458ff to
ee9fb6a
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit ee9fb6a. Configure here.
| 'EF_CONSTRUCTION', | ||
| String(efConstruction), | ||
| 'EF_RUNTIME', | ||
| String(efRuntime), |
There was a problem hiding this comment.
HNSW tuning params unvalidated
Medium Severity
buildVectorArgs applies ?? for m, efConstruction, and efRuntime, so explicit NaN or non-finite values are forwarded into the FT.CREATE vector attribute list (e.g. M becomes the string NaN). dims is validated via requireDims, but HNSW tuning fields are not, so invalid schemas can produce server-rejected commands instead of a clear client error.
Reviewed by Cursor Bugbot for commit ee9fb6a. Configure here.
| for (const name of Object.keys(fields)) { | ||
| if (name.length === 0) { | ||
| throw new Error('Invalid field name: empty field name is not allowed'); | ||
| } |
There was a problem hiding this comment.
Whitespace-only schema field names
Low Severity
validateFieldNames treats only zero-length keys as invalid, while index names and vector fieldName reject whitespace-only strings via trim(). A schema field key consisting only of spaces is accepted and emitted in the FT.CREATE SCHEMA section, which is inconsistent validation and can yield confusing server failures.
Reviewed by Cursor Bugbot for commit ee9fb6a. Configure here.


Summary
Phase 1 of the Retrieval SDK plan, stacked on #234 (Phase 0). Adds the first code of
@betterdb/retrievalplus one deferred Phase 0 review item in the kit.packages/retrieval(@betterdb/retrieval0.1.0), mirroringvalkey-search-kit, with a workspace dep on the kitbuildFtCreateArgs(name, schema, capabilities?)translating the typed index schema (text / tag+separator / numeric+sortable fields; HNSW|FLAT vector as a discriminated union) into the full FT.CREATE argument vector — HNSW defaults M=16 / EF_CONSTRUCTION=200 / EF_RUNTIME=10 always emitted; exportedindexName()/keyPrefix()naming helpersFtCapabilities.textFields— valkey-search < 1.2 rejectsTEXT, so callers on older modules get an actionable error instead of a server failureisIndexNotFoundErrorinvalkey-search-kit: the broad'not found'substring match is now scoped to index errors ('not found'+'index'co-occurrence). Verified against live engines — valkey-search 1.2 emitsIndex with name '…' not found in database 0, Redis 8 emitsNo such index …; both stay matched, generickey not found-style messages no longer misclassify. The semantic-cache characterization lock was deliberately split into positive + negative cases for this.Test Plan
@betterdb/retrievalunit tests: 32/32 (table-driven, full-vector deep equality)@betterdb/valkey-search-kitunit tests: 32/32 (incl. empirically captured engine phrasings)@betterdb/semantic-cachesuite: 191/191 (characterization net intact)tscbuilds clean across the three packagesNote
Medium Risk
The tightened index-not-found heuristic changes runtime error handling for semantic-cache initialization and any other kit consumers; behavior is well-tested but mis-tuned matching could still mis-route FT.INFO failures.
Overview
Introduces
@betterdb/retrieval(0.1.0) as Phase 1 of the retrieval SDK: typed index schema (text / tag / numeric fields plus HNSW|FLAT vector specs) and purebuildFtCreateArgsthat emits the fullFT.CREATEargument vector, withindexName/keyPrefixhelpers andFtCapabilities.textFieldsto fail fast when TEXT fields are used on valkey-search < 1.2.isIndexNotFoundErrorin@betterdb/valkey-search-kitno longer treats every'not found'substring as a missing index; it now requires both'not found'and'index'(plus existing phrasings), so messages likekey not foundare not misclassified. Semantic-cache characterization tests were split into positive index-missing cases and a negative case where generic not-found errors surface asValkeyCommandErrorinstead of triggering index creation.Reviewed by Cursor Bugbot for commit ab188f1. Bugbot is set up for automated code reviews on this repo. Configure here.