feat(retrieval): Phase 1 — typed schema and FT.CREATE translation by jamby77 · Pull Request #236 · BetterDB-inc/monitor

jamby77 · 2026-06-12T12:11:10Z

Summary

Phase 1 of the Retrieval SDK plan, stacked on #234 (Phase 0). Adds the first code of @betterdb/retrieval plus one deferred Phase 0 review item in the kit.

- Scaffold packages/retrieval (@betterdb/retrieval 0.1.0), mirroring valkey-search-kit, with a workspace dep on the kit
- Pure buildFtCreateArgs(name, schema, capabilities?) translating the typed index schema (text / tag+separator / numeric+sortable fields; HNSW|FLAT vector as a discriminated union) into the full FT.CREATE argument vector. HNSW defaults M=16 / EF_CONSTRUCTION=200 / EF_RUNTIME=10 are always emitted. Also exports the indexName() / keyPrefix() naming helpers
- TEXT field emission gated via FtCapabilities.textFields: valkey-search < 1.2 rejects TEXT, so callers on older modules get an actionable error instead of a server failure
- Tighten isIndexNotFoundError in valkey-search-kit: the broad 'not found' substring match is now scoped to index errors ('not found' + 'index' co-occurrence). Verified against live engines: valkey-search 1.2 emits "Index with name '…' not found in database 0" and Redis 8 emits "No such index …"; both stay matched, and generic 'key not found'-style messages are no longer misclassified. The semantic-cache characterization lock was deliberately split into positive + negative cases for this.

Test Plan

@betterdb/retrieval unit tests: 32/32 (table-driven, full-vector deep equality)
@betterdb/valkey-search-kit unit tests: 32/32 (incl. empirically captured engine phrasings)
@betterdb/semantic-cache suite: 191/191 (characterization net intact)
semantic-cache integration suite vs live valkey-bundle (valkey-search 1.2, port 6384): 13/13
tsc builds clean across the three packages

Stacked PR: base is feature/retrieval-sdk-valkey-search-kit (#234). After #234 merges, this will be rebased onto master and retargeted — do not delete the base branch before that.

Note

Medium Risk
The tightened index-not-found heuristic changes runtime error handling for semantic-cache initialization and any other kit consumers; behavior is well-tested but mis-tuned matching could still mis-route FT.INFO failures.

Overview
Introduces @betterdb/retrieval (0.1.0) as Phase 1 of the retrieval SDK: typed index schema (text / tag / numeric fields plus HNSW|FLAT vector specs) and pure buildFtCreateArgs that emits the full FT.CREATE argument vector, with indexName / keyPrefix helpers and FtCapabilities.textFields to fail fast when TEXT fields are used on valkey-search < 1.2.
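The field translation and the TEXT capability gate described above can be sketched roughly as follows. This is a minimal illustration, not the code from ft-create.ts: the `FieldSpec` shape, the `fieldArgs` helper name, the error wording, and the default of `textFields: true` are all assumptions made for the example.

```typescript
// Hypothetical field shapes; the real types live in packages/retrieval/src/schema.ts.
type FieldSpec =
  | { type: 'text' }
  | { type: 'tag'; separator?: string }
  | { type: 'numeric'; sortable?: boolean };

interface FtCapabilities {
  textFields: boolean;
}

// Translates schema fields into the SCHEMA portion of an FT.CREATE argument vector.
// Pure function: no I/O, the caller decides how to send the arguments.
function fieldArgs(
  fields: Record<string, FieldSpec>,
  capabilities: FtCapabilities = { textFields: true }, // assumed default
): string[] {
  const args: string[] = [];
  for (const [name, spec] of Object.entries(fields)) {
    switch (spec.type) {
      case 'text':
        // Fail fast on the client instead of letting an older module reject TEXT.
        if (!capabilities.textFields) {
          throw new Error(`TEXT field '${name}' requires valkey-search >= 1.2`);
        }
        args.push(name, 'TEXT');
        break;
      case 'tag':
        args.push(name, 'TAG');
        if (spec.separator !== undefined) args.push('SEPARATOR', spec.separator);
        break;
      case 'numeric':
        args.push(name, 'NUMERIC');
        if (spec.sortable) args.push('SORTABLE');
        break;
    }
  }
  return args;
}
```

Because the function only builds a string array, it can be checked with deep equality against an expected argument vector, which matches the table-driven style mentioned in the test plan.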

isIndexNotFoundError in @betterdb/valkey-search-kit no longer treats every 'not found' substring as a missing index; it now requires both 'not found' and 'index' (plus existing phrasings), so messages like 'key not found' are not misclassified. Semantic-cache characterization tests were split into positive index-missing cases and a negative case where generic not-found errors surface as ValkeyCommandError instead of triggering index creation.
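A rough sketch of the tightened heuristic, assuming only the two engine phrasings quoted in this PR. The function name `looksLikeIndexNotFound` is hypothetical, and the kit's other "existing phrasings" are not listed in this thread, so they are left out here.

```typescript
// Hypothetical stand-in for the kit's isIndexNotFoundError; only the phrasings
// quoted in this PR are covered.
function looksLikeIndexNotFound(err: unknown): boolean {
  const message = (err instanceof Error ? err.message : String(err)).toLowerCase();

  // Redis 8 phrasing: "No such index ..."
  if (message.includes('no such index')) return true;

  // valkey-search 1.2 phrasing: "Index with name '...' not found in database 0".
  // Requiring both substrings keeps generic "key not found" messages out.
  return message.includes('not found') && message.includes('index');
}
```

With this shape, a message such as "key not found" returns false and would surface to the caller as an ordinary command error rather than triggering index creation.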

Reviewed by Cursor Bugbot for commit ab188f1.

KIvanow · 2026-06-12T15:40:52Z

Minor DX note: fresh-checkout tests fail until the kit is built

Running pnpm --filter @betterdb/semantic-cache test on a clean checkout currently fails with:

Error: Failed to resolve entry for package "@betterdb/valkey-search-kit".
The package may have incorrect main/module/exports specified in its package.json.
  ❯ src/utils.ts:8:1

Vitest resolves the workspace symlink through the kit's main: ./dist/index.js, so the kit's dist/ has to exist before semantic-cache (and soon retrieval, once it actually imports the kit) tests can run. A turbo test → ^build dependency, or pointing the kit's dev-time resolution at src/ (e.g. via publishConfig), would make pnpm test work out of the box.

Totally fine to leave as-is if this is already planned for one of the next PRs in the stack — just flagging it so it doesn't get lost.
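For reference, the turbo option from the note above might look roughly like this fragment of turbo.json. It assumes Turborepo 2.x, where the top-level key is `tasks` (1.x uses `pipeline`), and that the workspace tasks are named `build` and `test`; the repo's actual turbo.json is not shown in this thread.

```json
{
  "tasks": {
    "test": {
      "dependsOn": ["^build"]
    }
  }
}
```

The `^` prefix makes `test` wait for the `build` task of each workspace dependency, so the kit's dist/ would exist before the semantic-cache tests resolve it.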

…nslation
- Add RetrievalSchema, FieldSpec, VectorSpec, FtCapabilities types in schema.ts
- Implement pure buildFtCreateArgs in ft-create.ts: HNSW (6 pairs/12 params with defaults M=16 EF_CONSTRUCTION=200 EF_RUNTIME=10), FLAT (3 pairs/6 params), all three field types (text/tag/separator/numeric/sortable), metric mapping, textFields capability gate, dims/fieldName/algorithm-param validation
- 24 table-driven tests, TDD red→green
- Export all public types + builder from index.ts
- Remove --passWithNoTests from test script

- Discriminated-union VectorSpec: HnswVectorSpec / FlatVectorSpec split so FLAT cannot carry HNSW params at the type level
- Replace validateDims void + as-cast with requireDims narrowing guard
- Add indexName/keyPrefix helpers with empty-name validation; export both
- resolveVectorFieldName helper eliminates duplicated ?? 'embedding'
- validateFlatHnswParams uses 'in' guards, accepts VectorSpec (no any)
- METRIC_MAP typed as Record<VectorMetric, string>
- Harmonize error messages to include offending value
- Replace no-interpolation template literals with single-quoted strings
- Prettier pass over all src files
- 32 tests (up from 24): FLAT dims missing/invalid, empty/whitespace index name, indexName/keyPrefix unit cases; FLAT+HNSW param throw tests construct invalid objects via property mutation to avoid casts in production code
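The discriminated-union split and the narrowing guard listed above could be sketched as follows. The type and helper names echo the commit message, but the member shapes, the metric names, the `FLOAT32` element type, and the `vectorArgs` function are assumptions for illustration, not the merged code.

```typescript
// Hypothetical metric names; METRIC_MAP mirrors the Record<VectorMetric, string> idea.
type VectorMetric = 'cosine' | 'l2' | 'ip';
const METRIC_MAP: Record<VectorMetric, string> = { cosine: 'COSINE', l2: 'L2', ip: 'IP' };

interface BaseVectorSpec {
  fieldName?: string;
  dims?: number;
  metric: VectorMetric;
}
interface HnswVectorSpec extends BaseVectorSpec {
  algorithm: 'hnsw';
  m?: number;
  efConstruction?: number;
  efRuntime?: number;
}
// FLAT has no tuning members, so HNSW params are a type error on a FLAT spec.
interface FlatVectorSpec extends BaseVectorSpec {
  algorithm: 'flat';
}
type VectorSpec = HnswVectorSpec | FlatVectorSpec;

// Narrowing guard: returns a plain number, so callers need no cast afterwards.
function requireDims(dims: number | undefined): number {
  if (dims === undefined || !Number.isInteger(dims) || dims <= 0) {
    throw new Error(`Invalid dims: expected a positive integer, got ${String(dims)}`);
  }
  return dims;
}

function vectorArgs(spec: VectorSpec): string[] {
  const name = spec.fieldName ?? 'embedding';
  const common = [
    'TYPE', 'FLOAT32', // assumed element type
    'DIM', String(requireDims(spec.dims)),
    'DISTANCE_METRIC', METRIC_MAP[spec.metric],
  ];
  if (spec.algorithm === 'flat') {
    // 3 pairs / 6 params
    return [name, 'VECTOR', 'FLAT', String(common.length), ...common];
  }
  // 6 pairs / 12 params; defaults are always emitted explicitly.
  const params = [
    ...common,
    'M', String(spec.m ?? 16),
    'EF_CONSTRUCTION', String(spec.efConstruction ?? 200),
    'EF_RUNTIME', String(spec.efRuntime ?? 10),
  ];
  return [name, 'VECTOR', 'HNSW', String(params.length), ...params];
}
```

Checking `spec.algorithm` narrows the union, so the HNSW-only members are reachable only in the branch where they exist.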

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit ee9fb6a.

cursor · 2026-06-15T11:07:56Z

+    'EF_CONSTRUCTION',
+    String(efConstruction),
+    'EF_RUNTIME',
+    String(efRuntime),


HNSW tuning params unvalidated

Medium Severity

buildVectorArgs applies ?? for m, efConstruction, and efRuntime, so explicit NaN or non-finite values are forwarded into the FT.CREATE vector attribute list (e.g. M becomes the string NaN). dims is validated via requireDims, but HNSW tuning fields are not, so invalid schemas can produce server-rejected commands instead of a clear client error.
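One possible shape for the missing check, as a sketch only: resolve the default first, then validate the resolved value before it is stringified. The helper name `requirePositiveInt` and the error wording are hypothetical, and this is not the fix that was merged (none is shown in this thread).

```typescript
// Hypothetical helper: applies the default, then rejects NaN, Infinity,
// fractions, zero, and negatives before the value reaches FT.CREATE.
function requirePositiveInt(label: string, value: number | undefined, fallback: number): number {
  const resolved = value ?? fallback;
  if (!Number.isInteger(resolved) || resolved <= 0) {
    throw new Error(`Invalid ${label}: expected a positive integer, got ${String(resolved)}`);
  }
  return resolved;
}

// Builds the HNSW tuning pairs with validated values.
function hnswTuningArgs(spec: { m?: number; efConstruction?: number; efRuntime?: number }): string[] {
  return [
    'M', String(requirePositiveInt('m', spec.m, 16)),
    'EF_CONSTRUCTION', String(requirePositiveInt('efConstruction', spec.efConstruction, 200)),
    'EF_RUNTIME', String(requirePositiveInt('efRuntime', spec.efRuntime, 10)),
  ];
}
```

`Number.isInteger` returns false for NaN and for non-finite values, so an explicit `m: NaN` produces a client-side error instead of the string NaN in the argument vector.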

cursor · 2026-06-15T11:07:57Z

+  for (const name of Object.keys(fields)) {
+    if (name.length === 0) {
+      throw new Error('Invalid field name: empty field name is not allowed');
+    }


Whitespace-only schema field names

Low Severity

validateFieldNames treats only zero-length keys as invalid, while index names and vector fieldName reject whitespace-only strings via trim(). A schema field key consisting only of spaces is accepted and emitted in the FT.CREATE SCHEMA section, which is inconsistent validation and can yield confusing server failures.
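A sketch of how the check could be aligned with the trim()-based validation used for index names, assuming the function keeps roughly the signature shown in the diff. The error wording is an assumption, and no fix for this finding is shown in this thread.

```typescript
// Rejects empty and whitespace-only schema field keys before they reach
// the FT.CREATE SCHEMA section.
function validateFieldNames(fields: Record<string, unknown>): void {
  for (const name of Object.keys(fields)) {
    if (name.trim().length === 0) {
      throw new Error(`Invalid field name: '${name}' is empty or whitespace-only`);
    }
  }
}
```

Using `trim()` here means a key such as `'   '` fails on the client with the same kind of message as an empty index name.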

cursor Bot reviewed Jun 12, 2026

Comment thread packages/retrieval/src/ft-create.ts

jamby77 force-pushed the feature/retrieval-sdk-valkey-search-kit branch from 3c8b814 to 6ce7dde Compare June 12, 2026 12:13

jamby77 force-pushed the feature/retrieval-sdk-phase1-schema-builder branch 2 times, most recently from c14395b to 93458ff Compare June 12, 2026 12:24

KIvanow approved these changes Jun 12, 2026

Base automatically changed from feature/retrieval-sdk-valkey-search-kit to master June 15, 2026 11:04

jamby77 added 6 commits June 15, 2026 14:05

feat(retrieval): scaffold @betterdb/retrieval package

4a337a5

fix(valkey-search-kit): scope not-found matching to index errors in i…

eab6b42

…sIndexNotFoundError

fix(retrieval): reject empty vector field name in buildFtCreateArgs

f5ae7bd

chore: normalize pnpm-lock after master rebase install

ee9fb6a

jamby77 force-pushed the feature/retrieval-sdk-phase1-schema-builder branch from 93458ff to ee9fb6a Compare June 15, 2026 11:05

cursor Bot reviewed Jun 15, 2026

chore(retrieval): sync pnpm-lock with master after rebase

ab188f1

jamby77 merged commit 2f8baf1 into master Jun 15, 2026
3 checks passed

jamby77 deleted the feature/retrieval-sdk-phase1-schema-builder branch June 15, 2026 11:11

github-actions Bot locked and limited conversation to collaborators Jun 15, 2026

jamby77 merged 7 commits into master from feature/retrieval-sdk-phase1-schema-builder
