feat(retrieval): Phase 4 — query (vector + filters + hybrid)#242
jamby77 wants to merge 1 commit into
- Add buildFtSearchQuery: KNN query string with TAG/NUMERIC filter clauses
- Add Retriever.query: embed text or accept a precomputed vector, FT.SEARCH,
map rows to { id, score, fields, text } stripping reserved fields
- Reject both/neither text|vector; empty results return []
- Support optional hybrid:'rerank' via an injectable rerankFn
- Export query types, buildFtSearchQuery, SCORE_FIELD
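The input rules in this commit message (exactly one of text or vector, a positive integer k, and rerank requiring both a rerankFn and text) could be checked up front roughly as follows. This is a minimal sketch, not the PR's code: QueryOptionsSketch, validateQueryOptions, and the hasRerankFn flag are hypothetical names, and treating k as a required option is an assumption.

```typescript
// Hypothetical option shape; the real QueryOptions type is not shown in this excerpt.
interface QueryOptionsSketch {
  text?: string;
  vector?: number[];
  k: number; // assumed required here
  hybrid?: "rerank";
}

// Throws when the options violate the rules listed in the commit message.
function validateQueryOptions(
  options: QueryOptionsSketch,
  hasRerankFn: boolean,
): void {
  const hasText = options.text !== undefined;
  const hasVector = options.vector !== undefined;

  // Both set or both missing: reject either way.
  if (hasText === hasVector) {
    throw new Error("Provide exactly one of text or vector");
  }

  const k = options.k;
  if (!isFinite(k) || Math.floor(k) !== k || k <= 0) {
    throw new Error("k must be a positive integer");
  }

  // Rerank needs the original query text and a function to call.
  if (options.hybrid === "rerank" && (!hasRerankFn || !hasText)) {
    throw new Error("hybrid: 'rerank' requires rerankFn and text");
  }
}
```

Running all checks before embedding keeps invalid calls from reaching embedFn or FT.SEARCH at all.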
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit bf41ae5.
      `Query vector dimension mismatch: index expects ${declared}, got ${options.vector.length}`,
    );
  }
  return options.vector;
Vector query skips inferred dims
Medium Severity
resolveQueryVector only compares a precomputed vector to schema.vector.dims. When dims is omitted and the index dimension was inferred via embedFn (as in existing upsert/create flows), text queries still validate through embed → resolveDims, but a vector query accepts any length and can reach FT.SEARCH with a mismatched embedding.
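One possible way to close this gap is to validate against whichever dimension is known, declared or inferred. The sketch below assumes the retriever caches the dimension it inferred through embedFn during create/upsert; that cache, the DimsState shape, and this signature for resolveQueryVector are hypothetical and not taken from the PR.

```typescript
// Hypothetical state: the dimension declared in the schema, or one inferred earlier.
interface DimsState {
  declared?: number; // schema.vector.dims, when the schema provides it
  inferred?: number; // assumed to be cached by create/upsert flows after calling embedFn
}

// Returns the vector unchanged when its length matches the known index dimension.
function resolveQueryVector(vector: number[], dims: DimsState): number[] {
  // Prefer the declared dimension; fall back to the inferred one.
  const expected = dims.declared ?? dims.inferred;
  if (expected !== undefined && vector.length !== expected) {
    throw new Error(
      `Query vector dimension mismatch: index expects ${expected}, got ${vector.length}`,
    );
  }
  return vector;
}
```

If neither dimension is known yet, this sketch lets the vector through unchecked, which is the current behavior; a stricter variant could refuse vector queries until the dimension has been resolved.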
this.schema = options.schema;
this.capabilities = options.capabilities;
this.embedFn = options.embedFn;
this.rerankFn = options.rerankFn;
Upsert allows reserved score field
Medium Severity
This change reserves __score at index creation and maps KNN results with score: Number(hit.fields[__score]), but assertNoReservedFields still only blocks __text and the vector field. Upsert can persist a document __score hash field that may overwrite or mix with the KNN alias in parsed FT.SEARCH fields, yielding incorrect hit scores.
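A guard along these lines would also block __score at upsert time. It is a sketch under assumptions: the real assertNoReservedFields signature is not shown in this excerpt, so the parameters here are hypothetical, while the reserved names come from the PR description.

```typescript
// Reserved names as listed in the PR description.
const TEXT_FIELD = "__text";
const SCORE_FIELD = "__score";

// Hypothetical signature: rejects document fields that collide with reserved names.
function assertNoReservedFields(
  fields: Record<string, unknown>,
  vectorField: string, // schema-specific name of the embedding field
): void {
  const reserved = [TEXT_FIELD, SCORE_FIELD, vectorField];
  for (const name of Object.keys(fields)) {
    if (reserved.indexOf(name) !== -1) {
      throw new Error(`Field name "${name}" is reserved`);
    }
  }
}
```

Deriving the list from the same constants used at index creation would keep the create-time and upsert-time guards from drifting apart.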


Phase 4: query (vector + filters + hybrid)
Stacked on #241 (Phase 3); base is the Phase 3 branch, not master.
What's new
- buildFtSearchQuery(schema, k, filter?): pure builder producing (filter)=>[KNN k @vec $vec AS __score]; TAG → @f:{escapeTag(v)}, NUMERIC → @f:[v v]. Throws on unknown fields, TEXT fields, and non-numeric values for numeric fields.
- Retriever.query(options): embeds text or uses a precomputed vector (exactly one), runs FT.SEARCH … PARAMS 2 vec … LIMIT 0 k DIALECT 2, maps rows → { id (prefix stripped), score, text, fields }; empty → []. Validates that k is a positive integer and that a precomputed vector matches the index dimension.
- hybrid: 'rerank' via an injectable rerankFn(queryText, hits) (requires rerankFn + text).
- fields.ts centralizes the reserved field names (__text, __score); buildFtCreateArgs now rejects schema fields using them.
Public API
Exports buildFtSearchQuery, QueryFilter, QueryHit, QueryOptions, RerankFn, and the TEXT_FIELD / SCORE_FIELD constants.
Tests
20 unit tests (6 builder + 14 query) plus reserved-field guards; full package suite 76/76 green, tsc --noEmit + prettier clean.
Review-driven changes (pre-PR review)
Added: k positive-integer validation, NUMERIC filter value-type check, precomputed-vector dimension check, reserved-field guard at index creation (via shared fields.ts), and a stronger rerank assertion verifying it receives mapped hits.
Deferred (not in this phase)
- RETURN clause to avoid shipping embedding bytes for every hit: Phase 5 (perf/observability).
- Over-fetch (>k candidates then trim): the plan scopes rerank to reordering k; over-fetch is a future enhancement.
- score semantics: KNN returns distance (smaller = closer); the field is named score per the spec, which is worth documenting for callers.
Note
Medium Risk
New search/query path touches embedding, FT.SEARCH construction, and hit mapping; mistakes could affect result correctness or filter semantics, but changes are well-validated in unit tests and don't alter auth or persistence beyond search.
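To make the filter semantics concrete, the builder described in the PR description could look roughly like this. It is a sketch, not the PR's implementation: SketchSchema, the QueryFilter shape, and the escapeTag rule (backslash-escape every character outside A-Z, a-z, 0-9, and underscore) are assumptions, as is using * as the prefix when no filter is given.

```typescript
type FieldType = "TAG" | "NUMERIC" | "TEXT";

// Hypothetical schema shape for illustration only.
interface SketchSchema {
  vectorField: string;
  fields: Record<string, FieldType>;
}

type QueryFilter = Record<string, string | number>;

const SCORE_FIELD = "__score";

// Assumed escaping rule: prefix anything that is not alphanumeric or "_" with a backslash.
function escapeTag(value: string): string {
  return value.replace(/[^A-Za-z0-9_]/g, "\\$&");
}

function buildFtSearchQuery(
  schema: SketchSchema,
  k: number,
  filter?: QueryFilter,
): string {
  const clauses: string[] = [];
  const input = filter ?? {};
  for (const name of Object.keys(input)) {
    if (!Object.prototype.hasOwnProperty.call(schema.fields, name)) {
      throw new Error(`Unknown filter field: ${name}`);
    }
    const type = schema.fields[name];
    const value = input[name];
    if (type === "TEXT") {
      throw new Error(`TEXT fields cannot be used as filters: ${name}`);
    }
    if (type === "NUMERIC") {
      if (typeof value !== "number" || !isFinite(value)) {
        throw new Error(`Filter on ${name} must be a finite number`);
      }
      // Equality expressed as a closed range [v v].
      clauses.push(`@${name}:[${value} ${value}]`);
    } else {
      clauses.push(`@${name}:{${escapeTag(String(value))}}`);
    }
  }
  // Space-separated clauses are AND-combined; "*" matches all documents.
  const prefix = clauses.length > 0 ? `(${clauses.join(" ")})` : "*";
  return `${prefix}=>[KNN ${k} @${schema.vectorField} $vec AS ${SCORE_FIELD}]`;
}
```

With a vector field named embedding, k = 5, and the filter { category: "sci-fi", year: 2020 }, this sketch yields (@category:{sci\-fi} @year:[2020 2020])=>[KNN 5 @embedding $vec AS __score].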
Overview
Adds Phase 4 query to the retrieval package: KNN search over Valkey/RediSearch with optional tag/numeric filters and an optional post-search rerank hook.
buildFtSearchQuery builds the FT.SEARCH dialect string (filter=>[KNN k @vec $vec AS __score]), with TAG escaping and AND-combined clauses; it rejects unknown fields, TEXT filters, and non-numeric values on numeric fields.
Retriever.query accepts text or a precomputed vector (mutually exclusive), validates k and vector dimensions, runs FT.SEARCH with PARAMS/DIALECT 2, and maps hits to { id, score, text, fields } (stripping key prefix and internal __text/__score/embedding fields). hybrid: 'rerank' calls an injectable rerankFn when text is provided.
fields.ts centralizes __text and __score; buildFtCreateArgs now rejects schema fields with those reserved names.
Public exports add query types and buildFtSearchQuery. Unit tests cover the builder, query path, reserved fields, and rerank validation.
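The command assembly and hit mapping in the query path summarized above might be sketched as follows. The function names buildFtSearchArgs and mapHit, the float32 byte encoding of the query vector, and the flat string-to-string shape of a parsed hit are assumptions for illustration; only the argument order and the output shape come from the PR description.

```typescript
const TEXT_FIELD = "__text";
const SCORE_FIELD = "__score";

interface QueryHit {
  id: string;
  score: number; // KNN distance: smaller means closer
  text: string;
  fields: Record<string, string>;
}

// Assembles FT.SEARCH <index> <query> PARAMS 2 vec <blob> LIMIT 0 k DIALECT 2.
function buildFtSearchArgs(
  index: string,
  query: string,
  vector: number[],
  k: number,
): Array<string | Uint8Array> {
  // Assumed encoding: the vector is sent as raw float32 bytes.
  const blob = new Uint8Array(new Float32Array(vector).buffer);
  return [
    index, query,
    "PARAMS", "2", "vec", blob,
    "LIMIT", "0", String(k),
    "DIALECT", "2",
  ];
}

// Maps one parsed hit to the public shape, dropping reserved and embedding fields.
function mapHit(
  key: string,
  raw: Record<string, string>,
  keyPrefix: string,
  vectorField: string,
): QueryHit {
  const fields: Record<string, string> = {};
  for (const name of Object.keys(raw)) {
    if (name !== TEXT_FIELD && name !== SCORE_FIELD && name !== vectorField) {
      fields[name] = raw[name];
    }
  }
  const hasPrefix = key.slice(0, keyPrefix.length) === keyPrefix;
  return {
    id: hasPrefix ? key.slice(keyPrefix.length) : key,
    score: Number(raw[SCORE_FIELD]),
    text: raw[TEXT_FIELD] || "",
    fields,
  };
}
```

Because score carries the raw KNN distance, callers that expect "higher is better" would need to convert it themselves; the right conversion depends on the distance metric configured on the index.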