feat(suggest): add autocomplete / as-you-type suggest endpoint #113
- Add SuggestRequest, SuggestResponse, Suggestion types to cloudsearch-common
- Create suggest_index.rs with SuggestIndex, SuggestEntry, SuggestReader
- Create suggest_writer.rs for binary file read/write
- Add suggest_readers field to IndexHandle, build suggest index during flush
- Add POST /{index}/_suggest API endpoint with multi-field support
- Binary search for O(log n + m) prefix lookup
- Score = doc_freq / n_docs for popularity-based ranking
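The prefix lookup and scoring described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from suggest_index.rs: the `Entry` struct and the `suggest_for_prefix` function are hypothetical stand-ins, the vocabulary is assumed to be a slice sorted by term in byte order, and field weights are left out. The sample values (10 and 5 matching documents out of an assumed 20) are chosen only for illustration.

```rust
/// Hypothetical stand-in for the PR's SuggestEntry; the real fields are not shown here.
#[derive(Debug, Clone)]
struct Entry {
    term: String,
    doc_freq: u32,
}

/// Return up to `size` (term, score) pairs for terms starting with `prefix`.
/// Assumes `entries` is sorted by `term` in byte order.
fn suggest_for_prefix(
    entries: &[Entry],
    prefix: &str,
    n_docs: u32,
    size: usize,
) -> Vec<(String, f64)> {
    // Every string starts with "", so an empty prefix would match the
    // whole vocabulary; an empty index has no meaningful score either.
    if prefix.is_empty() || n_docs == 0 {
        return Vec::new();
    }
    // O(log n): binary search for the first term that is >= prefix.
    let start = entries.partition_point(|e| e.term.as_str() < prefix);
    // O(m): in a sorted slice, all terms sharing the prefix are contiguous.
    let mut hits: Vec<(String, f64)> = entries[start..]
        .iter()
        .take_while(|e| e.term.starts_with(prefix))
        .map(|e| (e.term.clone(), f64::from(e.doc_freq) / f64::from(n_docs)))
        .collect();
    // Rank by popularity, highest score first. This sort costs an extra
    // O(m log m) on top of the lookup itself.
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits.truncate(size);
    hits
}

fn main() {
    let entries: Vec<Entry> = [("apple", 4), ("elastic", 10), ("elasticsearch", 5), ("elder", 3)]
        .iter()
        .map(|&(term, doc_freq)| Entry { term: term.to_string(), doc_freq })
        .collect();
    for (text, score) in suggest_for_prefix(&entries, "elast", 20, 10) {
        println!("{text}: {score}");
    }
}
```

The scores here can never be NaN because `n_docs == 0` is rejected up front, so the sketch uses `f64::total_cmp` for the ranking step.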
- doc_freq: count unique docs per term, not token occurrences
Use BTreeSet<doc_id> per term per field in build_suggest_index
to deduplicate within each document before counting.
Fix: 'foo foo foo' + 'foo bar' → doc_freq('foo') = 2, not 4.
- NaN panic: replace .unwrap() with .unwrap_or(Ordering::Equal)
in the suggest() sort comparator. NaN scores now compare as
equal instead of panicking.
- Stale suggest_readers: reload all suggest sidecar readers
in apply_merge_plan after manifest update, mirroring the
positions_readers reload pattern. Prevents stale readers
pointing to non-existent sidecar files after a merge.
- Empty prefix: return Vec::new() for empty prefix in
suggest_for_field(). Previously '' returned the entire
vocabulary via starts_with('').
- docs: add POST /{index}/_suggest to api-v1.md with request/
response shapes and implementation notes.
- docs: add ADR 0004 for the autocomplete suggest index design.
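Two of the fixes above, the doc_freq deduplication and the NaN-safe comparator, can be illustrated with a short sketch. The function names `doc_freqs` and `cmp_scores_desc` are hypothetical, whitespace tokenization is an assumption, and the per-field dimension of build_suggest_index is omitted for brevity.

```rust
use std::cmp::Ordering;
use std::collections::{BTreeMap, BTreeSet};

/// Count, for each term, how many distinct documents contain it.
/// `docs` holds (doc_id, text) pairs; splitting on whitespace is an
/// assumption made for this sketch, not the PR's tokenizer.
fn doc_freqs(docs: &[(u32, &str)]) -> BTreeMap<String, usize> {
    let mut doc_ids_per_term: BTreeMap<String, BTreeSet<u32>> = BTreeMap::new();
    for (doc_id, text) in docs {
        for token in text.split_whitespace() {
            // Inserting the same doc_id twice is a no-op, so repeated
            // tokens within one document are counted once.
            doc_ids_per_term
                .entry(token.to_string())
                .or_default()
                .insert(*doc_id);
        }
    }
    doc_ids_per_term
        .into_iter()
        .map(|(term, ids)| (term, ids.len()))
        .collect()
}

/// Descending comparator for scores that treats incomparable values
/// (NaN) as equal instead of unwrapping a None.
fn cmp_scores_desc(a: f64, b: f64) -> Ordering {
    b.partial_cmp(&a).unwrap_or(Ordering::Equal)
}

fn main() {
    let freqs = doc_freqs(&[(1, "foo foo foo"), (2, "foo bar")]);
    println!("{:?}", freqs);

    let mut scores = vec![0.25, 0.5, 0.1];
    scores.sort_by(|a, b| cmp_scores_desc(*a, *b));
    println!("{:?}", scores);

    println!("{:?}", cmp_scores_desc(f64::NAN, 1.0));
}
```

One caveat on the design choice: treating NaN as equal to every value is not a strict total order. If NaN scores can actually occur, `f64::total_cmp` may be a more robust comparator, since it defines an ordering for NaN as well.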
Summary
API
Request:
{ "prefix": "elast", "fields": {"title": 1.0, "description": 0.5}, "size": 10 }
Response:
{
  "suggestions": [
    {"text": "elastic", "score": 0.5, "doc_freq": 10, "field": "title"},
    {"text": "elasticsearch", "score": 0.25, "doc_freq": 5, "field": "title"}
  ]
}
Performance