test: add MLT coverage tests for edge cases by poyrazK · Pull Request #114 · poyrazK/cloudSearch

poyrazK · 2026-06-10T15:45:17Z

Summary

Add 5 new MLT coverage tests to crates/cloudsearch-index/tests/coverage.rs:
- mlt_with_doc_id_not_found_returns_error: nonexistent doc_id returns empty results
- mlt_with_like_and_empty_fields_auto_infers_from_like_json: empty fields list auto-inferred from like JSON keys
- mlt_with_min_word_length_filters_short_terms: min_word_length filters short tokens
- mlt_with_max_word_length_filters_long_terms: max_word_length filters long tokens
- mlt_all_terms_filtered_returns_empty_or_error: all terms filtered by min_term_freq returns empty
All 9 MLT tests in coverage.rs now pass
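
To illustrate the field auto-inference case from the list above, here is a minimal sketch. It does not use the crate's real API: `effective_fields` is a hypothetical helper, and a `BTreeMap` stands in for the parsed `like` JSON object.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not from cloudsearch-index): choose the fields an
/// MLT query should analyze. `like` stands in for the parsed `like` JSON
/// object, mapping field name to field text.
fn effective_fields(fields: &[String], like: &BTreeMap<String, String>) -> Vec<String> {
    if fields.is_empty() {
        // No explicit fields were given: fall back to the keys of `like`.
        like.keys().cloned().collect()
    } else {
        // Explicit fields win over inference.
        fields.to_vec()
    }
}
```

With an empty `fields` slice and a `like` map containing `title` and `body`, this sketch returns both keys (in sorted order, because `BTreeMap` iterates its keys in order).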

Summary by CodeRabbit

Tests

Added five new integration tests for More-Like-This (MLT) functionality, covering: non-existent doc IDs, field auto-inference, term filtering by word length constraints, and scenarios where all reference terms are filtered out.

- mlt_with_doc_id_not_found_returns_error: nonexistent doc_id returns empty
- mlt_with_like_and_empty_fields_auto_infers_from_like_json: empty fields list auto-inferred from like JSON keys, terms extracted from like content
- mlt_with_min_word_length_filters_short_terms: min_word_length=4 filters single-char tokens
- mlt_with_max_word_length_filters_long_terms: max_word_length=4 filters tokens longer than threshold
- mlt_all_terms_filtered_returns_empty_or_error: all terms filtered by min_term_freq returns empty results

All 9 MLT tests in coverage.rs now pass (previously 4, now 9).
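
The word-length cases can be sketched as a simple token filter. This is an illustration only: `filter_by_word_length` is a hypothetical function, and treating both bounds as inclusive is an assumption, not something this PR confirms about the crate.

```rust
/// Hypothetical filter (not the crate's implementation): keep tokens whose
/// character count lies within the given bounds. `None` disables a bound.
/// Both bounds are assumed to be inclusive.
fn filter_by_word_length<'a>(
    tokens: &[&'a str],
    min_len: Option<usize>,
    max_len: Option<usize>,
) -> Vec<&'a str> {
    tokens
        .iter()
        .copied()
        .filter(|t| {
            let n = t.chars().count();
            min_len.map_or(true, |m| n >= m) && max_len.map_or(true, |m| n <= m)
        })
        .collect()
}
```

Under these assumptions, a minimum of 4 drops a single-character token but keeps `rust` and `systems`, while a maximum of 4 keeps `rust` and drops `systems`.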

coderabbitai · 2026-06-10T15:46:28Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7da69239-331a-4c25-8e9a-4eeaa16849ec

📥 Commits

Reviewing files that changed from the base of the PR and between 0856717 and 29dd40a.

📒 Files selected for processing (1)

rust/crates/cloudsearch-index/tests/coverage.rs

📝 Walkthrough

Five new integration tests extend MLT (More-Like-This) query coverage in the cloudsearch-index test suite. Tests cover non-existent document handling, field auto-inference, word-length term filtering, and edge cases where filtering eliminates all reference terms.

Changes

MLT Query Test Coverage

| Layer / File(s) | Summary |
|---|---|
| MLT Query Result Handling and Field Inference `rust/crates/cloudsearch-index/tests/coverage.rs` | Tests verify that MLT queries handle non-existent document IDs gracefully by returning empty results, and auto-infer search fields from the provided `like` JSON when the `fields` parameter is empty. |
| MLT Term Filtering and Boundary Conditions `rust/crates/cloudsearch-index/tests/coverage.rs` | Tests validate that MLT queries filter reference terms using the `min_word_length` and `max_word_length` constraints and still match on the longer or shorter tokens that remain, respectively. They also cover the edge case where `min_term_freq` filters out all reference terms, which returns zero hits. |
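
The `min_term_freq` edge case can be sketched as follows. Both functions are hypothetical and only model the behavior described in the table: terms below the frequency threshold are dropped, and an empty term set yields zero hits.

```rust
use std::collections::HashMap;

/// Hypothetical term selection: count how often each token occurs in the
/// reference text and keep only terms seen at least `min_term_freq` times.
fn select_terms(tokens: &[&str], min_term_freq: usize) -> Vec<String> {
    let mut freq: HashMap<&str, usize> = HashMap::new();
    for &t in tokens {
        *freq.entry(t).or_insert(0) += 1;
    }
    let mut kept: Vec<String> = freq
        .into_iter()
        .filter(|(_, n)| *n >= min_term_freq)
        .map(|(t, _)| t.to_string())
        .collect();
    kept.sort(); // deterministic order for the example
    kept
}

/// Hypothetical matcher: a document is a hit if it contains any selected
/// term. With no selected terms there is nothing to match, so the total is 0.
fn total_hits(selected: &[String], docs: &[&str]) -> usize {
    if selected.is_empty() {
        return 0;
    }
    docs.iter()
        .filter(|d| {
            selected
                .iter()
                .any(|t| d.split_whitespace().any(|w| w == t.as_str()))
        })
        .count()
}
```

If every reference term occurs once and `min_term_freq` is 2, `select_terms` returns an empty list and `total_hits` reports zero, which is the outcome the new test asserts.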

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit searched for "docs that match,"
When fields went missing—watch the catch!
With min and max word lengths in place,
And filtering so full of grace,
Five tests now guard the MLT race! 🐰✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding MLT (More-Like-This) coverage tests for edge cases, which aligns with the changeset that adds five new integration tests. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

🧹 Nitpick comments (2)

rust/crates/cloudsearch-index/tests/coverage.rs (2)
973-973: ⚡ Quick win

Comment inaccuracy: reference term should be "rustacean", not "rust".

The comment states `Reference terms: "rust" (from title)` but the `like` JSON on line 979 specifies `"title": "rustacean"`, not `"rust"`. The tokenizer will extract "rustacean" as the reference term from the title field. The test still passes because "systems" matches doc1's body, but the comment is misleading.
📝 Suggested fix
-    // Reference terms: "rust" (from title) and "systems" (from body)
+    // Reference terms: "rustacean" (from title) and "systems" (from body)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/crates/cloudsearch-index/tests/coverage.rs` at line 973, Update the
misleading comment that currently says 'Reference terms: "rust" (from title) and
"systems" (from body)' to reflect the actual title value "rustacean" (i.e.,
change "rust" to "rustacean"); locate the comment near the test that contains
the JSON `title`/`body` ("like" JSON) in the coverage test and modify the
reference-term text so it correctly lists "rustacean" and "systems".
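
The point of this comment can be illustrated with a toy tokenizer. This sketch assumes a tokenizer that lowercases and splits on non-alphanumeric characters without stemming. The crate's actual analyzer is not shown in this PR, and `tokenize` is a hypothetical name.

```rust
/// Hypothetical tokenizer (an assumption, not the crate's analyzer):
/// split on non-alphanumeric characters, lowercase, and apply no stemming.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_lowercase())
        .collect()
}
```

Under that assumption, the title value `rustacean` produces the single token `rustacean`, and the shorter string `rust` never appears as a reference term.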
888-888: ⚡ Quick win

Test name is misleading.

The test name says `returns_error` but the test checks for empty results (`total == 0`), not an error. Based on context snippet 1 (`lib.rs:887-907`), when `build_mlt_bool_query` returns an error for a missing `doc_id`, `Index::search` returns a `SearchResponse` with `total=0` and empty hits rather than propagating the error. Consider renaming to `mlt_with_doc_id_not_found_returns_empty_results` for clarity.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/crates/cloudsearch-index/tests/coverage.rs` at line 888, Rename the test
function mlt_with_doc_id_not_found_returns_error to reflect its actual assertion
(it expects empty results), e.g.,
mlt_with_doc_id_not_found_returns_empty_results; update the async test function
name and any references to it so it matches the new descriptive name and makes
clear it asserts Index::search returns a SearchResponse with total == 0 and
empty hits when build_mlt_bool_query can't find the doc_id.
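
The behavior described in this comment, an internal error turned into an empty response, can be sketched as below. All names here are hypothetical stand-ins: the real `Index::search`, `SearchResponse`, and `build_mlt_bool_query` live in the crate and are not reproduced in this PR.

```rust
/// Hypothetical stand-in for the crate's search response type.
#[derive(Debug, Default, PartialEq)]
struct SearchResponse {
    total: usize,
    hits: Vec<String>,
}

/// Hypothetical query builder: fails when the reference doc_id is unknown.
fn build_mlt_query(doc_id: &str, known: &[&str]) -> Result<String, String> {
    if known.contains(&doc_id) {
        Ok(format!("mlt:{}", doc_id))
    } else {
        Err(format!("document not found: {}", doc_id))
    }
}

/// Hypothetical search entry point: a builder error is not propagated,
/// it is mapped to an empty response (total == 0, no hits).
fn search(doc_id: &str, known: &[&str]) -> SearchResponse {
    match build_mlt_query(doc_id, known) {
        Ok(_query) => {
            // Placeholder for real matching: return every other known doc.
            let hits: Vec<String> = known
                .iter()
                .filter(|d| **d != doc_id)
                .map(|d| d.to_string())
                .collect();
            SearchResponse { total: hits.len(), hits }
        }
        Err(_) => SearchResponse::default(),
    }
}
```

A test against this shape can only assert `total == 0` and an empty `hits` list, because the caller never sees the `Err`. That is why a name ending in `returns_empty_results` describes the assertion more accurately than `returns_error`.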

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 02c52d8d-d733-4b50-930f-fb382c984bd1

📥 Commits

Reviewing files that changed from the base of the PR and between 2be8ac8 and 0856717.

📒 Files selected for processing (1)

rust/crates/cloudsearch-index/tests/coverage.rs

poyrazK

It's okay to merge

coderabbitai Bot reviewed Jun 10, 2026

test(index): strengthen MLT assertions — verify matched doc IDs

29dd40a

- mlt_with_like_and_empty_fields: fix weak >=1 check to verify both docs match (total==2) with correct ranking (doc1 > doc2)
- mlt_with_max_word_length: fix weak >=1 check to verify total==1 and doc2 is the only match

poyrazK merged commit a3fc0e4 into main Jun 10, 2026
4 checks passed

poyrazK merged 2 commits into main from feature/autocomplete-suggest
