feat(fragindex): phase 1 — data structures + codecs#21
Closed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Phase 1 of the fragment-indexed two-tier search design (spec at `/Users/yperez/.claude/plans/msgfplus-fragment-index/design.md`). Internal-only: adds the data-structure foundation with full unit-test coverage. No CLI surface, no runtime behaviour change.
Safe to merge as a no-op addition; Phases 2-7 build on top.
What landed
- 540f194 `EliasFano` stub + empty-list codec
- e4b01d6 `EliasFano` naive int[] encoding
- 4c7e68f `EliasFano.Cursor` zero-copy iterator
- f129d55 `Fingerprint128` 128-bit b/y bit-set + popcount
- e450891 `Slab` (immutable view) + `SlabBuilder` (writable, single-use)
- 276fe08 `FragmentIndexStore` interface + `DirectStore` in-memory impl
All six commits are individually revertible.
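To make the `Fingerprint128` commit concrete, here is a minimal sketch of a 128-bit b/y bit-set with a popcount-of-intersection. It is not the code in this PR. The class name `FingerprintSketch`, the method signatures, and the 0..63 bucket range per ion series are assumptions for illustration. Only the low-64 = b-ion, high-64 = y-ion split and the method names `setBIonBucket`, `setYIonBucket`, and `popcountAnd` come from the description.

```java
/** Hypothetical sketch of a 128-bit fragment fingerprint; not the PR's Fingerprint128. */
public final class FingerprintSketch {
    private long lo; // b-ion bits (assumed: one bit per bucket 0..63)
    private long hi; // y-ion bits (assumed: one bit per bucket 0..63)

    /** Marks a b-ion bucket as present. */
    public void setBIonBucket(int bucket) {
        checkBucket(bucket);
        lo |= 1L << bucket;
    }

    /** Marks a y-ion bucket as present. */
    public void setYIonBucket(int bucket) {
        checkBucket(bucket);
        hi |= 1L << bucket;
    }

    /** Counts buckets set in both fingerprints, across both 64-bit halves. */
    public int popcountAnd(FingerprintSketch other) {
        return Long.bitCount(lo & other.lo) + Long.bitCount(hi & other.hi);
    }

    // Assumed validation; the real class may handle out-of-range buckets differently.
    private static void checkBucket(int bucket) {
        if (bucket < 0 || bucket >= 64) {
            throw new IllegalArgumentException("bucket out of range: " + bucket);
        }
    }

    public static void main(String[] args) {
        FingerprintSketch a = new FingerprintSketch();
        a.setBIonBucket(3);
        a.setBIonBucket(10);
        a.setYIonBucket(7);

        FingerprintSketch b = new FingerprintSketch();
        b.setBIonBucket(3);
        b.setYIonBucket(7);
        b.setYIonBucket(40);

        // Shared buckets: b-ion 3 and y-ion 7.
        System.out.println(a.popcountAnd(b));
    }
}
```

Keeping the two halves as plain `long` fields means the overlap count reduces to two `Long.bitCount` calls with no allocation, which is presumably the property a prefilter built on this structure would rely on.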
Files
New package `edu.ucsd.msjava.fragindex`:
- `EliasFano.java`: sorted-int[] codec with zero-copy `Cursor` iteration. Naive layout; compact Elias-Fano bit-packing lands in Phase 4.
- `Fingerprint128.java`: 128-bit fragment fingerprint, low-64 = b-ion bits, high-64 = y-ion bits. `setBIonBucket` / `setYIonBucket` / `popcountAnd`.
- `Slab.java`: immutable read-only per-slab view. `fingerprint(int)` returns a fresh snapshot (no internal-state leak); `fingerprintLoBits` / `HiBits` for zero-allocation reads; `bucketCursor(int)` returns `EliasFano.Cursor`.
- `SlabBuilder.java`: writable construction buffer. Single-use (`IllegalStateException` after `finish()`).
- `FragmentIndexStore.java`: storage backend interface.
- `DirectStore.java`: in-memory impl of the store.
Test plan
- `mvn -B verify`: 184 tests pass (166 baseline + 18 new Phase 1 tests)
- `DBScanner` / `MSGFPlus` / scoring code paths
- `dev`
- `CompactSuffixArray` or other external state
Reviews
Each task went through spec compliance review + code quality review:
Not in this PR
Phase 2 (build pipeline via extended `BuildSA -buildFragIndex 1`) will be its own PR on top. See `/Users/yperez/.claude/plans/msgfplus-fragment-index/plan.md` for the 7-phase plan; Phase 1 is gated on this PR merging.
🤖 Generated with Claude Code