Reduce mem usage lookups when indexing #6331

Open
PSeitz wants to merge 1 commit into quickwit-oss:main from PSeitz:mem_check
PSeitz (Collaborator) commented Apr 22, 2026

Do mem_usage() calls once per partition group instead of once per doc.

Sort the incoming ProcessedDocBatch by partition and group it with chunk_by, so that get_or_create_indexed_split and the index_writer mem_usage() calls run once per partition group instead of once per doc.
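The sort-then-group idea can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `Doc`, `Writer`, and `index_batch` are hypothetical stand-ins, the `HashMap` lookup only plays the role of get_or_create_indexed_split, and the sketch uses the standard library's `slice::chunk_by` (stable since Rust 1.77), which may not be the `chunk_by` the PR uses.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a processed document; not quickwit's real type.
#[derive(Debug, Clone)]
struct Doc {
    partition: u64,
    num_bytes: usize,
}

// Hypothetical stand-in for a per-partition writer with a costly memory probe.
#[derive(Default)]
struct Writer {
    bytes_indexed: usize,
    mem_usage_calls: usize,
}

impl Writer {
    fn add(&mut self, doc: &Doc) {
        self.bytes_indexed += doc.num_bytes;
    }

    // Assumed to be expensive, so it should be called as rarely as possible.
    fn mem_usage(&mut self) -> usize {
        self.mem_usage_calls += 1;
        self.bytes_indexed
    }
}

// Indexes a batch and returns how many memory probes were issued.
fn index_batch(mut docs: Vec<Doc>, writers: &mut HashMap<u64, Writer>) -> usize {
    // Stable sort: docs of the same partition keep their relative order.
    docs.sort_by_key(|doc| doc.partition);

    let mut probes = 0;
    // After sorting, each chunk is a run of docs sharing one partition.
    for group in docs.chunk_by(|a, b| a.partition == b.partition) {
        // One writer lookup per group (chunks are never empty).
        let writer = writers.entry(group[0].partition).or_default();
        for doc in group {
            writer.add(doc);
        }
        // One memory probe per group instead of one per doc.
        let _mem = writer.mem_usage();
        probes += 1;
    }
    probes
}
```

With this shape, a batch whose docs fall into two partitions triggers two lookups and two probes regardless of how many docs it contains. The trade-off is that memory is only checked at group boundaries, so a single large group can overshoot a limit before the next probe.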

Context

mem_usage() calls are more expensive in moshiki, so checking memory usage per doc would add too much overhead.

PSeitz requested a review from guilload April 23, 2026 09:14
PSeitz changed the title from "sort batch docs by partition to reduce mem usage lookups" to "Reduce mem usage lookups when indexing" Apr 24, 2026