Conversation
Force-pushed from 9789b9f to d5c4c5f
It's happening!
Yes! Finally -- and no regressions this time! I have a pile of stacked PRs. Now that I have some evidence that this PR makes them faster, I'll go back and get them ready to review.
…er` (#21327) ~(Draft until I am sure I can use this API to make FileStream behave better)~

## Which issue does this PR close?
- Part of #20529
- Needed for #21351
- Broken out of #20820
- Closes #21427

## Rationale for this change
I can get 10% faster on many ClickBench queries by reordering files at runtime. You can see it all working together here: #21351

To do so, I need to rework the FileStream so that it can reorder operations at runtime. Eventually that will include both CPU and IO. This PR is a step in that direction, introducing the main Morsel API and implementing it for Parquet. The next PR (#21342) rewrites FileStream in terms of the Morsel API.

## What changes are included in this PR?
1. Add the proposed `Morsel` API
2. Rewrite the Parquet opener in terms of that API
3. Add an adapter layer (back to FileOpener, so I don't have to rewrite FileStream in the same PR)

My next PR will rewrite the FileStream to use the Morsel API.

## Are these changes tested?
Yes, by existing CI. I will work on adding additional tests for just the Parquet opener in a follow-on PR.

## Are there any user-facing changes?
No
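The core idea above — breaking file decoding into small, schedulable units of work driven by an outer loop — can be sketched roughly like this. All names (`Step`, `FileMorselizer`) are illustrative placeholders, not the actual DataFusion types:

```rust
// Hypothetical sketch of a morsel-style unit-of-work API; `Step` and
// `FileMorselizer` are illustrative names, not the actual DataFusion API.
#[derive(Debug, PartialEq)]
enum Step {
    // One batch of decoded output is ready.
    Output(String),
    // All work for this source is finished.
    Done,
}

struct FileMorselizer {
    remaining: Vec<String>,
}

impl FileMorselizer {
    fn new(files: Vec<String>) -> Self {
        Self { remaining: files }
    }

    // Each call performs one small, schedulable unit of work, so a driver
    // (like FileStream) can interleave work from many sources.
    fn step(&mut self) -> Step {
        match self.remaining.pop() {
            Some(f) => Step::Output(format!("decoded {f}")),
            None => Step::Done,
        }
    }
}

fn main() {
    // Adapter-style driver loop: poll morsels until the source is done.
    let mut m = FileMorselizer::new(vec!["a.parquet".into(), "b.parquet".into()]);
    while let Step::Output(batch) = m.step() {
        println!("{batch}");
    }
}
```

Because each `step` call is small, a scheduler is free to reorder calls across sources — which is what enables the runtime file reordering described above.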
Force-pushed from acde88a to 6b79a6f
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
I asked Claude to write some tests on Shared mode and it found some potential concerns.
Adds a `FileStreamState::Prefetch` variant that drives multiple planner I/O operations concurrently, so I/O for upcoming files overlaps with CPU decoding of the current file. In-flight morsel-producing work (pending I/O + ready planners + ready morsels + active reader) is capped at 20 to bound buffering. Enabled by default via `FileStreamBuilder` (legacy single-I/O `ScanState` remains available with `with_prefetch(false)`). Stacks on top of apache#21351. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
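The in-flight cap described above can be sketched with a bounded channel: the producer (simulated I/O) runs ahead of the consumer (simulated decode), but only up to a fixed number of buffered results. The constant 20 matches the cap in the description; everything else here is an illustrative stand-in for the real `FileStreamState::Prefetch` machinery:

```rust
// Sketch of bounding in-flight prefetch work with a bounded channel.
// Illustrative only; the actual FileStream uses its own state machine.
use std::sync::mpsc::sync_channel;
use std::thread;

const MAX_IN_FLIGHT: usize = 20; // cap from the description above

fn main() {
    // A sync_channel blocks the sender once MAX_IN_FLIGHT results are
    // buffered but not yet consumed, so I/O cannot run unboundedly ahead.
    let (tx, rx) = sync_channel::<usize>(MAX_IN_FLIGHT);

    // "I/O side": opens upcoming files ahead of time.
    let producer = thread::spawn(move || {
        for file_id in 0..100 {
            tx.send(file_id).unwrap();
        }
    });

    // "CPU side": consumes results as decoding capacity frees up.
    let total: usize = rx.iter().sum();
    producer.join().unwrap();
    println!("processed sum = {total}"); // prints "processed sum = 4950"
}
```

The point of the bound is the same as in the PR: overlap I/O with CPU decode while keeping buffering (and memory use) capped.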
I think this is a real bug (for metric reporting). I have pushed a fix for it. Thank you @comphead
Dandandan left a comment: I think it looks correct and the changes also look quite simple to me (after the refactoring PRs).
run benchmarks
🤖 Benchmark running (GKE): comparing alamb/reschedule_io (323a5ca) to 7731130 (merge-base) using clickbench_partitioned, tpcds, and tpch.
🤖 Benchmark completed (GKE): results for clickbench_partitioned (base vs branch) and tpcds (base vs branch).
(With filter pushdown, ClickBench is under 18s!)
Looking forward to it!
## Which issue does this PR close?

## Rationale for this change
This PR finally enables dynamic work scheduling in the FileStream (so that if a task is done, it can look at any remaining work).

This improves performance on queries that scan multiple files where the work is not balanced evenly across the partitions in the plan (e.g. when dynamic filtering reduces work significantly).
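The dynamic-scheduling idea can be sketched with a shared work queue: instead of statically assigning files to partitions up front, any idle worker pulls the next remaining file, so an early-finishing partition picks up leftover work. Names and structure here are illustrative, not the actual FileStream implementation:

```rust
// Hedged sketch of dynamic work scheduling across partitions.
// Illustrative only; not the actual FileStream code.
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Eight "files" of work, shared by all workers.
    let queue: Arc<Mutex<Vec<u32>>> = Arc::new(Mutex::new((0..8).collect()));
    let processed = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..3) // three "partitions"
        .map(|_| {
            let q = Arc::clone(&queue);
            let p = Arc::clone(&processed);
            thread::spawn(move || loop {
                // Whichever worker is free takes the next remaining file.
                let next = q.lock().unwrap().pop();
                match next {
                    Some(file) => p.lock().unwrap().push(file),
                    None => break, // no work left for this worker
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    let mut done = processed.lock().unwrap().clone();
    done.sort();
    println!("{done:?}"); // every file processed exactly once
}
```

With static assignment, a partition holding the cheap files finishes early and sits idle; with the shared queue, that idle capacity is spent on the remaining files, which is where the speedup on unbalanced scans comes from.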
It is the last of a sequence of several PRs:
- `ParquetOpener` to `ParquetMorselizer`: #21327

## What changes are included in this PR?
Note there are a bunch of other things that are NOT included in this PR. As @Dandandan proposes below, I expect we can work on those changes as follow-on PRs.
## Are these changes tested?
Yes, by existing functional and benchmark tests, as well as new functional tests.

## Are there any user-facing changes?
Yes, faster performance (see benchmarks): #21351 (comment)