Skip to content

Remove batch_coalescing_below_network_boundaries rule#407

Draft
EdsonPetry wants to merge 2 commits intodatafusion-contrib:mainfrom
EdsonPetry:edson.petry/remove-coalesce-batch-exec
Draft

Remove batch_coalescing_below_network_boundaries rule#407
EdsonPetry wants to merge 2 commits intodatafusion-contrib:mainfrom
EdsonPetry:edson.petry/remove-coalesce-batch-exec

Conversation

@EdsonPetry
Copy link
Copy Markdown
Contributor

@EdsonPetry EdsonPetry commented Apr 16, 2026

Summary

Removes the batch_coalescing_below_network_boundaries optimizer rule that inserted the DataFusion-53-deprecated CoalesceBatchesExec below every network boundary. Closes #386.

Also removes the distributed.shuffle_batch_size config field and its DistributedExt setters, they existed only to configure this rule.

Why

  • DataFusion 53 deprecated CoalesceBatchesExec. The rule was carrying #[expect(deprecated)] to keep using it.
  • Issue Remove CoalesceBatchesExec from plans #386 hypothesized that the rule's data-copy overhead outweighed the wire-efficiency gains. Confirmed empirically, see benchmarks below.
  • At default configuration the rule was already a no-op (shuffle_batch_size=8192execution.batch_size=32768 -> rule early-returned). Real plans never got CoalesceBatchesExec inserted in production callers.

Benchmark findings

Ran on a 4-worker EC2 cluster in us-east-1. Three configurations exercised:

Config batch_size shuffle_batch_size CoalesceBatchesExec in TPC-H plans CoalesceBatchesExec in TPC-DS plans
main @ defaults 32,768 32,768 0 0
main with rule firing 32,768 65,536 137 1,091
this PR @ defaults 32,768 n/a 0 0
this PR @ batch_size=65,536 65,536 n/a 0 0

Fair-comparison: matched effective wire batch size

Both sides produce ~65K-row batches over the wire. Main via the rule's post-hoc coalescing (batch_size=32K, shuffle_batch_size=65K); this PR via native batching (batch_size=65K).

Suite Rule-active main This PR (batch_size=65K) Δ
TPC-H SF10 74,138 ms 66,037 ms 1.12× faster
TPC-DS SF1 144,019 ms 129,162 ms 1.12× faster

Top TPC-H wins: q18 −23%, q7 −19%, q17 −18%, q9 −16%, q22 −20%, q5 −13%.
Top TPC-DS wins: q34 −48%, q15 −43%, q32 −40%, q14 −24%, q72 −13% (absolute: −1s on a 7.7s query).

Confirms the hypothesis from #386: native-large-batch path beats coalesce-after-the-fact by ~12% across both suites. The rule was costing more in data copies than it saved in wire efficiency.

Migration note

The only known caller of distributed.shuffle_batch_size is our own benchmarks/cdk/bin/datafusion-bench.ts — updated in this PR. External callers wanting larger wire batches should set datafusion.execution.batch_size directly. As a bonus, they'll gain ~12% throughput.

…trib#386)

DataFusion 53 deprecated CoalesceBatchesExec. The optimizer rule that
inserted it below network boundaries is dropped, along with the
distributed.shuffle_batch_size config and its DistributedExt setters,
the CDK benchmark flag that fed it, and the doc/example plan snippets
that referenced it.
@EdsonPetry EdsonPetry force-pushed the edson.petry/remove-coalesce-batch-exec branch from 523eccf to 3ea11ef Compare April 16, 2026 21:17
@EdsonPetry EdsonPetry changed the title Remove batch_coalescing_below_network_boundaries rule (#386) Remove batch_coalescing_below_network_boundaries rule Apr 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove CoalesceBatchesExec from plans

1 participant