Auto-detect starting ID from partition range in synchronize #20

Merged
stefanmb merged 4 commits into master from stefanmb/autodetect_id on Feb 13, 2026

@stefanmb commented Feb 12, 2026

Summary

  • When --start is omitted, synchronize now uses the intermediate table's
    partition range to compute the starting ID, instead of scanning the entire
    source table with an unbounded MIN(id).
  • Batch queries are also time-bounded to the partition range, preventing
    "no partition found" errors when source rows fall outside existing partitions.
  • Shared partition resolution helpers (transformIdValue,
    resolvePartitionContext, resolvePartitionTimeFilter) are consolidated
    in table.ts, eliminating duplication across synchronizer.ts and
    filler.ts.
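
The scoping described in the first two bullets can be sketched as a small query builder. This is a minimal illustration, not the code in this PR: `TimeRange`, `minIdQuery`, and `quoteIdent` are hypothetical names, the `id` column name is hard-coded, and the half-open range (inclusive start, exclusive end) is an assumption about how the partition bounds are expressed.

```typescript
// Hypothetical shape of a partition time range; not pgslice's real type.
interface TimeRange {
  column: string; // partitioning column, e.g. "created_at"
  start: Date;    // assumed inclusive lower bound of the oldest partition
  end: Date;      // assumed exclusive upper bound of the newest partition
}

interface Query {
  text: string;
  values: string[];
}

// Quote a PostgreSQL identifier by doubling embedded double quotes.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Build the MIN(id) lookup. Without a range it scans the whole table;
// with a range it only considers rows that fit an existing partition.
function minIdQuery(table: string, range?: TimeRange): Query {
  const base = `SELECT MIN(id) FROM ${quoteIdent(table)}`;
  if (!range) {
    return { text: base, values: [] };
  }
  const col = quoteIdent(range.column);
  return {
    text: `${base} WHERE ${col} >= $1 AND ${col} < $2`,
    values: [range.start.toISOString(), range.end.toISOString()],
  };
}
```

The same `WHERE` fragment would also apply to each batch query, which is what keeps out-of-range rows from reaching a table that has no partition for them.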

Motivation

Fill has always resolved partition settings from the destination table and
applied a time filter to its batch queries. Synchronize was the odd one out —
it did a bare MIN(id) across the entire source table and had no time bounds
on batch fetches.

Callers no longer need to manually compute a starting ID based on partition
boundaries. The --start option still works for explicit overrides, but the
default behavior is now partition-aware.

What changed

  • table.ts: transformIdValue exported (with nullable/non-nullable
    overloads). New resolvePartitionContext returns settings, partitions, and
    time filter in one call. resolvePartitionTimeFilter is a convenience
    wrapper. partitions() signature widened to CommonQueryMethods.
  • synchronizer.ts: Uses shared helpers. init passes the time filter
    to sourceTable.minId() when --start is omitted. #fetchBatch applies
    the time filter to batch queries.
  • filler.ts: Replaced inline partition resolution and local
    transformIdValue with shared imports from table.ts.
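
The "nullable/non-nullable overloads" mentioned for transformIdValue refer to a TypeScript pattern in which the return type follows the input's nullability. The sketch below shows the pattern only, using a hypothetical `normalizeId` helper; what the real transformIdValue does to an ID is not shown in this description, so the conversion here (numeric strings become numbers) is purely an assumption.

```typescript
// Hypothetical ID type for illustration only.
type RawId = number | string;

// Overload 1: a non-null input always yields a non-null result.
function normalizeId(value: RawId): RawId;
// Overload 2: a possibly-null input yields a possibly-null result.
function normalizeId(value: RawId | null): RawId | null;
// Implementation signature shared by both overloads.
function normalizeId(value: RawId | null): RawId | null {
  if (value === null) {
    return null;
  }
  // Assumed transformation: turn all-digit strings into numbers.
  if (typeof value === "string" && /^\d+$/.test(value)) {
    return Number(value);
  }
  return value;
}

// Callers holding a non-null ID need no null check on the result.
const definite: RawId = normalizeId("42");
// Callers holding a nullable ID (e.g. MIN(id) on an empty table) keep the null.
const maybe: RawId | null = normalizeId(null);
```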

Test plan

  • New test: verifies only in-range rows are synchronized when --start
    is omitted and partitions exist
  • Existing synchronize tests continue to pass
  • Integration tests with PGSLICE_URL

🤖 Generated with Claude Code

When --start is omitted, synchronize now resolves the partition time
range from the intermediate table and uses it to scope the initial
MIN(id) lookup and batch queries — matching how fill already works.

Shared partition resolution logic (transformIdValue, resolvePartitionContext,
resolvePartitionTimeFilter) is consolidated in table.ts, eliminating
duplication across synchronizer.ts and filler.ts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@stefanmb stefanmb requested a review from mthadley February 12, 2026 21:30
} else {
  // Get max from dest - resume from where we left off (exclusive)
  const destMaxId = await destTable.maxId(tx);
  startingId = destMaxId ?? undefined;
Collaborator

The description says that fill auto-calculates the starting ID. Is that true? Asking because, from the code above, it looks like it actually selects the greatest ID from the destination table (destTable is generally the intermediate table).

So if you are resuming a previous fill, it'll start where it left off. But if it's the first fill, then startingId will be undefined. Does this mean we are relying on the timeFilter to ensure that fill gets the oldest ID from the source table that would fit into the oldest partition?

Curious what you found from testing. This part of the code was always a bit murky to me.

Author

The description could've been clearer. The behaviour of fill is unchanged. The auto-detection is the new synchronize behaviour.

As you pointed out, during fill's first run startingId is undefined, so timeFilter does the scoping. What I changed here is to make synchronize partition-aware too, by computing a time-bounded minId instead of an unbounded MIN(id).

I added a test ("scopes to partition range on first fill") that demonstrates the fill behaviour explicitly: it inserts an in-range and an out-of-range row and asserts that only the in-range row gets filled.

The test is the fill counterpart to the synchronize test at https://github.com/workos/pgslice/pull/20/changes#diff-61fad5299499932598bc32e477627a209e7dfaa4be682eaac2b88db17c7b2ed3R292
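
To make the first-run versus resume distinction concrete, here is a rough sketch of how the two conditions could combine in a fill batch query. It is not the code from filler.ts: `fillBatchWhere` and its parameters are hypothetical, and the column name is assumed to be trusted since it is interpolated directly.

```typescript
interface BatchWhere {
  text: string;
  values: (number | string)[];
}

// Build the WHERE clause for one fill batch.
// - startingId undefined (first run): only the time filter scopes the batch.
// - startingId defined (resume): rows after it are fetched, exclusive.
function fillBatchWhere(
  startingId: number | undefined,
  timeColumn: string, // assumed to be a trusted identifier
  startIso: string,   // assumed inclusive lower bound of the partition range
  endIso: string      // assumed exclusive upper bound of the partition range
): BatchWhere {
  const clauses: string[] = [];
  const values: (number | string)[] = [];

  if (startingId !== undefined) {
    values.push(startingId);
    clauses.push(`id > $${values.length}`);
  }
  values.push(startIso);
  clauses.push(`"${timeColumn}" >= $${values.length}`);
  values.push(endIso);
  clauses.push(`"${timeColumn}" < $${values.length}`);

  return { text: clauses.join(" AND "), values };
}
```

On a first run the out-of-range row is excluded by the time bounds alone; on a resume the `id >` condition is simply added in front of them.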

Collaborator

Cool, appreciate you taking another look.

stefanmb and others added 3 commits February 13, 2026 16:47
Moves `resolvePartitionContext` and `resolvePartitionTimeFilter` from
free functions to `Table#partitionContext` and `Table#partitionTimeFilter`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@stefanmb stefanmb merged commit a1e81d3 into master Feb 13, 2026
7 checks passed
@stefanmb stefanmb deleted the stefanmb/autodetect_id branch February 13, 2026 22:59