fix: support Curvine LanceDB clone and conditional writes by jlon · Pull Request #824 · CurvineIO/curvine

jlon · 2026-05-14T11:07:32Z

What changed

adds Curvine-aware shallow clone_table wiring for curvine:// LanceDB databases
supports source version and source tag resolution for shallow clone targets
implements PutMode::Update with Curvine-backed conditional write locking and eTag checks
extends object-store semantic coverage for conditional updates, multipart interactions, and missing-object behavior
expands minicluster e2e coverage for LanceDB table lifecycle, clone-table flows, schema/version/tag behavior, and object-store conditional writes

Why

This is PR 4 in the LanceDB on Curvine stack. It builds on PR #823 by adding the LanceDB semantics that require conditional object updates and correct URI/path handling across clone operations.

Stack

feat: add curvine lancedb facade scaffold #799: facade scaffold, merged
feat: add curvine object store scaffold #801: object store/provider/session scaffold, merged
feat: complete Curvine LanceDB object store contract #823: working Curvine object-store contract
this PR: shallow clone and conditional write semantics

Validation

cargo fmt --all
cargo clippy -p curvine-lancedb-rs --all-targets --all-features -- -D warnings
cargo test -p curvine-lancedb-rs
cargo clippy -p curvine-tests --test lancedb_object_store_e2e -- -D warnings

szbr9486 · 2026-05-15T07:33:19Z

+        }
+    }
+
+    async fn release_object_write_lock(&self, guard: &ConditionalWriteLock) -> OsResult<()> {


The lock acquisition/release pair has two correctness gaps that LanceDB conditional writes will hit in production:

No ownership check on release. release_object_write_lock unconditionally sends LockType::UnLock against guard.path. If process A acquired the lock, hit the 120 s timeout doing nothing (or stalled on GC), and process B subsequently re-acquired the same lock and started its critical section, process A's release call will clear B's lock and let C charge in. The conditional-update etag check on disk still narrows the race window, but the staging+rename sequence is no longer the only writer between B's head() and B's rename(). Consider gating release on the same owner_id/pid tuple that acquired the lock (server-side compare-and-clear), or returning the previous lock state and refusing to clear a foreign owner.

No TTL / lease, no crash recovery. conditional_write_lock writes acquire_time: 0 and there's no heartbeat or expiry. If the LanceDB process is kill -9'd (OOM, deploy roll, machine reboot) mid-put_update, the advisory lock at /.curvine/lancedb/locks/<md5> stays held forever from Curvine's point of view. Every subsequent conditional write to that key will hit Timed out waiting for Curvine object write lock after 120 s and never recover, and the lock files accumulate at /.curvine/lancedb/locks/ with no GC. At minimum this needs:

A reasonable TTL/lease on the lock (so server reaps abandoned holders).

A non-zero acquire_time so server-side reaping has a basis.

A periodic cleanup path for the lock-file directory, or a documented operator runbook for the leak.

These are blockers for using PutMode::Update as a real "this is safe to commit metadata on" primitive, which is the whole point of the conditional path.

Checked this against the Curvine server lock implementation and I do not think the two core risks apply to the current code path.

Release is owner-scoped: LockMeta::set_flock removes by key = client_id + owner_id, and this facade stores the same generated owner_id in the guard used for unlock. A stale A unlock cannot remove B’s later lock because B has a different owner key.

TTL/acquire_time are server-side: LockMeta::set_flock overwrites acquire_time = LocalTime::mills(), and MasterFilesystem::set_lock passes master.lock_expire_time_ms() (default 5m). The client sends 0, but that is not persisted for acquired locks.

The lock files themselves may remain as marker files under /.curvine/lancedb/locks, but the held lock state is stored in inode lock metadata and is expired by the existing Curvine lock path. I am leaving this unchanged in this PR.

szbr9486 · 2026-05-15T07:33:43Z

+            .cloned()
+            .or_else(|| env::var(ClusterConf::ENV_CONF_FILE).ok())
+            .ok_or_else(missing_curvine_config_error)?;
+        Ok(format!("{CURVINE_SCHEME}${conf_path}"))


Two concerns about embedding the raw config-file path in the store prefix:

Cache key fragility. Lance uses the prefix as the dedup key in the ObjectStoreRegistry. Two callers pointing at the same Curvine cluster via different conf paths — /etc/curvine/cluster.toml vs ~/.curvine/cluster.toml, or absolute vs symlinked, or the conf file in two working dirs of the same binary — will get two distinct CurvineObjectStore instances, double the Curvine client connections, and double the curvine_session() allocations. For an integration like this where the intent is "one store per cluster", consider deriving the prefix from the cluster's master endpoint(s) parsed out of the conf (or a hash of the resolved cluster identity) rather than the raw file path.

Mandatory config at prefix-calculation time. Today calculate_object_store_prefix is called by Lance on essentially every URI parse, including paths that go through the registry but don't ultimately need a Curvine connection (read-only manifest probes, listing, etc.). Making the function fail with "Missing Curvine cluster configuration" when the env var isn't set is a hard coupling that surprised me reading clone_table in connection.rs; consider falling back to a stable curvine$<authority>/<path> identity when the conf isn't resolvable, and only failing in new_store where the configuration is actually used.

Fixed in d6f4199d.

calculate_object_store_prefix no longer embeds the raw config-file path and no longer fails just because config is absent during prefix calculation. With a config file it parses the Curvine cluster config and derives the registry identity from the resolved master endpoints; without config it falls back to a stable URI-derived identity and leaves the real configuration failure to new_store.

I also added facade tests for both behaviors.

szbr9486 · 2026-05-15T07:33:44Z

+                    store: CURVINE_SCHEME,
+                    source: e.to_string().into(),
+                })?;
+                if self.dir_contains_visible_file(&child).await? {


dir_contains_visible_file does a full recursive directory walk for every top-level subdirectory found during list_with_delimiter, which inflates the RPC count significantly:

list_with_delimiter(None) on a workspace with N top-level directories and on average M files per subtree will issue O(N * depth) list_status RPCs and walk O(N * M) files just to decide whether the directory should appear as a common_prefix. LanceDB calls list_with_delimiter on hot paths (manifest fanout, version table discovery), so this scales poorly as the dataset count in a Curvine workspace grows.

The walk also has no caching, so back-to-back list_with_delimiter calls pay the full cost again.

A couple of alternatives that keep the "don't return empty / staging-only prefix" semantics without the full walk:

Stop the recursion as soon as the first user-visible file is found — the current Box::pin recursion already does this, but it still has to do the full walk for genuinely-empty staging-only directories (which is the common case during multipart in-flight). A shallower test ("the subtree contains at least one file that isn't .curvine/...") is cheap if Curvine exposes a has_visible_descendant server-side primitive, or by special-casing the multipart/lock root paths (which are the only known "filler" sources right now) and skipping them at the listing level rather than after the fact.

Alternatively, drop the visibility pruning entirely and let LanceDB see the empty prefix — Lance's manifest discovery code can already tolerate prefixes that don't contain a .lance manifest.

Fixed in d6f4199d.

list_with_delimiter no longer calls dir_contains_visible_file, so it does not recursively walk each candidate directory. It now filters only known internal Curvine LanceDB directories (/.curvine/lancedb/multipart and /.curvine/lancedb/locks) and otherwise returns the immediate directory as a common_prefix, matching normal delimiter semantics and avoiding the hot-path O(N*M) scan.

This was referenced May 14, 2026

fix: use safe Curvine commits and streamed object IO #825

Open

fix: prepare Curvine LanceDB crate for git usage #826

Open

szbr9486 reviewed May 15, 2026

View reviewed changes

fix: support Curvine LanceDB clone and conditional writes

2d63c0e

jlon force-pushed the feat/lancedb-on-curvine-pr04 branch from 584ebe8 to 2d63c0e Compare May 15, 2026 08:57

jlon changed the base branch from feat/lancedb-on-curvine-pr03 to main May 15, 2026 08:58

fix: tighten Curvine LanceDB store identity and listing

d6f4199

szbr9486 approved these changes May 15, 2026

View reviewed changes

szbr9486 merged commit 258a58d into CurvineIO:main May 15, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: support Curvine LanceDB clone and conditional writes#824

fix: support Curvine LanceDB clone and conditional writes#824
szbr9486 merged 2 commits into
CurvineIO:mainfrom
jlon:feat/lancedb-on-curvine-pr04

jlon commented May 14, 2026

Uh oh!

szbr9486 May 15, 2026

Uh oh!

jlon May 15, 2026

Uh oh!

szbr9486 May 15, 2026

Uh oh!

jlon May 15, 2026

Uh oh!

szbr9486 May 15, 2026

Uh oh!

jlon May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jlon commented May 14, 2026

What changed

Why

Stack

Validation

Uh oh!

szbr9486 May 15, 2026

Choose a reason for hiding this comment

Uh oh!

jlon May 15, 2026

Choose a reason for hiding this comment

Uh oh!

szbr9486 May 15, 2026

Choose a reason for hiding this comment

Uh oh!

jlon May 15, 2026

Choose a reason for hiding this comment

Uh oh!

szbr9486 May 15, 2026

Choose a reason for hiding this comment

Uh oh!

jlon May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants