Feature/drs-server #218

Open

matthewpeterkort wants to merge 40 commits into main from feature/local-drs-server

Conversation

@matthewpeterkort
Contributor

@matthewpeterkort matthewpeterkort commented Mar 26, 2026

PR Summary (Reviewer Guide)

This PR is a large architectural migration of git-drs from the older indexd-centric and transfer-style paths to a DRS/data-client-based flow with local-server support, stronger command/test coverage, and scoped bucket behavior.

See integration tests for use case details:

tests/
    e2e-gen3-remote-addurl.sh
    e2e-gen3-remote-full.sh
    e2e-local-full.sh

Tests are run with an env file that looks like this:

TEST_BUCKET_SECRET_KEY=
TEST_GITHUB_TOKEN=
TEST_COLLAB_USER_EMAIL=
TEST_PRINT_TOKEN=true
TEST_CREATE_BUCKET_BEFORE_TEST=true	
TEST_SERVER_MODE=remote 
TEST_GITHUB_MODE=true 
TEST_BUCKET_NAME=cbds
TEST_BUCKET=cbds
TEST_BUCKET_REGION=us-east-1
TEST_BUCKET_ACCESS_KEY=
TEST_BUCKET_ENDPOINT=
TEST_ORGANIZATION=calypr
TEST_PROJECT_ID=end_to_end_test
GEN3_PROFILE=local

Broad Strokes of What Changed

1) Client architecture shift (core change)

  • Consolidated around DRS/data-client style clients.
  • Added/expanded local client behavior for local DRS server flows.
  • Removed/retired legacy patterns that assumed previous indexd coupling.

Primary areas:

  • client/drs/*
  • client/local/*
  • client/client.go

2) Command workflow updates

  • Push/pull/prepush/precommit/addurl/delete flows were updated to match new client behavior.
  • Remote config handling and local remote support were tightened.

Primary areas:

  • cmd/addurl/*
  • cmd/prepush/*
  • cmd/precommit/*
  • cmd/delete/*
  • cmd/remote/*

3) Bucket scoping/mapping support

  • Added bucket map/scope resolution logic to support org/project-level routing and prefixes.
  • Added tests for mapping precedence and prefix handling.

Primary areas:

  • gitrepo/bucket_map.go
  • gitrepo/bucket_scope.go
  • related tests in gitrepo/*_test.go

4) LFS dry-run and parsing reliability

  • Improved dry-run parsing and file path handling (including whitespace-heavy paths).
  • Added/updated tests around LFS parser/pointer handling.

Primary areas:

  • lfs/lfs.go
  • lfs/*_test.go

5) Recent PR feedback + CI fixes

  • Delete API clarified to explicit OID vs DID methods.
  • Fixed race/lint failures and testutils config key drift.

Key commits near tip:

  • e7849503 (delete API split + merge preservation)
  • f7455f03 (lint/race-test fixes)

Suggested Review Order (fastest to highest signal)

  1. API/behavior surface
  • client/client.go
  • client/drs/client.go
  • client/local/local_client.go
  2. Critical command paths
  • cmd/addurl/main.go
  • cmd/prepush/main.go
  • cmd/delete/main.go
  • cmd/remote/add/*
  3. Bucket scoping logic
  • gitrepo/bucket_map.go
  • gitrepo/bucket_scope.go
  4. LFS parsing robustness
  • lfs/lfs.go
  • lfs/lfs_parse_test.go
  • lfs/lfs_dryrun_test.go
  5. Then skim remaining test/doc/workflow changes

How to Test

CI-equivalent lint + tests

make lint
go test -v -race $(go list ./... | grep -v '/cmd/addurl$')


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7092ce0a8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread client/drs/client.go Outdated
Comment on lines +220 to +223
func (cl *GitDrsClient) DeleteRecord(ctx context.Context, oid string) error {
	// Note: cl.G3.DRSClient().DeleteRecord takes DID, but git-drs often passes OID
	// We should probably keep the hash-based delete logic here or move it to core
	return cl.G3.DRSClient().DeleteRecord(ctx, oid)
}

P1 Badge Delete records by DID in the DRS client wrapper

DeleteRecord forwards the provided value directly to cl.G3.DRSClient().DeleteRecord, but this call path is fed SHA256 OIDs by callers (and the comment here already notes the DID/OID mismatch). In the DRS backend, delete is DID-scoped, so deletes triggered from OID-based flows can fail with not-found and leave stale metadata behind; this affects both explicit delete operations and upsert conflict handling that tries to delete before re-registering.

Comment thread gitrepo/bucket_map.go Outdated
Comment on lines +47 to +48
	if prefix != "" {
		configs[projectBucketKey(org, project, "prefix")] = prefix
	}

P2 Badge Clear stale prefix when updating bucket mappings

When updating an existing mapping with an empty prefix (for example, overwrite without --path), this function writes only the bucket key and never clears the previously stored prefix key. As a result, GetBucketMapping continues returning the old prefix and routing still uses the outdated subpath, which is inconsistent with the new mapping input.

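One way to address this, sketched with a hypothetical reconstruction of the key scheme (`projectBucketKey` below only mimics the real helper in gitrepo/bucket_map.go):

```go
package main

import "fmt"

// projectBucketKey mirrors the per-project config key scheme described in the
// review (hypothetical reconstruction, not the real implementation).
func projectBucketKey(org, project, field string) string {
	return fmt.Sprintf("drs.bucket.%s.%s.%s", org, project, field)
}

// setBucketMapping writes the bucket key and either updates or clears the
// prefix key, so an overwrite without --path drops the stale prefix.
func setBucketMapping(configs map[string]string, org, project, bucket, prefix string) {
	configs[projectBucketKey(org, project, "bucket")] = bucket
	key := projectBucketKey(org, project, "prefix")
	if prefix != "" {
		configs[key] = prefix
	} else {
		delete(configs, key) // clear any previously stored prefix
	}
}

func main() {
	cfg := map[string]string{}
	setBucketMapping(cfg, "calypr", "proj", "cbds", "data/")
	fmt.Println(cfg[projectBucketKey("calypr", "proj", "prefix")]) // data/
	setBucketMapping(cfg, "calypr", "proj", "cbds2", "")
	_, ok := cfg[projectBucketKey("calypr", "proj", "prefix")]
	fmt.Println(ok) // false: stale prefix cleared on overwrite
}
```
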
@bwalsh
Contributor

bwalsh commented Mar 30, 2026

ADR 0003: Local pre-push metadata staging for DRS/LFS objects

Status

Proposed

Date

2026-03-30

Context

The branch feature/local-drs-server introduces a broader shift in how git-drs prepares and publishes object metadata before git lfs pre-push.

Historically, metadata and object state were primarily derived from direct remote interactions and ad-hoc discovery at push time. That approach made behavior harder to reason about when local state, cache state, and remote records diverged.

This branch introduces three connected changes:

  1. cmd/prepush now prepares object candidates locally and submits metadata to a server endpoint (/info/lfs/objects/metadata) before delegating to git lfs pre-push.
  2. cmd/addurl and precommit_cache are aligned so pre-push can consume path/OID hints already materialized under .git/drs/pre-commit/v1.
  3. Remote URL-backed objects (for example s3://...) are represented via explicit metadata/alias semantics and handled as first-class objects in transfer flows.

This design reduces ambiguity in pre-push behavior and allows controlled server-assisted metadata registration while preserving fallback behavior.

Decision

Adopt a local-first metadata staging pipeline at pre-push time, with cache-aware discovery and server submission as an explicit step.

1) Pre-push stages metadata explicitly

cmd/prepush will:

  • collect LFS object candidates from local state (cache when valid, discovery fallback when not),
  • convert candidates to the metadata payload contract,
  • POST staged metadata to /info/lfs/objects/metadata,
  • degrade gracefully for unsupported endpoints (for example 404/405/501 and selected non-JSON responses),
  • continue with git lfs pre-push after staging/fallback.

2) Cache is a shared contract across add-url, pre-commit, and pre-push

The .git/drs/pre-commit/v1 cache remains the local interchange format:

  • paths/<encoded>.json maps path -> OID,
  • oids/<sha256(oid)>.json maps OID -> paths + URL hint + update markers.

cmd/addurl updates cache entries so objects added via URL can be consumed by pre-push without requiring an intervening pre-commit rewrite.

3) Remote URL-backed objects are modeled explicitly

Objects that are authoritative in remote storage are marked using alias/metadata semantics (for example git-drs-remote-url:true) and are handled in transfer logic as remote-backed records rather than purely local upload artifacts.

Consequences

Positive

  • Pre-push behavior is more deterministic: local candidates are staged in a single server call before transfer.
  • Cache-aware operation reduces repeated work and improves consistency between add-url and pre-push.
  • Remote URL-backed objects can participate in the same workflow without forcing an upload-only model.

Negative / Tradeoffs

  • Pre-push now depends on a metadata endpoint contract; endpoint compatibility must be maintained.
  • More logic exists in hook-time code paths, increasing the need for focused unit tests around fallback and error handling.
  • Cache format and normalization rules become stricter shared contracts between commands.

Alternatives considered

  1. Keep pre-push fully remote-driven with no metadata staging call.
    • Rejected: preserves ambiguity and fragmented behavior when local and remote state differ.
  2. Use only git-lfs native flows and avoid git-drs metadata staging.
    • Rejected: does not satisfy project-specific DRS/index metadata synchronization needs.
  3. Stage metadata only during add-url and skip pre-push staging.
    • Rejected: misses updates from non-add-url paths and pushes created by ordinary Git/LFS workflows.

Implementation notes

Key implementation areas introduced or updated in this branch:

  • cmd/prepush/main.go (candidate collection, metadata submit, fallback handling)
  • cmd/addurl/service.go and cmd/addurl/cache.go (cache alignment for add-url)
  • precommit_cache/helpers.go (cache contract and lookup helpers)
  • cmd/transfer/main.go and related cloud/client flows for remote URL-backed object handling

Known follow-up risks

From current review, follow-up hardening is recommended:

  1. Normalize cached OIDs consistently (strip optional sha256: prefix before lookup usage).
  2. Ensure temporary stdin buffering files are always cleaned up on early errors.
  3. Add dedicated unit tests for metadata staging responses (2xx, degrade statuses, and hard-fail statuses).

@bwalsh
Contributor

bwalsh commented Mar 30, 2026

PR Review: feature/local-drs-server

Findings (ordered by severity)

P1 - Cache OID is consumed without normalizing sha256: prefix

  • Evidence: Cached OID is copied directly into LfsFileInfo.Oid (Oid: entry.LFSOID) and later used as lookup key.
  • Location: cmd/prepush/main.go:464-469
  • Risk: Repositories with existing cache entries containing sha256:<hex> can miss local DRS object resolution and trigger fallback behavior unexpectedly.
  • Suggested fix: Normalize before assignment (trim whitespace and optional sha256: prefix), then keep only raw hex in LfsFileInfo.Oid.

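The suggested normalization could be as small as this (a sketch, not the shipped code):

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeOID trims whitespace and an optional "sha256:" prefix so only the
// raw hex digest is kept as the lookup key in LfsFileInfo.Oid.
func normalizeOID(raw string) string {
	oid := strings.TrimSpace(raw)
	return strings.TrimPrefix(oid, "sha256:")
}

func main() {
	fmt.Println(normalizeOID(" sha256:deadbeef\n")) // deadbeef
	fmt.Println(normalizeOID("deadbeef"))           // deadbeef (already raw)
}
```
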
P2 - bufferStdin leaks temp file on early error paths

  • Evidence: If io.Copy or tmp.Seek fails, function returns without closing/removing tmp.
  • Location: cmd/prepush/main.go:333-346
  • Risk: Repeated hook failures can leave orphaned temp files and leaked descriptors.
  • Suggested fix: Add cleanup for error paths in bufferStdin (close + remove temp file before returning error).

P2 - Missing direct tests for metadata staging request/response behavior

  • Evidence: New network flow in submitPendingLFSMeta has no dedicated unit test coverage for endpoint URL construction, auth header wiring, non-2xx handling, and graceful-degrade branches.
  • Location: cmd/prepush/main.go:244-318, cmd/prepush/main_test.go
  • Risk: Regressions in metadata staging may only surface in integration/production hooks.
  • Suggested fix: Add table-driven tests with httptest.Server covering 2xx, 404/405/501 degrade, HTML response degrade, and hard-fail statuses (e.g., 401/500).

Verification Performed

  • Ran full suite: go test ./... (passes on this branch in the current workspace).

Notes

  • No current P0 compile/test blockers were found in this pass.
  • Review focused on changed branch content and highest-risk runtime paths (prepush, metadata staging, cache integration).

Contributor

@bwalsh bwalsh left a comment


See comments


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7455f03bd


Comment on lines +233 to +236
	uploadReq := common.FileUploadRequestObject{
		SourcePath: filePath,
		ObjectKey:  filepath.Base(filePath),
		GUID:       obj.Id,
	}

P1 Badge Skip uploading pointer payloads in local RegisterFile

This path always uploads filePath directly when a bucket is configured, but BatchSyncForPush calls RegisterFile for LFS pointer entries too. In local-server flows that use add-url sentinel pointers (or any pointer-only working-tree file), this uploads the pointer text instead of the real object bytes, so the stored payload can be incorrect even though registration succeeds. Add pointer/sentinel source resolution (like the DRS client’s resolveUploadSourcePath) before building the upload request.

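A minimal pointer check that such source resolution could start from (a sketch only; the real fix would reuse the DRS client's resolveUploadSourcePath logic rather than this standalone helper):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// looksLikeLFSPointer reports whether the content matches the git-lfs pointer
// format (version line followed by an "oid sha256:" line). Uploads should
// resolve such files to the real object bytes instead of the pointer text.
func looksLikeLFSPointer(content []byte) bool {
	s := string(content)
	return strings.HasPrefix(s, "version https://git-lfs.github.com/spec/v1") &&
		strings.Contains(s, "\noid sha256:")
}

func main() {
	pointer := "version https://git-lfs.github.com/spec/v1\noid sha256:abc\nsize 10\n"
	dir, _ := os.MkdirTemp("", "demo")
	defer os.RemoveAll(dir)
	p := filepath.Join(dir, "f.bin")
	os.WriteFile(p, []byte(pointer), 0o644)
	b, _ := os.ReadFile(p)
	fmt.Println(looksLikeLFSPointer(b), looksLikeLFSPointer([]byte("real bytes"))) // true false
}
```
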
Comment thread cmd/addurl/service.go
Comment on lines 160 to 161
if input.sha256 != "" {
return input.sha256, nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Normalize user SHA input before returning it as OID

When --sha256 is provided, this returns the raw flag value without normalization/validation. If users pass a common prefixed value like sha256:<hex> (or any malformed token), the pointer writer later prefixes again (oid sha256:%s), producing invalid LFS OIDs (e.g. sha256:sha256:<hex>) that downstream LFS parsing will not treat as valid object hashes. Normalize to plain 64-hex and reject invalid input before returning.

@matthewpeterkort matthewpeterkort requested a review from bwalsh April 8, 2026 16:24
@matthewpeterkort
Contributor Author

I believe I have addressed the feedback.
