The client then uploads directly to object storage using returned URLs.
Architecture
Git Client
│
│ HTTP
▼
LFS Batch API Server
│
│ resolves auth binding
▼
DRS Control Plane
│
▼
Object Store
Supported Use Cases
| Use Case | Supported |
| --- | --- |
| Multi-user collaboration | ✅ |
| Hosted Git platforms | ✅ |
| Enterprise SSO / OAuth | ✅ |
| Centralized policy enforcement | ✅ |
| Auditable batch requests | ✅ |
| Transparent client UX | ✅ |
| Repo-scoped storage policies | ✅ |
| SaaS deployment model | ✅ |
Characteristics
Fully compatible with stock Git LFS clients
No client customization required
Centralized policy and credential resolution
Aligns with DRS access-time credential minting
Enables fine-grained repo-level isolation
Comparison
| Dimension | Custom Transfer Agent | Batch API Server |
| --- | --- | --- |
| Requires LFS server | ❌ | ✅ |
| Requires client modification | ✅ | ❌ |
| Works with GitHub/GitLab | ❌ | ✅ |
| Centralized authorization | ❌ | ✅ |
| Repo-scoped isolation | Weak | Strong |
| Multi-tenant SaaS | ❌ | ✅ |
| Air-gapped research | ✅ | Possible |
| Operational simplicity | Client-heavy | Server-heavy |
| Auditable batch control | Limited | Strong |
| GA4GH alignment | Medium | High |
Add URL / External Object Registration Semantics
Overview
The “Add URL” workflow allows registration of a pre-existing external object
without re-uploading it.
Supports:
Large datasets already stored in cloud buckets
Cross-project reuse
Federated research workflows
Definitions
No User Upload
The client does not upload object bytes.
No Transfer
No object bytes are transferred at all, by the client or by the server, because the server already knows the object's sha256.
Mode A — URL + sha256 + size
sequenceDiagram
participant U as User
participant C as Git Client
participant S as LFS/DRS Server
participant O as Object Store
U->>C: git lfs add-url <url> --sha256 --size
C->>O: HEAD <url>
C->>S: Verify sha256 exists
S-->>C: Exists
C->>C: Write pointer file
Transfer semantics:
No user upload
No transfer (if already indexed)
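The final step of Mode A, "Write pointer file", can be sketched as follows. The pointer layout (a `version` line, then `oid sha256:<hex>` and `size <bytes>`) follows the published Git LFS pointer format. The function name `make_pointer` and its validation rules are assumptions made for illustration, not part of the proposed `add-url` command.

```python
import re

# Version line used by Git LFS pointer files.
POINTER_VERSION = "https://git-lfs.github.com/spec/v1"


def make_pointer(sha256_hex: str, size: int) -> str:
    """Return pointer-file text for an object whose sha256 and size are known.

    Hypothetical helper: no object bytes are read or transferred here.
    """
    if not re.fullmatch(r"[0-9a-f]{64}", sha256_hex):
        raise ValueError("oid must be 64 lowercase hex characters")
    if size < 0:
        raise ValueError("size must be non-negative")
    # Each line is terminated by a single newline; `version` comes first.
    return (
        f"version {POINTER_VERSION}\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size}\n"
    )
```

Because the pointer is derived only from the declared checksum and size, this path matches the "No user upload" and "No transfer" semantics listed above.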
Mode B — URL Only
sequenceDiagram
participant U as User
participant C as Git Client
participant S as LFS/DRS Server
participant O as Object Store
U->>C: git lfs add-url <url>
C->>O: HEAD <url>
C->>S: Resolve sha256
alt Known
S-->>C: sha256 + size
else Unknown
S->>O: GET object (ingest)
S-->>C: sha256 computed
end
C->>C: Write pointer file
Transfer semantics:
No user upload
Server-side transfer may occur
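The Known/Unknown branch of Mode B can be sketched on the server side roughly as below. This is a minimal illustration under stated assumptions: the in-memory `index` dictionary, the `open_source` callable, and the function name `resolve_sha256` are all hypothetical stand-ins for whatever index and object-store client the real system uses.

```python
import hashlib
import io
from typing import BinaryIO, Callable, Dict, Tuple

# Hypothetical index: URL -> (sha256 hex digest, size in bytes).
Index = Dict[str, Tuple[str, int]]


def resolve_sha256(
    url: str,
    index: Index,
    open_source: Callable[[str], BinaryIO],
    chunk_size: int = 1 << 20,
) -> Tuple[str, int, bool]:
    """Return (sha256, size, server_transfer_occurred) for a URL."""
    known = index.get(url)
    if known is not None:
        # "Known" branch: nothing is read from the object store.
        return known[0], known[1], False

    # "Unknown" branch: the server ingests the object and hashes it in chunks.
    digest = hashlib.sha256()
    size = 0
    with open_source(url) as stream:
        while chunk := stream.read(chunk_size):
            digest.update(chunk)
            size += len(chunk)
    index[url] = (digest.hexdigest(), size)
    return digest.hexdigest(), size, True
```

The third return value makes the transfer semantics explicit: the client never uploads, but the server may read the object once when the URL has not been indexed yet.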
Error Cases
| Condition | Error |
| --- | --- |
| Size mismatch | SIZE_MISMATCH |
| Checksum mismatch | CHECKSUM_MISMATCH |
| Unstable URL | UNSTABLE_OBJECT_SOURCE |
| Object modified after registration | IMMUTABILITY_VIOLATION |
| Source not accessible | SOURCE_NOT_ACCESSIBLE |
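One way to map these conditions onto checks is sketched below. The error codes come from the table above; everything else is an assumption for illustration, including the `ObjectFacts` type, the use of an ETag as the change signal, and the order in which checks run. `UNSTABLE_OBJECT_SOURCE` is deliberately not modeled because the ADR does not define what makes a URL unstable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectFacts:
    """Hypothetical snapshot of what is known about an external object."""
    size: int
    sha256: str
    etag: str


def check_registration(
    declared: ObjectFacts, observed: Optional[ObjectFacts]
) -> Optional[str]:
    """Compare user-declared facts with what the server observed.

    `observed` is None when the source could not be reached.
    Returns an error code from the table, or None when registration may proceed.
    """
    if observed is None:
        return "SOURCE_NOT_ACCESSIBLE"
    if observed.size != declared.size:
        return "SIZE_MISMATCH"
    if observed.sha256 != declared.sha256:
        return "CHECKSUM_MISMATCH"
    return None


def check_immutability(
    registered: ObjectFacts, current: ObjectFacts
) -> Optional[str]:
    """Detect a change after registration (assumes the ETag changes with content)."""
    if current.etag != registered.etag or current.size != registered.size:
        return "IMMUTABILITY_VIOLATION"
    return None
```

Checking size before checksum is a cost choice in this sketch: a size comparison needs only metadata, while a checksum comparison may require reading the object.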
Architectural Comparison
| Dimension | Custom Agent | Batch API Server |
| --- | --- | --- |
| Hosted Git Compatible | No | Yes |
| Centralized Policy | No | Yes |
| Multi-tenant SaaS | No | Yes |
| Auditability | No | Yes |
| Safe Federated Add-URL | No | Yes |
| Immutability Enforcement | Weak | Strong |
Critical Gap: What Is NOT Supported If Only Custom Transfer Agent Is Used
If we rely solely on a custom transfer agent:
Hosted Git integration is impossible
GitHub and GitLab do not execute custom agents.
Users cannot push from standard environments.
Multi-user standardization is fragile
Every collaborator must install and configure the agent.
Version drift causes inconsistencies.
No centralized policy enforcement
Bucket and authorization logic lives on client machines.
Hard to enforce repo-specific storage controls.
No transparent federation
External collaborators cannot push without installing software.
Violates "it just works with Git" principle.
Reduced security posture
More credential resolution logic distributed to clients.
Harder to audit access centrally.
No web or CI integration
CI/CD systems require custom agent install.
Web-based file operations cannot use it.
When Custom Transfer Agent Is Appropriate
Research-only deployments
Internal platform experiments
Developer tooling
Transitional architecture
Air-gapped or highly controlled environments
When Batch API Server Is Required
Production multi-tenant environments
Public Git hosting compatibility
Enterprise SSO integration
GA4GH federated ecosystems
DRS-backed storage federation at scale
Discussion
For a federated, multi-tenant, GA4GH-aligned system:
A Git LFS Batch API server is required.
The custom transfer agent may remain as a complementary tool but cannot be the sole integration surface.
ADR: Git LFS Integration Strategy — Custom Transfer Agent vs Batch API Server
Status
Discussion
Background
We are integrating Git LFS with a DRS-backed storage system that supports:
Git LFS has two primary integration surfaces: a client-side custom transfer agent and a server-side Batch API server.
These are fundamentally different architectural patterns and support different use cases.
Both can interact with DRS-backed storage, but they differ significantly in:
Decision Drivers
This ADR documents the architectural differences, supported use cases, trade-offs,
and the “Add URL” (external object registration) workflow.
Architecture Overview
Custom Transfer Agent
flowchart LR
A[Git Client] --> B[Custom Transfer Agent]
B --> C[Object Store]
B --> D[DRS APIs]
Option 1: Git LFS Custom Transfer Agent (Client-Side)
Description
A custom transfer agent is configured in the Git client via:
Instead of contacting an LFS server, the Git LFS client:
The agent is responsible for:
No LFS Batch API server is involved.
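For orientation, the client-to-agent exchange can be sketched as below. Git LFS talks to a custom transfer agent over line-delimited JSON on stdin/stdout, using `init`, `upload`/`download`, and `terminate` events, and the agent answers each transfer with a `complete` message. The sketch handles uploads only and omits `progress` messages. The `upload` callable, which would hold the DRS and object-store logic, and the function names are hypothetical.

```python
import json
from typing import Callable, Optional, TextIO


def handle_event(
    message: dict, upload: Callable[[str, str], None]
) -> Optional[dict]:
    """Return the agent's reply to one protocol message, or None for no reply."""
    event = message.get("event")
    if event == "init":
        # An empty JSON object tells the client that initialisation succeeded.
        return {}
    if event == "upload":
        oid = message["oid"]
        try:
            # Hypothetical callable: pushes the local file to the object store.
            upload(oid, message["path"])
        except Exception as exc:
            return {
                "event": "complete",
                "oid": oid,
                "error": {"code": 1, "message": str(exc)},
            }
        return {"event": "complete", "oid": oid}
    if event == "download":
        # Downloads are out of scope for this sketch; report a per-object error.
        return {
            "event": "complete",
            "oid": message["oid"],
            "error": {"code": 2, "message": "download not implemented"},
        }
    # `terminate` (and anything unrecognised) gets no reply.
    return None


def run(stdin: TextIO, stdout: TextIO, upload: Callable[[str, str], None]) -> None:
    """Read one JSON message per line and write one JSON reply per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("event") == "terminate":
            break
        reply = handle_event(message, upload)
        if reply is not None:
            stdout.write(json.dumps(reply) + "\n")
            stdout.flush()
```

A real agent would call `run(sys.stdin, sys.stdout, ...)` from its entry point. Every collaborator's machine has to carry this process and its credential logic, which is the root of the limitations discussed for this option.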
Architecture
Supported Use Cases
Characteristics
Limitations
Not Supported or Problematic
Option 2: LFS Batch API Server (Server-Side)
Description
A standard Git LFS server implements:
The client sends:
    {
      "operation": "upload",
      "objects": [
        { "oid": "...", "size": 123 }
      ]
    }

The server responds with per-object actions:

    {
      "objects": [
        {
          "oid": "...",
          "actions": {
            "upload": {
              "href": "https://signed-url",
              "header": { ... }
            }
          }
        }
      ]
    }

The client then uploads directly to object storage using returned URLs.
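The request/response pair above can be produced by a handler along these lines. This is a sketch under assumptions, not the project's implementation: `sign_upload_url` is a hypothetical callable standing in for DRS access-time credential minting, and the `size` and `transfer` fields are included because the public Git LFS Batch API returns them, even though the abbreviated response above omits them.

```python
from typing import Callable, Dict, List


def build_batch_response(
    request: dict, sign_upload_url: Callable[[str, int], str]
) -> dict:
    """Build a Batch API response for an upload request."""
    if request.get("operation") != "upload":
        raise ValueError("this sketch only handles upload batches")

    objects: List[Dict] = []
    for obj in request["objects"]:
        oid, size = obj["oid"], obj["size"]
        objects.append(
            {
                "oid": oid,
                "size": size,
                "actions": {
                    "upload": {
                        # Hypothetical signer: returns a pre-signed object-store URL.
                        "href": sign_upload_url(oid, size),
                        "header": {},
                    }
                },
            }
        )
    # "basic" is the default transfer adapter understood by stock clients.
    return {"transfer": "basic", "objects": objects}
```

Keeping URL signing behind a single server-side callable is what lets policy and credential resolution stay centralized while the client remains a stock Git LFS client.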