- Project: Project Coherent Storage
- Architecture cycle: 2026-Q2
- Architecture focus:
  - Auto-scaling AI/HPC storage architecture featuring an accelerator-centric Coherent Memory-Mesh
  - Custom Max-IO Grid-Engine with ACID-compliant cache transactions for superscalar architectures
  - Custom UA-Link pod-scale systems design with host-based CXL memory pools to clear the 'memory wall'
  - Fully automated deployment with Ansible workflows, SLURM workload management, and netboot ramdisks
  - Dev/Lab, Stage/LT, and Prod environment support with full CI/CD test coverage, load-test profiles, and ITIL change controls
  - Gate-based workflows with 'failure-semantics', plus SLO and SLA definitions for observability and monitoring
  - Network environment scaling from 10-25Gb/s to 400-800Gb/s, with port-based tuning profiles for RDMA/RoCEv2
  - NVMe-oF with DPU hardware-based protocol offloads for OpenZFS (storage tiering + ACLs + DoD-compliant encryption)
  - LLM prompt-cache acceleration supported via disaggregated heterogeneous GP-GPU compute (AMD, NVIDIA, NPU, FPGA)
- Generated: 2026-05-18
- Automation: Tracked workflows, machine profiles, neteng scopes, etc. are located in the 'Infra-Stage4-LLVM-NoGNU' repository.
- Status: Proposed / Review
This package refreshes the ADR set using the expanded RAG corpus and the project directives. It keeps the core invariant: inference actors connect to the Coherence-CE Memory Mesh and never bind directly to OpenZFS, DPU, RoCEv2, NVMe-oF, CXL, UA-Link, VLANs, RDMA memory handles, or physical storage internals.
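The core invariant can be expressed as a small binding check. The following is a minimal sketch, assuming a hypothetical descriptor in which each inference actor declares the layer it connects to. The names `ALLOWED_LAYER`, `FORBIDDEN_LAYERS`, and `check_binding`, and the layer identifier strings, are illustrative and are not taken from the Coherence-CE codebase.

```python
from typing import Optional

# Illustrative only: layer names mirror the invariant stated above; the
# constants and function are hypothetical, not a Coherence-CE API.
ALLOWED_LAYER = "coherence-ce-memory-mesh"

FORBIDDEN_LAYERS = frozenset({
    "openzfs", "dpu", "rocev2", "nvme-of", "cxl", "ua-link",
    "vlan", "rdma-memory-handle", "physical-storage",
})


def check_binding(actor: str, target_layer: str) -> Optional[str]:
    """Return None if the binding is allowed, else a human-readable reason."""
    layer = target_layer.strip().lower()
    if layer == ALLOWED_LAYER:
        return None
    if layer in FORBIDDEN_LAYERS:
        return f"{actor}: direct binding to '{layer}' violates the core invariant"
    return f"{actor}: unknown binding target '{layer}'"


# An actor that binds to the Memory Mesh passes; one that reaches for
# OpenZFS directly is rejected with a reason.
print(check_binding("vllm-worker-0", "coherence-ce-memory-mesh"))
print(check_binding("vllm-worker-1", "OpenZFS"))
```

A check of this kind would plausibly sit in a rollout gate or configuration linter rather than on the data path, so that a violation is caught before deployment.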
The architecture emphasizes:
- Coherence-CE namespace modalities with explicit Unified Namespace and Dimensional Indexed Namespace workflows for scalable cache locality.
- UA-Link enabled pod-scale systems as a scale-up accelerator domain inside pod/rack boundaries.
- Network architecture across scale-up and scale-out planes, separating UA-Link accelerator fabrics, Ethernet/RDMA scale-out, storage/NVMe-oF fabrics, management, telemetry, and timing.
- CXL memory pools as governed T1/T1.5 memory capacity for warm KV/prefix state, metadata, vector heads, and future shared-memory research paths.
- RDMA/RoCEv2 performance tuning with explicit PFC/ECN/DCQCN, traffic-class, rail, telemetry, and failure semantics.
- DPU/SmartNIC storage offload as a hard requirement for NVMe-oF/RDMA storage-network paths.
- General-purpose GPU and heterogeneous accelerator scheduling, covering vendor capability profiles and admission-control policy.
- Reference-architecture-focused development: baseline-scoped architecture elements that are suitable for layering and adaptation and that ease feature adoption as the industry rapidly evolves; fully open source across the entire application stack.
The source pass extracted text from 363 PDFs in the RAG-DATA/ corpus into a local processing cache.
- Text extraction OK: 360 PDFs
- Source map: review-artifacts/rag-extraction-and-source-map.md
Important sources include the UA-Link white paper, UniFabriX UA-Link material, OCP Open Cluster inference/training fabric reference architectures, OCP MRC, Arista/Broadcom lossless Ethernet/RoCE material, AMD Pensando/Pollara cluster and product collateral, Intel Gaudi 3 cluster design, CXL/KV/GPU research, and prior Marvell/XConn/CXL/DPU materials.
| Path | Purpose |
|---|---|
| reports/project-coherent-storage_architecture-report.md | Main architecture report for UA-Link pod scale, CXL memory pools, RDMA/RoCEv2, and heterogeneous GP-GPU compute. |
| reports/project-coherent-storage_engineering-deep-dive.md | Top-down engineering deep-dive from OpenAI/user layer through global/regional/datacenter load-balancer meshes and intra-datacenter storage layers. |
| reports/project-coherent-storage_overview__executive-overview.md | Executive overview for business value, hard requirements, namespace posture, and residual risks. |
| reports/project-coherent-storage_overview__director-overview.md | Director overview for procurement, lifecycle, deployment risk, and operational readiness. |
| reports/project-coherent-storage_overview__engineering-overview.md | Engineering/ARB overview for data paths, CXL roles, namespace rules, and validation checklist. |
| reports/project-coherent-storage_s3-object-rest-api-translator-design.md | Translator design report for S3/Object REST access and explicit prefix-cache namespace modalities. |
| reports/project-coherent-storage_coherence-ce-object-chunking-and-lfs-gateway-design.md | Design report for Coherence-CE object chunking, manifest semantics, and Git LFS gateway migration. |
| api/coherence-ce-vllm-adapter.openapi.yaml | OpenAPI contract for Coherence-CE vLLM adapter operations. |
| api/s3-object-rest-translator.openapi.yaml | OpenAPI contract for S3/Object REST translator routes, including Unified and Dimensional Indexed Namespace routes. |
| api/coherence-ce-object-chunking-lfs-gateway.openapi.yaml | OpenAPI contract for Coherence-native object chunking and Git LFS gateway facade routes. |
| adr/diagrams/*.puml, *.png, *.svg | Per-ADR PlantUML source and rendered PNG/SVG assets. |
| diagrams/*.puml, *.png, *.svg | Report-level PlantUML source and rendered PNG/SVG assets. |
| review-artifacts/rag-extraction-and-source-map.md and JSON peer | Extraction evidence and source map. |
| review-artifacts/ietf-icnrg-chunking-source-map.md and JSON peer | Source map for CCNx chunking, FLIC, RFC 8569/8609, and Git LFS API references. |
| docs/git-lfs-policy.md | Repository Git LFS lock-verification, normalized .gitattributes, pre-push hook, test-server, and migration policy. |
| ADR File | Document Function |
|---|---|
| ADR-001_Inference_Storage_Principles_and_SLOs.md | Defines inference-first storage principles, latency SLOs, tier boundaries, and workload classes that govern all later ADRs. |
| ADR-002_Hot_KV_and_Prefix_Cache_Data_Plane.md | Defines the hot KV/prefix-cache data plane and keeps inference actors behind the Coherence-CE Memory Mesh. |
| ADR-003_Model_Weight_Object_and_Corpus_Data_Tiers.md | Defines model-weight, adapter, tokenizer, object, corpus, and artifact tiers for reproducible inference data placement. |
| ADR-004_RDMA_Fabric_and_GPU_Direct_Data_Paths.md | Defines RDMA, RoCEv2, GPU-direct, and scale-out data-path rules for cross-node inference and storage movement. |
| ADR-005_DPU_and_SmartNIC_Offload_Boundaries.md | Defines mandatory DPU/SmartNIC offload boundaries for NVMe-oF, RDMA mediation, isolation, telemetry, and degraded host fallback. |
| ADR-006_OpenZFS_NVMe_oF_and_Media_Layout.md | Defines OpenZFS, NVMe-oF, mirrored NAND, media layout, and durable block-substrate rules. |
| ADR-007_Inference_Scheduler_Locality_and_Admission_Control.md | Defines scheduler admission using model, KV, fabric, CXL, DPU, rail, and locality telemetry. |
| ADR-008_RAG_Vector_Index_and_Corpus_Service.md | Defines immutable RAG corpus, embedding, vector-index, retrieval-cache, and corpus-service architecture. |
| ADR-009_Observability_Benchmarking_and_Rollout_Gates.md | Defines observability, benchmark, failure-drill, and rollout gates for inference, fabric, storage, CXL, and scheduler claims. |
| ADR-010_Coherence_CE_Write_Policy_to_OpenZFS.md | Defines Coherence-CE write-through, write-back, write-around, and write-behind policy to OpenZFS by durability class. |
| ADR-011_KV_Durability_Classes.md | Defines KV-D0 through KV-D5 durability classes used by Coherence-CE, OpenZFS write policy, failure recovery, and scheduler admission. |
| ADR-012_Coherence_CE_vLLM_Adapter_API_Contract.md | Defines the Coherence-native and OpenAI-compatible API contract exposed to vLLM adapters without leaking lower-layer storage or fabric. |
| ADR-013_Failure_Semantics_and_Fencing.md | Defines failure semantics, fencing, recovery, drain behavior, and degraded-mode rules across compute, fabric, DPU, CXL, and storage. |
| ADR-014_Coherence_Metrics_Scheduler_Admission.md | Defines how Coherence-CE metrics roll up into scheduler GREEN, AMBER, RED, and DRAIN admission states. |
| ADR-015_CXL_Memory_Tiering_and_OpenZFS_Interaction.md | Defines CXL T1/T1.5 memory tiering, memory-pool governance, and safe OpenZFS-adjacent CXL roles. |
| ADR-016_Roadmap_Evidence_and_Public_Claim_Guardrails.md | Defines evidence grades and public-claim guardrails for vendor roadmap, partnership, and integration statements. |
| ADR-017_Research_Metadata_and_Arxiv_Publication_Workflow.md | Defines research metadata, arXiv API/bulk-data, Markdown, LaTeX, BibTeX, and publication workflow requirements. |
| ADR-018_UALink_Pod_Scale_Fabric_and_Compute_Domains.md | Defines UA-Link pod-scale accelerator fabric domains and their scheduler-visible but actor-hidden compute locality semantics. |
| ADR-019_Pod_Scale_Network_Architecture_and_RDMA_RoCEv2_Tuning.md | Defines pod-scale network planes and RDMA/RoCEv2 tuning gates for traffic classes, PFC, ECN/DCQCN, rails, and telemetry. |
| ADR-020_CXL_Memory_Pools_for_UALink_Pods.md | Defines CXL memory pools inside UA-Link pods as governed Coherence-owned warm capacity with ownership, latency, and failure gates. |
| ADR-021_Heterogeneous_GP_GPU_Compute_and_Scheduler_Governance.md | Defines heterogeneous GP-GPU and accelerator capability profiles for scheduler governance across vendors and fabrics. |
| ADR-022_S3_Object_to_REST_API_Protocol_Mapping_Translator.md | Defines the S3/Object-to-REST translator and its object, KV, vector, and prefix-cache REST contract. |
| ADR-023_Coherence_CE_Namespace_Modalities.md | Defines Unified Namespace and Dimensional Indexed Namespace workflows, API route semantics, and locality-governance rules. |
| ADR-024_System_Level_Benchmarking_Suite_Definitions.md | Defines system-level benchmark suite taxonomy across component, service, test-intent, SLURM execution, cross-platform tooling, and evidence gates. |
| ADR-025_Broad_Systems_E2E_Testing_Workflows_and_Tooling.md | Defines broad-systems E2E testing workflows, scheduler-adapter execution, failure-mode tests, evidence bundles, and CI/CD gates. |
| ADR-026_Coherence_CE_Object_Chunking_and_Manifest_Semantics.md | Defines Coherence-CE internal object chunking, manifest commit semantics, S3 multipart mapping, Git LFS facade behavior, and RAG byte-object boundaries. |
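The ADR files listed above follow a consistent `ADR-NNN_Title_Words.md` naming pattern, so the index can be checked mechanically. The following is a minimal sketch, assuming that pattern holds for every ADR file; `parse_adr_filename` and `missing_numbers` are hypothetical helper names and are not part of the package tooling.

```python
import re
from typing import List, Tuple

# Assumed pattern: 'ADR-', a three-digit number, '_', a title, '.md'.
ADR_NAME = re.compile(r"^ADR-(\d{3})_(.+)\.md$")


def parse_adr_filename(name: str) -> Tuple[int, str]:
    """Split an ADR filename into (number, title with spaces)."""
    match = ADR_NAME.match(name)
    if match is None:
        raise ValueError(f"not an ADR filename: {name!r}")
    return int(match.group(1)), match.group(2).replace("_", " ")


def missing_numbers(names: List[str]) -> List[int]:
    """Return ADR numbers absent from the contiguous range 1..max."""
    seen = {parse_adr_filename(n)[0] for n in names}
    if not seen:
        return []
    return [n for n in range(1, max(seen) + 1) if n not in seen]


# Deliberately incomplete sample: ADR-003 is left out to show a detected gap.
sample = [
    "ADR-001_Inference_Storage_Principles_and_SLOs.md",
    "ADR-002_Hot_KV_and_Prefix_Cache_Data_Plane.md",
    "ADR-004_RDMA_Fabric_and_GPU_Direct_Data_Paths.md",
]
print(missing_numbers(sample))  # [3]
```

Run against the real adr/ directory listing, a check like this would flag a skipped or misnamed ADR before the index table drifts out of sync with the files.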
The design composes the system from inference SLOs down through hot-state placement, namespace modality, data tiers, fabrics, offload, durable media, scheduler admission, failure semantics, CXL/UA-Link pod resources, heterogeneous accelerator governance, S3/Object REST translation, object chunking and manifest semantics, Git LFS gateway behavior, benchmark evidence, broad-systems E2E testing, and research-publication workflow. Each ADR embeds its PNG diagram and has a PlantUML source file plus PNG/SVG renders under adr/diagrams/.
Claims about UA-Link, CXL, RoCEv2, DPU, and heterogeneous GPU capabilities are classified using the following evidence grades:
- Direct: source explicitly states the relationship or capability.
- Adjacent: relevant to architecture but not proof of a named integration.
- Negative-control: retained to prevent overclaiming.
- Not found in current sweep: searched but no direct source-backed mention found.
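The four evidence grades above can be modeled as an enumeration with a simple claim guard. This is a minimal sketch: the guard policy (only a Direct source supports a named-integration claim) is an assumption read off the grade definitions, and ADR-016 remains the authoritative statement of the guardrails. `EvidenceGrade` and `may_state_named_integration` are hypothetical names.

```python
from enum import Enum


class EvidenceGrade(Enum):
    """Evidence grades as listed above; values are the labels used in the text."""
    DIRECT = "Direct"
    ADJACENT = "Adjacent"
    NEGATIVE_CONTROL = "Negative-control"
    NOT_FOUND = "Not found in current sweep"


def may_state_named_integration(grade: EvidenceGrade) -> bool:
    """Assumed policy: only a Direct source supports a named-integration claim."""
    return grade is EvidenceGrade.DIRECT


# Look up a grade by its label, then apply the guard.
grade = EvidenceGrade("Adjacent")
print(grade.name, may_state_named_integration(grade))
```

Keeping the grade labels as enum values means a source map entry that carries a misspelled or unknown grade fails loudly at lookup rather than silently passing the guard.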



























