feat: archive build/deploy logs to MinIO for post-eviction retrieval #119
Merged
vigneshrajsb merged 7 commits into main on Mar 3, 2026
Adds a pino formatters.level option so logs include string severity labels (e.g. "level":"info") rather than numeric codes (e.g. "level":30). This fixes log severity mapping in Groundcover. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
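A minimal sketch of the level formatter described in this commit, assuming the logger is created with `pino(loggerOptions)`. The names `levelAsLabel` and `loggerOptions` are illustrative and not taken from the PR; pino itself is not imported so the snippet stays self-contained.

```typescript
// pino's `formatters.level` hook receives the level label and its numeric
// value, and returns the object that is merged into each log line.
type LevelFormatter = (label: string, number: number) => Record<string, unknown>;

// Emit the string label ("info") instead of the default numeric code (30).
const levelAsLabel: LevelFormatter = (label) => ({ level: label });

// Options object in the shape pino accepts (hypothetical name).
// Usage would be: const logger = pino(loggerOptions);
const loggerOptions = {
  formatters: { level: levelAsLabel },
};
```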
Build and deploy job logs are permanently lost once k8s Job pods are
evicted or TTL-expired (~24h). This adds MinIO as an optional in-cluster
object store to archive logs at completion time, serving them back to the
UI even after the live pods are gone.
## New files
- src/server/lib/objectStore/s3Client.ts
MinIO client singleton configured via MINIO_* env vars
- src/server/services/logArchival.ts
LogArchivalService with archiveLogs, getArchivedLogs, getArchivedMetadata,
listArchivedJobs, ensureBucket, configureRetention
- src/server/services/types/logArchival.ts
ArchivedJobMetadata interface
## Modified files
- src/shared/config.ts / next.config.js
Export MINIO_ENDPOINT, MINIO_PORT, MINIO_ACCESS_KEY, MINIO_SECRET_KEY,
MINIO_BUCKET, MINIO_USE_SSL (all with safe defaults)
- src/server/services/types/globalConfig.ts
Add logArchival?: { enabled: boolean; retentionDays: number } to GlobalConfig
- src/server/services/types/logStreaming.ts
Add 'Archived' to status union; add archivedLogs?: string field
- src/server/lib/nativeBuild/engines.ts
After waitForJobAndGetLogs(), archive logs when logArchival.enabled=true
Both success and error paths are covered
- src/server/lib/nativeHelm/helm.ts
Same pattern for native Helm deploy jobs
- src/server/lib/kubernetes/getNativeBuildJobs.ts
Merge archived build jobs (not present in live k8s) into the listing
Add source?: 'live' | 'archived' field to BuildJobInfo
- src/server/lib/kubernetes/getDeploymentJobs.ts
Same for deploy jobs / DeploymentJobInfo
- src/server/services/logStreaming.ts
When k8s returns NotFound, attempt archived log lookup before returning
NotFound. Returns status='Archived' with archivedLogs when found.
- helm/web-app/Chart.yaml + helm/environments/local/lifecycle.yaml
Add minio subchart dependency (disabled by default in local values)
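The logStreaming fallback listed above can be sketched as follows. This is a simplified illustration, not the PR's implementation: `resolveMissingPod`, `MissingPodResponse` and the injected `getArchivedLogs` callback are hypothetical names, and the real service returns a richer response type.

```typescript
// Hypothetical lookup: resolves to the archived log text, or null when no
// archive exists for the job.
type ArchivedLookup = () => Promise<string | null>;

interface MissingPodResponse {
  status: 'Archived' | 'NotFound';
  archivedLogs?: string;
}

// Called once k8s has reported NotFound: try the archive before giving up.
async function resolveMissingPod(
  archivalEnabled: boolean,
  getArchivedLogs: ArchivedLookup
): Promise<MissingPodResponse> {
  if (!archivalEnabled) return { status: 'NotFound' };
  try {
    const logs = await getArchivedLogs();
    if (logs !== null) return { status: 'Archived', archivedLogs: logs };
  } catch {
    // Assumption: a failed lookup is treated the same as "no archive".
  }
  return { status: 'NotFound' };
}
```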
## Storage schema
lifecycle-logs/
  {namespace}/{jobType}/{serviceName}/{jobName}/
    logs.txt       - full log content
    metadata.json  - job info (status, duration, sha, engine, timestamps)
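As an illustration of the layout above, object keys could be derived like this. It assumes `lifecycle-logs` is the bucket name and that keys are relative to it; `ArchiveLocation`, `archivePrefix` and `archiveKeys` are hypothetical helpers, not names from the PR.

```typescript
// The four path segments that identify one archived job.
interface ArchiveLocation {
  namespace: string;
  jobType: string; // e.g. 'build' or 'deploy' (assumed values)
  serviceName: string;
  jobName: string;
}

// Prefix shared by both objects stored for a job.
function archivePrefix(loc: ArchiveLocation): string {
  return `${loc.namespace}/${loc.jobType}/${loc.serviceName}/${loc.jobName}/`;
}

// Keys for the log text and the metadata document.
function archiveKeys(loc: ArchiveLocation): { logs: string; metadata: string } {
  const prefix = archivePrefix(loc);
  return { logs: `${prefix}logs.txt`, metadata: `${prefix}metadata.json` };
}
```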
## Enabling
All archival ops are gated on globalConfig.logArchival.enabled.
Insert into global_config to activate:
{ "logArchival": { "enabled": true, "retentionDays": 14 } }
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix JobMonitor log ordering: wait for job completion before fetching logs so the full output is captured rather than a mid-run snapshot
- Add startedAt/completedAt/duration to JobMonitor.getJobStatus via kubectl job JSON, thread timing through engines.ts and helm.ts so archived metadata has accurate timestamps
- Upgrade live k8s jobs with no pod to source='archived' when an archive exists in MinIO, so they remain selectable in the UI
- Extend logStreaming archived fallback to also trigger when the k8s job exists but its pod has been cleaned up (!podInfo.podName)
- Add source field to NativeBuildJobInfo OpenAPI schema
- Add MinIO helm_resource to Tiltfile; remove erroneous minio subchart dependency from helm/web-app/Chart.yaml
- Add ALLOWED_ORIGINS to local lifecycle.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- logArchival: lazy ensureBucket on first write, null-safe S3 Body reads, paginated ListObjectsV2 for archived job listing
- getDeploymentJobs: port podless-to-archived source upgrade logic matching build job behaviour
- openApiSpec + API route schemas: add source and archivedLogs fields, fix LogStreamResponse required field list, correct status enum values
- 001_seed: add logArchival feature-flag row to globalConfig seed
- globalConfig types: drop unused retentionDays field
- logs/[jobName] API: cast type query param to LogType to satisfy TS
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add @aws-sdk/client-s3 dependency and s3Client singleton
- Wire OBJECT_STORE_* env vars through config.ts and next.config.js serverRuntimeConfig
- Configure local dev MinIO env vars in helm/environments/local/lifecycle.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Prevents warning noise on fresh installs where MinIO is not yet configured. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
vmelikyan
approved these changes
Mar 3, 2026
Problem
Build and deploy job logs are permanently lost once k8s Job pods are evicted or TTL-expired (~24h):
getNativeBuildJobs, getDeploymentJobs)
Solution
Add MinIO as an optional in-cluster S3-compatible object store. Logs are archived at job completion time and served back transparently — the UI sees a new `Archived` status instead of `NotFound`.
Architecture
```
Job completes → archive logs.txt + metadata.json to MinIO
└── {namespace}/{jobType}/{serviceName}/{jobName}/
Pod evicted after TTL...
UI requests log stream info → backend returns status='Archived' + archivedLogs text
UI requests job list → archived jobs merged into live k8s results (deduplicated by jobName)
```
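The job-list merge in the flow above might look roughly like this. It is a sketch under assumptions: `ListedJob` stands in for the PR's `BuildJobInfo` / `DeploymentJobInfo` types (only `jobName` and `source` are taken from the description), and `mergeJobLists` is a hypothetical helper name.

```typescript
// Reduced stand-in for the job info types; real entries carry more fields.
interface ListedJob {
  jobName: string;
  source?: 'live' | 'archived';
}

// Live k8s results win; archived entries are appended only when no live job
// with the same jobName exists, so a completing job never appears twice.
function mergeJobLists(live: ListedJob[], archived: ListedJob[]): ListedJob[] {
  const liveNames = new Set(live.map((job) => job.jobName));
  const liveTagged = live.map((job): ListedJob => ({ ...job, source: job.source ?? 'live' }));
  const archivedOnly = archived
    .filter((job) => !liveNames.has(job.jobName))
    .map((job): ListedJob => ({ ...job, source: 'archived' }));
  return [...liveTagged, ...archivedOnly];
}
```

Keeping an existing `source` on live entries leaves room for the podless-to-archived upgrade described in the commits, where a live job whose pod is gone is already marked `archived` before the merge.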
Changes
New files
Modified files (lifecycle backend)
Related PRs
Key design decisions
Feature-gated: all object store calls check `globalConfig.logArchival?.enabled`. Seeded as `false` — enabling requires an explicit DB update. Deploying the infra (MinIO pod) is safe before enabling the flag.
Non-blocking: archival failures are caught and logged as warnings — they never fail the build/deploy flow.
Deduplication: merged archived jobs are deduplicated by `jobName` against live k8s results, so a completing job never appears twice.
S3 support: set `OBJECT_STORE_TYPE=s3` to use AWS S3 with IRSA — no credentials in config. Bucket must be pre-provisioned.
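The feature gate and the non-blocking behaviour described above can be combined in a small wrapper. This is only a sketch: `archiveIfEnabled`, `GlobalConfigLike` and the injected `archive` / `warn` callbacks are hypothetical, and the actual code paths in engines.ts and helm.ts may be structured differently.

```typescript
// Minimal view of the config; only the flag from the PR description is modelled.
interface GlobalConfigLike {
  logArchival?: { enabled: boolean };
}

// Runs the archive step only when the flag is on, and never lets a failure
// propagate into the build/deploy flow. Resolves to true when logs were archived.
async function archiveIfEnabled(
  config: GlobalConfigLike,
  archive: () => Promise<void>,
  warn: (message: string) => void
): Promise<boolean> {
  if (!config.logArchival?.enabled) return false;
  try {
    await archive();
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    warn(`log archival failed: ${reason}`);
    return false;
  }
}
```

Because the wrapper swallows the error after logging it, callers can await it on both the success and the error path of a job without adding their own try/catch.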
Enabling
```json
{ "logArchival": { "enabled": true } }
```
Test plan
🤖 Generated with Claude Code