CP-37371: bundle Prometheus binary in CloudZero Agent image #648

Open
evan-cz wants to merge 2 commits into develop from CP-37371


evan-cz (Contributor) commented Feb 3, 2026

Official Prometheus images include a shell and utilities like curl, which give an attacker more tools for container escape or privilege escalation if the container is compromised. This change bundles the Prometheus binary directly into the CloudZero Agent image, which uses a minimal scratch base with no shell, as an added layer of defense in depth.

Functional Change:

Before: The Prometheus container used the official quay.io/prometheus/prometheus image, which includes a shell and utilities that could be leveraged by an attacker.

After: The Prometheus container uses the CloudZero Agent image with the Prometheus binary bundled at /app/prometheus. The image has no shell and a minimal attack surface. Users can still override the image settings to use the official image if needed.
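
As a rough illustration of the override mentioned above, a values file along these lines would point the Prometheus container back at the official image. The key names come from this PR's description; the nesting shown is inferred from the dotted paths, and how the image tag is set is not covered here, so treat this as a sketch rather than the chart's documented usage.

```yaml
# Hypothetical values override (sketch); nesting inferred from the dotted key names in this PR.
components:
  prometheus:
    image:
      # Use the official image instead of falling back to the agent image.
      repository: quay.io/prometheus/prometheus
    # Empty array: no command is set, so the image's default entrypoint runs.
    command: []
```

Leaving `command` unset (null) keeps the default of `/app/prometheus`, which only makes sense with the agent image.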

Solution:

  1. Modified docker/Dockerfile to extract and bundle the Prometheus binary:

    • Added a prometheus-source build stage based on the official Prometheus image
    • Copied /bin/prometheus to /app/prometheus in the final scratch-based image
    • Pinned the Prometheus version via the PROMETHEUS_VERSION ARG so Renovate can update it
  2. Updated helm/values.yaml to use bundled binary by default:

    • Set components.prometheus.image.repository to null (falls back to agent image)
    • Added components.prometheus.command field with three-way behavior:
      • null (default): Uses /app/prometheus for bundled binary
      • []: Uses image default entrypoint (for official Prometheus image)
      • ["/custom/path"]: Uses specified command
  3. Added generateContainerCommand helper to helm/templates/_helpers.tpl:

    • Accepts dict with command and default (array) parameters
    • Reusable for other containers (e.g., Alloy in future PR)
  4. Updated agent-deploy.yaml and agent-daemonset.yaml to use new helper

  5. Added schema validation for command field in helm/values.schema.yaml:

    • Uses oneOf to allow null or array of strings
    • References Kubernetes Container command definition
  6. Updated helm/tests/README.md with correct instructions for running tests
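
The schema rule described in step 5 might look roughly like the fragment below. It is a sketch of the oneOf shape only: the surrounding property path and the reference to the Kubernetes Container command definition are left out because the description does not show them.

```yaml
# Sketch only: accept either null or an array of strings for the command field.
command:
  oneOf:
    - type: "null"      # null (default): the chart falls back to /app/prometheus
    - type: array       # [] or ["/custom/path", ...]
      items:
        type: string
```

Under a rule like this, a bare string such as `command: /app/prometheus` would be rejected, since it is neither null nor an array.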

Validation:

  • Added helm/tests/prometheus_command_test.yaml with 7 unit tests covering:

    • Default command (/app/prometheus)
    • Empty array (no command, uses image entrypoint)
    • Custom command with multiple elements
    • Federated mode (daemonset) behavior
    • Args not affected by command changes
  • Added 6 schema validation tests in tests/helm/schema/prometheus.command.*.yaml

  • All helm test suites pass

  • Deployed and verified on EKS (arm64) and GKE (amd64) clusters:

    • Pod running successfully with CloudZero agent image
    • Command correctly set to ["/app/prometheus"]
    • Prometheus scraping and forwarding cAdvisor metrics normally
    • Official image override: Verified backward compatibility with
      quay.io/prometheus/prometheus using command: [] override
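
For readers unfamiliar with this kind of test, the unit tests listed above could take roughly the following form. This sketch assumes the suite uses the helm-unittest plugin format; the template path, the container index, and the test names are hypothetical and not taken from the actual test file.

```yaml
# Hypothetical helm-unittest suite (sketch); paths and indices are assumptions.
suite: prometheus command
templates:
  - templates/agent-deploy.yaml
tests:
  - it: uses the bundled binary by default
    asserts:
      - equal:
          path: spec.template.spec.containers[0].command
          value: ["/app/prometheus"]

  - it: uses a custom command when one is provided
    set:
      components.prometheus.command: ["/custom/path", "--some-flag"]
    asserts:
      - equal:
          path: spec.template.spec.containers[0].command
          value: ["/custom/path", "--some-flag"]
```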

dependabot bot and others added 2 commits February 3, 2026 08:26
Bumps golang from 1.25.5-alpine to 1.25.6-alpine.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.25.6-alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
@evan-cz evan-cz requested a review from a team as a code owner February 3, 2026 13:30
@evan-cz evan-cz changed the base branch from develop to dependabot/docker/docker/golang-1.25.6-alpine February 3, 2026 13:30
Base automatically changed from dependabot/docker/docker/golang-1.25.6-alpine to develop February 3, 2026 18:09