A compose-of-compositions blueprint that installs the entire Krateo PlatformOps platform
from one helm install. The umbrella self-bootstraps the composition engine, registers
itself as an Installer composition, and then self-reconciles: it registers each component's
CompositionDefinition (Pass A) and emits each Composition once its dependencies are
Ready and its CRD exists (Pass B), resolving exposure (service.type) and the portal config
(peer LoadBalancer IPs) by reconciliation — no prerequisite scripts, no post-install patching.
- Chart: oci://ghcr.io/braghettos/charts/installer
- Install guide: see QUICKSTART.md for kind (local) and managed GKE.
- Kind: Installer (composition.krateo.io).
- Expert agent: a kagent Agent that knows this blueprint; see kagent/.
helm install installer oci://ghcr.io/braghettos/charts/installer --version 0.2.53 \
-n krateo-system --create-namespace --set exposure.type=LoadBalancer --wait
# tear the whole platform down (ordered, finalizer-safe, no manual cleanup):
helm uninstall installer -n krateo-system

The umbrella chart renders differently depending on bootstrap.coreProvider.enabled, which
is the seam between "a plain Helm install" and "the Krateo composition engine":
| | Bootstrap mode (bootstrap.coreProvider.enabled: true, the helm install) | Composition mode (false, core-provider re-rendering the Installer CR) |
|---|---|---|
| self-bootstrap.yaml | ✅ renders: installer CompositionDefinition + RBAC + post-install hook | — |
| subchart deps (Chart.yaml) | ✅ installed: core-provider, cert-manager, clickhouse-op, mongodb-op | — |
| definitions.yaml (Pass A) | — | ✅ one CompositionDefinition per enabled component |
| compositions.yaml (Pass B) | — | ✅ one Composition per component (gated on CRD + deps Ready) |
| secret.yaml | — | ✅ component secrets (e.g. gemini key) |
| teardown hooks | bootstrap-teardown (pre-delete) + post-delete-cleanup | ordered-teardown (pre-delete) |
One helm install runs bootstrap mode. core-provider then reconciles the Installer CR by
re-rendering this same chart in composition mode on its reconcile loop — that is the
self-reconcile that advances the rollout with no helm upgrade/up.sh.
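The split between the two modes can be pictured as a single switch. The following is a minimal sketch, not the chart's actual template logic: the template names are taken from the table above, while the function `rendered_templates` and the two constants are hypothetical names introduced only for illustration.

```python
# Illustrative model of which templates render in each mode.
# Template names come from the mode table; everything else is an assumption.

BOOTSTRAP_ONLY = [
    "self-bootstrap.yaml",
    "bootstrap-teardown.yaml",
    "post-delete-cleanup.yaml",
]
COMPOSITION_ONLY = [
    "definitions.yaml",
    "compositions.yaml",
    "secret.yaml",
    "ordered-teardown.yaml",
]


def rendered_templates(core_provider_enabled: bool) -> list[str]:
    """Return the templates rendered for a bootstrap.coreProvider.enabled value."""
    return list(BOOTSTRAP_ONLY if core_provider_enabled else COMPOSITION_ONLY)


if __name__ == "__main__":
    print("helm install:           ", rendered_templates(True))
    print("core-provider re-render:", rendered_templates(False))
```

The two sets are disjoint, which is what lets one chart serve both as a plain Helm release and as the body of the Installer composition.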
stateDiagram-v2
direction TB
[*] --> Bootstrapping: helm install (bootstrap mode)
note right of Bootstrapping
Helm pre-installs subchart crds/ (compositiondefinitions CRD),
installs engine + operator subcharts (core-provider, cert-manager,
clickhouse-op, mongodb-op) and renders self-bootstrap.yaml:
the installer CompositionDefinition + RBAC + a post-install hook Job.
end note
Bootstrapping --> AwaitingInstallerCRD: core-provider reconciles the installer CompositionDefinition
note right of AwaitingInstallerCRD
post-install hook Job blocks until core-provider has
generated installers.composition.krateo.io, then applies
the Installer CR (spec = picked values, bootstrap OFF).
end note
AwaitingInstallerCRD --> PassA: core-provider reconciles the Installer CR (re-renders in composition mode)
state "Self-reconcile loop" as Loop {
PassA: Pass A - register CompositionDefinitions
PassB: Pass B - emit Compositions
PassA --> PassB: per component, CRD generated AND deps Ready=True
PassB --> PassA: next reconcile re-renders, more components unlock
}
PassB --> Ready: all enabled components Ready=True (exposure + portal config wired via lookup)
Ready --> PassA: every reconcile re-renders (drift correction / version propagation)
Ready --> Draining: helm uninstall installer
Bootstrapping --> Draining: helm uninstall installer
note right of Draining
pre-delete hooks, controllers ALIVE.
HOOK 2 bootstrap-teardown: delete the installer CompositionDefinition;
core-provider cascades - deletes the Installer CR, cdc uninstalls the
composition release, fires HOOK 1.
HOOK 1 ordered-teardown: delete component Compositions in REVERSE
dependency order (oasgen-provider gated behind RestDefinitions).
Block until the whole footprint drains while controllers clear finalizers.
end note
Draining --> Sweeping: footprint drained, helm removes core-provider and scaffolding
note right of Sweeping
post-delete hook, controllers GONE.
HOOK 3 post-delete-cleanup: remove runtime-created, non-helm-owned
leftovers core-provider/oasgen can no longer recreate - the
core-provider MutatingWebhookConfiguration + generated
*.hyperdx.krateo.io CRDs.
end note
Sweeping --> [*]: bare (inherent helm residue only - namespace, PVCs, crds/-dir CRDs)
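The self-reconcile loop in the diagram can be approximated with a small simulation. This is a sketch under stated assumptions: it treats CRD generation and readiness as taking exactly one reconcile each, which is a simplification of what core-provider does, and the names `Component`, `reconcile` and `rollout` are hypothetical. The three components are a slice of the portal dependency chain.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    deps: list[str] = field(default_factory=list)


def reconcile(components, state):
    """Run one reconcile and return the Compositions emitted by Pass B."""
    # Pass A: register a CompositionDefinition for every enabled component.
    for c in components:
        state["definitions"].add(c.name)

    # Pass B gates on the cluster state observed at the start of this render.
    crds, ready = set(state["crds"]), set(state["ready"])
    emitted = [
        c.name
        for c in components
        if c.name not in ready
        and c.name in crds
        and all(dep in ready for dep in c.deps)
    ]

    # Assumption: between reconciles, CRDs get generated from the registered
    # definitions and every emitted Composition becomes Ready.
    state["crds"] |= state["definitions"]
    state["ready"] |= set(emitted)
    return emitted


def rollout(components, max_passes=20):
    """Reconcile until every component is Ready; return what each pass emitted."""
    state = {"definitions": set(), "crds": set(), "ready": set()}
    history = []
    for _ in range(max_passes):
        if len(state["ready"]) == len(components):
            break
        history.append(reconcile(components, state))
    return history


if __name__ == "__main__":
    demo = [
        Component("authn-crd"),
        Component("authn", deps=["authn-crd"]),
        Component("frontend", deps=["authn"]),
    ]
    for i, emitted in enumerate(rollout(demo), start=1):
        print(f"reconcile {i}: emitted {emitted}")
```

The first pass emits nothing because no CRD exists yet; each later pass unlocks one more layer of the dependency chain, which matches the PassB to PassA back-edge in the diagram.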
Krateo cleans up via finalizers only a live controller can clear, but helm uninstall
gives no ordering guarantee that a finalizing controller outlives what it finalizes. The fix
uses one Helm property — all pre-delete hooks run to completion before any normal resource
is deleted, so controllers are still alive inside them — plus a post-delete hook for what's
left once they're gone:
- ordered-teardown.yaml (pre-delete, composition release): deletes component Composition CRs in reverse dependency order (consumers before providers), so the portal drains before frontend-crd/authn-crd/snowplow-crd and hyperdx-provider drains before oasgen-provider. Fixes the portal/demo-system wedge and the RestDefinition orphan.
- bootstrap-teardown.yaml (pre-delete, bootstrap release): deletes the top-level installer CompositionDefinition and blocks until the whole footprint drains while core-provider is alive (so it GCs the generated CRDs and clears the installer CD finalizer). Fixes the bootstrap finalizer deadlock and the orphaned per-composition cdc Deployment.
- post-delete-cleanup.yaml (post-delete, bootstrap release): sweeps runtime-created, non-helm-owned resources core-provider/oasgen left behind (the core-provider MutatingWebhookConfiguration, generated *.hyperdx.krateo.io CRDs) so a subsequent reinstall does not crashloop.
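Why deletion order matters for finalizers can be shown with a toy model. This is only a sketch: it assumes a resource with a finalizer can be removed solely while the controller that owns the finalizer still exists, and the function `uninstall` is a hypothetical name, not part of Helm or Krateo.

```python
def uninstall(order, controllers):
    """Delete resources in `order`; return the names left stuck on a finalizer.

    order:       list of (name, needs) pairs, where `needs` is the controller
                 that must clear the resource's finalizer, or None.
    controllers: names in `order` that are themselves controllers.
    """
    alive = set(controllers)
    stuck = []
    for name, needs in order:
        if needs is not None and needs not in alive:
            stuck.append(name)  # nobody is left to clear the finalizer
            continue
        alive.discard(name)  # deleting a controller stops it
    return stuck


if __name__ == "__main__":
    cd = ("installer CompositionDefinition", "core-provider")
    engine = ("core-provider", None)

    # No ordering guarantee: the controller may be removed first.
    print("unordered:", uninstall([engine, cd], {"core-provider"}))
    # Pre-delete hook: the CompositionDefinition drains while the controller lives.
    print("ordered:  ", uninstall([cd, engine], {"core-provider"}))
```

In the unordered run the installer CompositionDefinition is left stuck; running the deletion inside a pre-delete hook corresponds to the ordered run, where nothing is.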
The components list in values.yaml is topologically sorted (dependencies before
dependents). Pass B emits each Composition only once every entry in its deps reports
Ready=True; teardown walks the reverse of this order.
flowchart LR
subgraph platform["platform (composable portal)"]
authncrd[authn-crd] --> authn
snowplowcrd[snowplow-crd] --> snowplow
frontendcrd[frontend-crd] --> frontend
authn --> frontend
snowplow --> frontend
frontend --> portal
end
subgraph obs["observability"]
clickstack[krateo-clickstack] --> oteld[otel-collector-deployment]
oteld --> otelds[otel-collector-daemonset]
clickstack --> sse[krateo-sse-proxy]
oasgencrd[oasgen-provider-crd] --> oasgen[oasgen-provider]
oasgen --> hyperdx[hyperdx-provider]
end
subgraph agents["agents"]
clickstack --> mcp[clickhouse-mcp-server]
kagentcrds[kagent-crds] --> kagent
mcp --> autopilot[krateo-autopilot]
kagent --> autopilot
end
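The install and teardown orders implied by the graph can be derived with Python's standard `graphlib`. This is a sketch for illustration: the edges are transcribed from the flowchart, but the `DEPS` mapping and the two helper functions are hypothetical and are not how the chart itself computes the order (the components list in values.yaml is already sorted by hand).

```python
from graphlib import TopologicalSorter

# component -> components it depends on (edges transcribed from the flowchart)
DEPS = {
    "authn": {"authn-crd"},
    "snowplow": {"snowplow-crd"},
    "frontend": {"frontend-crd", "authn", "snowplow"},
    "portal": {"frontend"},
    "otel-collector-deployment": {"krateo-clickstack"},
    "otel-collector-daemonset": {"otel-collector-deployment"},
    "krateo-sse-proxy": {"krateo-clickstack"},
    "oasgen-provider": {"oasgen-provider-crd"},
    "hyperdx-provider": {"oasgen-provider"},
    "clickhouse-mcp-server": {"krateo-clickstack"},
    "kagent": {"kagent-crds"},
    "krateo-autopilot": {"clickhouse-mcp-server", "kagent"},
}


def install_order(deps):
    """Dependencies before dependents (one valid order; ties are arbitrary)."""
    return list(TopologicalSorter(deps).static_order())


def teardown_order(deps):
    """Consumers before providers: the reverse of an install order."""
    return list(reversed(install_order(deps)))


if __name__ == "__main__":
    print("install: ", install_order(DEPS))
    print("teardown:", teardown_order(DEPS))
```

Any order that `static_order()` yields is valid for Pass B gating, and its reverse guarantees that, for example, hyperdx-provider is removed before oasgen-provider.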
chart/                          the umbrella chart
  Chart.yaml                    bootstrap subchart deps (core-provider, cert-manager, ...)
  values.yaml                   spec surface + the components list (versions, deps, tiers)
  values.schema.json            full schema for the Installer spec
  templates/
    self-bootstrap.yaml         bootstrap mode: installer CD + RBAC + post-install hook
    definitions.yaml            Pass A - emit component CompositionDefinitions
    compositions.yaml           Pass B - emit gated Compositions + exposure/config wiring
    secret.yaml                 component secrets
    ordered-teardown.yaml       HOOK 1 - pre-delete reverse-dependency drain
    bootstrap-teardown.yaml     HOOK 2 - pre-delete full-footprint drain
    post-delete-cleanup.yaml    HOOK 3 - post-delete orphan sweep
    _helpers.tpl                inst.* helpers (apiVersion, crdExists, depsReady, lbip, ...)
compositiondefinition.yaml      install the umbrella itself as a CompositionDefinition (advanced)
Pushing a semver tag triggers .github/workflows/release-oci.yaml, which packages and pushes
chart/ to oci://ghcr.io/braghettos/charts/installer:<tag> (CHART_VERSION is substituted
from the tag). Component charts live in their own braghettos/* repos and publish the same way.
When you change a component's pinned version (in chart/values.yaml), regenerate the typed
componentValues schema before tagging:
python3 hack/gen-componentvalues-schema.py chart # pulls each pinned component chart's
# values.schema.json into componentValues.<name>

The installer version is the unit that manages component GVRs: a component's version sets its
Composition's served apiVersion + schema, and values.schema.json types componentValues against
those exact schemas — so a new component GVR ships as a new installer version with a regenerated
schema, never an in-place edit of a running install.
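The idea of typing componentValues against each component's own schema can be sketched as a dictionary merge. This is not the contents of hack/gen-componentvalues-schema.py: it assumes componentValues sits under the top-level `properties` of the installer schema, the function name `merge_component_schemas` is hypothetical, and the `replicas` field in the example is invented for illustration.

```python
import copy
import json


def merge_component_schemas(installer_schema, component_schemas):
    """Return a copy of installer_schema with componentValues.<name> typed.

    component_schemas maps a component name to the parsed values.schema.json
    of its pinned chart version.
    """
    merged = copy.deepcopy(installer_schema)
    props = merged.setdefault("properties", {})
    component_values = props.setdefault("componentValues", {"type": "object"})
    typed = component_values.setdefault("properties", {})
    for name, schema in component_schemas.items():
        typed[name] = copy.deepcopy(schema)
    return merged


if __name__ == "__main__":
    base = {"type": "object", "properties": {}}
    # Hypothetical component schema; real ones come from each pinned chart.
    portal = {"type": "object", "properties": {"replicas": {"type": "integer"}}}
    print(json.dumps(merge_component_schemas(base, {"portal": portal}), indent=2))
```

Because the merged schema embeds the component schemas for one exact set of pinned versions, bumping a component version without regenerating would leave componentValues typed against the old schema.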