docs: refresh infra docs for post-Hetzner architecture #57
Open
abtreece wants to merge 1 commit into
Closes fullstaq-ruby#55.

Brings docs/ in line with the post-July-2024 architecture (single Hetzner VM running Caddy + Sinatra/Puma API server + Prometheus, provisioned by Ansible), replacing references to the previous GKE Autopilot + Nginx Ingress + Cloud Run apiserver setup.

Files changed:

- docs/infrastructure-overview.md: rewritten section by section. Every claim is grounded in current IaC.
  - The two GCP-service-account sections are folded into a single "CI/CD authentication" section that splits per caller. server-edition uses GCP WIF (APT/YUM repo buckets + GCS CI artifacts bucket) and Azure Federated Identity Credentials (Azure Blob CI artifacts + CI cache + Key Vault GPG key). The infra repo's apiserver workflow only mints a GitHub OIDC JWT (audience backend.fullstaqruby.org) and POSTs to /admin/upgrade_apiserver; it does not authenticate to GCP or Azure APIs.
  - The Caddy section is corrected: there is no backend.fullstaqruby.org vhost; both apt. and yum. vhosts handle /admin/* via reverse_proxy to the apiserver Unix socket.
  - The "Google Cloud projects" claim of two projects is corrected: there is one project, fsruby-server-edition2, provisioned by terraform-hisec/gcloud_project.tf and populated by terraform/. The hisec/non-hisec separation lives at the Terraform-state and access-group layer.
  - Container registry section dropped (no registry resources are managed in this repo).
  - Key Vault name uses the templated form ${var.key_vault_prefix}infraowners (currently fsruby2infraowners).
  - CI artifacts/cache split is now explicit (artifacts dual-cloud, cache Azure-only).
  - VM section distinguishes Terraform-managed forward DNS from the manually-set Hetzner PTR record.
- docs/infrastructure-overview.drawio.svg: deleted. Replaced by an inline Mermaid diagram in infrastructure-overview.md so future diagram changes are reviewable as text diffs.
- docs/editing-diagrams.md: deleted (no longer needed without the drawio round-trip).
- docs/deploy.md: replaces the gcloud-clusters/kubectl steps with a single ansible-playbook step matching bootstrapping Step 11. Adds a callout that apiserver code changes deploy via the GitHub Actions workflow.
- docs/infrastructure-as-code.md: drops Kustomize and the kubernetes/ directory bullet; adds Ansible to the tool list and an ansible/ directory bullet.
- docs/infrastructure-bootstrapping.md: intro updated to mention Terraform + Ansible (not Kubernetes/Kustomize); the rest of the file already reflected the post-migration setup.
- docs/pull_request_template.md: diagram-update checkbox now points to the Mermaid block instead of the deleted drawio file.
- README.md: drops the link to the deleted editing-diagrams.md.
- .editorconfig: removes the duplicate [config.ru] block (the tab/4 one); only the correct space/2 rule remains.

Note: the "Github CI bot account" section is kept as-is. Retiring that PAT-based bot is already tracked in fullstaq-ruby#18 and is therefore out of scope here.
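The apiserver deploy path described above (the infra workflow mints a GitHub OIDC JWT with audience backend.fullstaqruby.org and POSTs to /admin/upgrade_apiserver) could look roughly like the shell sketch below. This is an illustration under assumptions, not the contents of .github/workflows/apiserver.yml: the function names are hypothetical, the use of a Bearer Authorization header towards the API server is a guess, the apt.fullstaqruby.org host is taken from the PR description's statement that CI calls /admin/* through that vhost, and jq is assumed to be installed. The ACTIONS_ID_TOKEN_REQUEST_URL and ACTIONS_ID_TOKEN_REQUEST_TOKEN variables are the ones GitHub Actions injects into jobs that have the id-token: write permission.

```shell
#!/usr/bin/env bash
# Sketch only: names and the header format sent to the API server are assumptions.
set -euo pipefail

AUDIENCE="backend.fullstaqruby.org"   # audience named in this PR
UPGRADE_URL="https://apt.fullstaqruby.org/admin/upgrade_apiserver"

# Append the audience to the token endpoint injected by GitHub Actions.
# The injected URL already carries a query string, hence '&' rather than '?'.
token_request_url() {
  printf '%s&audience=%s\n' "$1" "$2"
}

# Ask GitHub for an OIDC JWT and print it (needs curl and jq).
mint_oidc_jwt() {
  curl -fsS \
    -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
    "$(token_request_url "${ACTIONS_ID_TOKEN_REQUEST_URL}" "${AUDIENCE}")" \
    | jq -r '.value'
}

# POST to the admin endpoint.
# ASSUMPTION: the API server expects the JWT as a Bearer token.
request_upgrade() {
  curl -fsS -X POST -H "Authorization: Bearer $1" "${UPGRADE_URL}"
}

# Only do network calls inside GitHub Actions, where the endpoint is defined.
if [ -n "${ACTIONS_ID_TOKEN_REQUEST_URL:-}" ]; then
  request_upgrade "$(mint_oidc_jwt)"
fi
```

Note that nothing in this flow touches GCP or Azure credentials, which matches the claim that the infra workflow authenticates to the API server only.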
Member
I'll have a good look. So far my first impression is that the new diagram lacks a lot of detail that was in the older diagram. I'm also not sure whether a detailed but automatically rendered diagram is still readable compared to a manually drawn one.
Closes #55.
Summary
Refreshes docs/ to describe the current infrastructure (single Hetzner VM running Caddy + Sinatra/Puma API server + Prometheus, provisioned by Ansible) instead of the pre-July-2024 architecture (GKE Autopilot + Nginx Ingress + Cloud Run apiserver). The infrastructure overview diagram is also refreshed: the stale infrastructure-overview.drawio.svg is replaced by a Mermaid block embedded in infrastructure-overview.md so future diagram changes are reviewable as text diffs.

Files changed

- docs/infrastructure-overview.md: rewritten section by section against current IaC.
  - CI/CD authentication is split per caller:
    - fullstaq-ruby/server-edition → GCP via Workload Identity Federation (APT/YUM repo buckets + GCS CI artifacts bucket).
    - fullstaq-ruby/server-edition → Azure via Federated Identity Credentials (Azure Blob CI artifacts + CI cache containers + Key Vault GPG key).
    - fullstaq-ruby/infra → API server only, via a GitHub-issued OIDC JWT (audience backend.fullstaqruby.org) sent to POST /admin/upgrade_apiserver. The infra workflow does not authenticate to GCP or Azure APIs.
  - Caddy: there is no backend.fullstaqruby.org vhost; both apt. and yum. vhosts handle /admin/* via reverse_proxy to the apiserver Unix socket (per ansible/files/Caddyfile). CI calls /admin/* via https://apt.fullstaqruby.org.
  - Google Cloud projects: there is one project (fsruby-server-edition2, display name "Fullstaq Ruby Server Edition"), provisioned by terraform-hisec/gcloud_project.tf and populated by terraform/. The hisec/non-hisec boundary lives at the Terraform-state and access-group layer, not at a GCP project boundary.
  - apiserver-deployer.service performs self-update from a tarball attached to a GitHub Release.
  - VM: Terraform-managed forward DNS (backend.fullstaqruby.org, apt.fullstaqruby.org, yum.fullstaqruby.org) is distinguished from the manually-set Hetzner PTR record.
  - Key Vault name uses the templated form ${var.key_vault_prefix}infraowners (currently fsruby2infraowners).
- docs/infrastructure-overview.drawio.svg: deleted. Replaced by the Mermaid block in infrastructure-overview.md.
- docs/editing-diagrams.md: deleted. Mermaid is edited inline; no diagrams.net round-trip is needed.
- docs/deploy.md: replaces the gcloud container clusters get-credentials + kubectl apply -k ../kubernetes steps with a single ansible-playbook step matching Step 11 of the bootstrapping guide. Adds a callout that apiserver code changes deploy via the GitHub Actions workflow.
- docs/infrastructure-as-code.md: drops Kustomize and the kubernetes/ directory bullet; adds Ansible to the tools list and an ansible/ directory bullet.
- docs/infrastructure-bootstrapping.md: intro updated to mention Terraform + Ansible (not Kubernetes/Kustomize). The rest of the file already reflected the post-migration setup.
- docs/pull_request_template.md: diagram-update checkbox now points to the Mermaid block.
- README.md: drops the link to the deleted editing-diagrams.md.
- .editorconfig: removes the duplicate [config.ru] block (the tab/4 one); only the correct space/2 rule remains.

Verification

- eclint check $(git ls-files) passes.
- grep -rin 'kubernetes\|kustomize\|kubectl\|gke\|nginx\|cloud run' docs/ returns only intentional historical mentions (e.g. "the previous GKE Autopilot setup was replaced by this VM in the July 2024 rearchitecture").
- IaC files the docs were checked against: terraform/{dns,gcloud_auth,backend,repo_buckets,ci_storage}.tf, terraform-hisec/{gcloud_project,key_vault,backend}.tf, ansible/main.yml, ansible/files/{Caddyfile,apiserver.service}, .github/workflows/apiserver.yml.

Note: PAT-based CI bot
The "Github CI bot account" section describes a PAT-based bot. Retiring/converting that account is already tracked in #18 ("Change fullstaq-ruby-ci-bot account into a Github app") and is therefore intentionally not in scope here. The text remains as-is so the doc reflects the current state until #18 lands.
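The grep check listed under Verification could be made repeatable with a small guard like the one below. It is a sketch, not something this PR adds: the function name and the allowlist of "intentional historical mention" keywords are hypothetical, and the search terms are the PR's own list rewritten as an extended regex (-E with | instead of \|).

```shell
#!/usr/bin/env bash
# Sketch only: the allowlist below is a hypothetical stand-in for
# "intentional historical mentions".
set -euo pipefail

# Terms from the PR's Verification grep, as an extended regex.
STALE_TERMS='kubernetes|kustomize|kubectl|gke|nginx|cloud run'
# HYPOTHETICAL allowlist: wording that marks a deliberate look back.
ALLOWED='previous|replaced|rearchitecture'

# Print stale-term matches under directory $1 that are not allowlisted.
# The allowlist is matched against the whole grep output line, path included.
stale_mentions() {
  grep -rinE "$STALE_TERMS" "$1" | grep -viE "$ALLOWED" || true
}

# When given a directory argument, fail if anything unexpected remains.
if [ "$#" -gt 0 ]; then
  leftovers="$(stale_mentions "$1")"
  if [ -n "$leftovers" ]; then
    printf '%s\n' "$leftovers"
    exit 1
  fi
fi
```

Saved under a name of your choosing and run with docs/ as its argument, it exits non-zero and prints the offending lines if a non-allowlisted mention of the old stack remains.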