CM-1042: To add multi-operand qualification suite in e2e tests#429
CM-1042: To add multi-operand qualification suite in e2e tests#429arun717 wants to merge 16 commits into
Conversation
|
@arun717: This pull request references CM-1042 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a gated e2e suite validating concurrent TrustManager and IstioCSR operand behavior and improvements to the Vault e2e harness (timeouts, diagnostics, pinned image), plus a test module dependency adjustment. ChangesMulti-Operand E2E + Vault Changes
🎯 3 (Moderate) | ⏱️ ~25 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 3 warnings)
✅ Passed checks (11 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: arun717 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/multiple_operands_test.go`:
- Around line 119-122: The teardown code is discarding errors from Delete calls;
update the cleanup to check and handle returned errors from
trustManagerClient().Delete(ctx, "cluster", metav1.DeleteOptions{}) and from
certmanageroperatorclient.OperatorV1alpha1().IstioCSRs(istioNS.Name).Delete(ctx,
istioCSRResourceName, metav1.DeleteOptions{}) (and the other similar delete
sites noted) — e.g., capture the error, and if non-nil either log it via the
test logger/t.Fatalf or accumulate/report it so failures aren’t silently
ignored; ensure you reference istioNS and istioCSRResourceName when building the
diagnostic message.
- Around line 76-104: This e2e suite uses operator Subscription/OLM behavior and
must be skipped on MicroShift; add a MicroShift guard by annotating the Describe
block or adding a runtime check in BeforeAll: either add the label/tag
`[Skipped:MicroShift]` to the Describe(...) call or call
exutil.IsMicroShiftCluster() at the start of BeforeAll (or test setup) and call
Skip("MicroShift: OLM not available") when true; update the Describe signature
or the BeforeAll in multiple_operands_test.go (referencing the Describe(...) and
BeforeAll(...) symbols) to ensure the suite is not executed on MicroShift
clusters.
- Around line 488-494: The test creates a fixed namespace "cm786-bundle-target"
which can conflict on reruns; change the Namespace creation to use
ObjectMeta.GenerateName (e.g. "cm786-bundle-target-") instead of a fixed Name so
Kubernetes generates a unique name, leaving other fields (Labels) intact and
still assigning the result to targetNS from
k8sClientSet.CoreV1().Namespaces().Create; update any subsequent references that
rely on the exact name to use targetNS.Name.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d2e57b5d-e96f-46d8-9ee5-f93c808b598b
📒 Files selected for processing (1)
test/e2e/multiple_operands_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/utils_test.go`:
- Around line 1737-1760: formatVaultPodsStatus currently writes raw container
and condition messages (pod.Status.ContainerStatuses and pod.Status.Conditions)
to CI logs; change it to avoid emitting full freeform messages by only logging
structured reason codes and statuses (e.g., cs.State.Waiting.Reason,
cs.State.Terminated.Reason, cond.Type and cond.Status) and either omit
cond.Message/cs.*.Message entirely or replace them with a sanitized placeholder
or a fixed-length truncated/redacted string; update the loop over
pod.Status.ContainerStatuses and pod.Status.Conditions in formatVaultPodsStatus
to remove raw message output and include a clear placeholder like "<redacted>"
or a short truncation function instead.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9d7652ef-c5ea-4553-a6ee-9931aeaf8b54
📒 Files selected for processing (4)
test/e2e/issuer_vault_test.gotest/e2e/testdata/vault/helm-values.yamltest/e2e/utils_test.gotest/go.mod
✅ Files skipped from review due to trivial changes (1)
- test/e2e/testdata/vault/helm-values.yaml
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/e2e/issuer_vault_test.go (1)
103-142:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStart the test-body timeout after the Vault bootstrap.
Line 104 starts a 10-minute
ctxbefore the 30-minute setup path on Lines 120-141. On a slow cluster, setup can finish withctxalready expired, and every later client call in theItbody will then fail immediately withcontext deadline exceeded. Create the post-setup context after Vault is ready, or give the test body its own fresh timeout. Based on learnings, "Applies to **/*_test.go : Review Ginkgo test code for quality requirements: (3) Timeouts - include appropriate timeouts on Eventually/Consistently and cluster operations."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/issuer_vault_test.go` around lines 103 - 142, The test creates the test-body context (ctx, cancel) with highTimeout before the long Vault setup so that ctx can expire during setup; fix by creating the test-body context only after Vault bootstrap completes (i.e., move the context.WithTimeout(ctx, highTimeout) / DeferCleanup(cancel) lines to after configureVaultPKI returns) or alternatively create a fresh context for the It test body after setup (e.g., keep setupCtx/setupCancel for setup and then call context.WithTimeout for ctx/cancel afterwards); update references to ctx/cancel accordingly so all subsequent client calls use the post-setup ctx.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/issuer_vault_test.go`:
- Around line 154-155: The test uses execVaultShellInPod with shell-interpolated
strings (e.g., vaultCmd at the top and similar blocks at other spots) which
risks shell injection and unpredictable parsing; replace those with multiple
calls to execVaultInPod (or a new helper that accepts argv slices) passing the
vault binary and each argument as discrete strings (e.g.,
"vault","auth","enable","approle" and
"vault","write","auth/approle/role/<role>","token_policies=cert-manager",...).
Change each chained or sh -c invocation (references: execVaultShellInPod,
vaultCmd, and the JWT/oidc setup blocks called around the other ranges) to split
into separate execVaultInPod calls with explicit args so no shell interpolation
is used.
In `@test/e2e/utils_test.go`:
- Around line 1461-1475: The current execVaultInPod and execVaultShellInPod
embed the rootToken in the command argv which ends up in the PodExecOptions
payload; change the approach so the token is not passed as an exec argument:
update execInPod to support streaming stdin (set PodExecOptions.Stdin = true)
and have execVaultInPod / execVaultShellInPod send the VAULT_TOKEN value over
stdin (or another out-of-band channel) instead of building
"VAULT_TOKEN=<rootToken>" into the command slice; keep the visible command
arguments free of secrets and reference execInPod, execVaultInPod and
execVaultShellInPod when making the change.
---
Outside diff comments:
In `@test/e2e/issuer_vault_test.go`:
- Around line 103-142: The test creates the test-body context (ctx, cancel) with
highTimeout before the long Vault setup so that ctx can expire during setup; fix
by creating the test-body context only after Vault bootstrap completes (i.e.,
move the context.WithTimeout(ctx, highTimeout) / DeferCleanup(cancel) lines to
after configureVaultPKI returns) or alternatively create a fresh context for the
It test body after setup (e.g., keep setupCtx/setupCancel for setup and then
call context.WithTimeout for ctx/cancel afterwards); update references to
ctx/cancel accordingly so all subsequent client calls use the post-setup ctx.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 399d5496-f611-4543-bc93-bfcee40a183f
📒 Files selected for processing (2)
test/e2e/issuer_vault_test.gotest/e2e/utils_test.go
|
/retest-required |
|
/test e2e-operator-tech-preview |
Scope solver pod lookup to the test namespace and drop the redundant certificate readiness wait already covered by the sibling ingress spec. Co-authored-by: Cursor <cursoragent@cursor.com>
Wait for TrustManager and IstioCSR deployments before CR Ready polling, extend multi-operand timeouts for slow CI, and surface operand status on failure. Co-authored-by: Cursor <cursoragent@cursor.com>
Only fail fast on Degraded in the multi-operand install path. The trust namespace recovery test waits through a transient Degraded state before Ready becomes True again. Co-authored-by: Cursor <cursoragent@cursor.com>
The istio_csr_template defaults spec.istio.namespace to istio-system, but CM-786 multi-operand tests install into multi-operand-istio. Without overriding the field, IstioCSR reconciliation fails creating RBAC in the missing istio-system namespace. Co-authored-by: Cursor <cursoragent@cursor.com>
IstioCSR ClusterRoles use GenerateName, so the controller recreates them with a new suffix after deletion rather than restoring the same name. Co-authored-by: Cursor <cursoragent@cursor.com>
Align the grpcurl job name with grpcurl_job.yaml and create the proto-cm ConfigMap required by the job, matching istio_csr_test.go. Co-authored-by: Cursor <cursoragent@cursor.com>
The multi-operand suite runs IstioCSR in multi-operand-istio, not istio-system. Build the SPIFFE ID from the actual namespace so grpcurl can obtain a certificate instead of retrying until timeout. Co-authored-by: Cursor <cursoragent@cursor.com>
The vault-installer pod was rejected on OCP namespaces enforcing restricted Pod Security. Set the required pod and container security context instead of relying on privileged mode. Co-authored-by: Cursor <cursoragent@cursor.com>
|
@arun717: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Adds an ordered Ginkgo e2e suite for CM-786 qualification when TrustManager and IstioCSR are enabled and installed at the same time. The suite patches the operator subscription to turn on both addon feature gates, creates both CRs concurrently, then runs baseline cert-manager sanity checks and automatable CM-786 doc steps (5–14).
Labels: Platform:Generic, Feature:MultipleOperands, TechPreview
https://docs.google.com/document/d/1bKI2OP7cZPVkGMWGhqEgJ6cCM35r2pUs7lqfDClUD9A/edit?tab=t.ovzsdk8nx64c#heading=h.edvlw82015i9
Run:
make test-e2e E2E_GINKGO_LABEL_FILTER='Platform:Generic && Feature:MultipleOperands'Prerequisites: TechPreview cluster with cert-manager operator installed; OpenShift Service Mesh optional (steps 6–7 skip if absent).
Tests (Context: with operands installed concurrently)
Summary by CodeRabbit