WIP: Add pkg/rdsdiff: reusable library for RDS version diffing#271
Tal-or wants to merge 1 commit into openshift:main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Note: Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the

Use the following commands to manage reviews:
Walkthrough

Implements an rdsdiff package: downloads/extracts archives, runs PolicyGenerator, extracts embedded Policy CRs into deterministic per-CR YAMLs, correlates keys, normalizes YAML, computes unified diffs, and emits a text report plus a JSON comparison; adds session, download, policy-gen, file-IO, storage, and constants utilities and tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant RDSDiff
    participant GitHub
    participant PolicyGen
    participant FS as Filesystem
    User->>RDSDiff: RunCompare(oldURL, newURL, sessionDir)
    RDSDiff->>GitHub: Download/extract old archive
    GitHub-->>FS: extracted old files
    RDSDiff->>GitHub: Download/extract new archive
    GitHub-->>FS: extracted new files
    RDSDiff->>PolicyGen: RunPolicyGen on extracted dirs
    PolicyGen-->>FS: generated multi-doc YAMLs
    RDSDiff->>RDSDiff: ExtractCRsFromPolicyDoc -> write per-CR YAMLs to extractedDir
    RDSDiff->>RDSDiff: Correlate keys -> onlyOld/onlyNew/inBoth
    loop for each key in inBoth
        RDSDiff->>RDSDiff: NormalizeYAML(old) & NormalizeYAML(new)
        RDSDiff->>RDSDiff: ComputeDiff -> produce diff text
    end
    RDSDiff->>FS: Write diff-report.txt and comparison.json
    RDSDiff-->>User: Return report path / summary
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (9 passed)
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
Actionable comments posted: 3
🧹 Nitpick comments (4)
pkg/rdsdiff/policygen.go (1)
41-46: External command execution lacks timeout.
`exec.Command` runs without a timeout, which could cause the function to hang indefinitely if PolicyGenerator stalls or enters an infinite loop. Consider using `exec.CommandContext` with a timeout context.

🛡️ Suggested fix
```diff
+import (
+	"context"
+	// ... other imports
+)
+
+// Add timeout constant at package level
+const PolicyGenTimeout = 5 * time.Minute

 func RunPolicyGen(configRoot, policyGeneratorPath, generatedDir string) error {
 	// ... existing code ...
 	for _, policyFile := range ListOfCRsForSNO {
 		// ... existing stat check ...
-		cmd := exec.Command(policyGeneratorPath, policyPath) // #nosec G204 -- paths from config root
+		ctx, cancel := context.WithTimeout(context.Background(), PolicyGenTimeout)
+		cmd := exec.CommandContext(ctx, policyGeneratorPath, policyPath) // #nosec G204 -- paths from config root
 		cmd.Dir = crsPathDir
 		out, err := cmd.CombinedOutput()
+		cancel()
 		if err != nil {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/policygen.go` around lines 41 - 46: The external command creation in the policy generation path uses exec.Command without a timeout (see policyGeneratorPath, policyPath, cmd, crsPathDir and policyFile) which can hang; replace exec.Command with exec.CommandContext using a context.WithTimeout (pick a sensible timeout), defer cancel the context, then run cmd.CombinedOutput() (or Start/Wait) so the child is killed when the context expires and return an error that includes the timeout/cancellation details and the command output.

pkg/rdsdiff/compare.go (1)
140-163: Multi-document YAML splitting may miss edge cases.

Splitting on `"\n---"` (Line 142) won't correctly handle:

- Documents starting with `---\n` at the beginning of the file (no preceding newline)
- `---` with trailing content on the same line (valid YAML directive marker)

The fallback on Lines 156-161 partially mitigates this for single-document files, but multi-document files with a leading `---` marker may still have issues.

♻️ Consider a more robust approach
```diff
 func unmarshalMultiDocYAML(data []byte) []map[string]any {
 	var docs []map[string]any
-	parts := strings.Split(string(data), "\n---")
+	// Handle leading --- and split on document separators
+	content := string(data)
+	content = strings.TrimPrefix(content, "---\n")
+	content = strings.TrimPrefix(content, "---\r\n")
+	parts := strings.Split(content, "\n---")
 	for _, part := range parts {
```

Alternatively, consider using a proper YAML streaming decoder that handles multi-document files natively.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/compare.go` around lines 140 - 163: The current unmarshalMultiDocYAML function incorrectly splits multi-document YAML by "\n---"; replace the ad-hoc splitting with a proper YAML streaming decoder (use a yaml Decoder on the input bytes and call Decode in a loop) so it correctly handles leading document markers (---), markers with trailing content, and other edge cases; update unmarshalMultiDocYAML to iterate Decode into map[string]any until io.EOF and append each non-nil map to docs, falling back only if no documents were produced.

pkg/rdsdiff/storage.go (1)
49-54: Path validation may reject valid session IDs or miss edge cases.

The condition `cleaned != sessionID` on Line 49 can reject valid session IDs if `filepath.Clean` normalizes them (e.g., trailing spaces that `TrimSpace` already removed). Additionally, Line 52 only checks for `filepath.Separator` when it's not `/`, but the sessionID could still contain forward slashes on Windows, which would pass this check but create subdirectories.

Consider simplifying by checking for any path separator character:
♻️ Suggested improvement
```diff
 cleaned := filepath.Clean(sessionID)
-if cleaned != sessionID || strings.Contains(sessionID, "..") || filepath.IsAbs(sessionID) {
+if cleaned != sessionID || strings.Contains(sessionID, "..") || filepath.IsAbs(sessionID) || strings.ContainsAny(sessionID, `/\`) {
 	return "", ErrInvalidSessionID
 }
-if filepath.Separator != '/' && strings.ContainsRune(sessionID, filepath.Separator) {
-	return "", ErrInvalidSessionID
-}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/storage.go` around lines 49 - 54: Replace the brittle cleaned != sessionID check and the OS-only separator test with a simpler, robust validation: first trim whitespace (sessionID = strings.TrimSpace(sessionID)); then reject if filepath.IsAbs(sessionID) or strings.Contains(sessionID, ".."); and reject if sessionID contains any path separator by checking both '/' and filepath.Separator (e.g., strings.ContainsAny(sessionID, "/"+string(filepath.Separator))). Remove the reliance on filepath.Clean comparison so valid but normalized IDs aren't incorrectly rejected.

pkg/rdsdiff/download.go (1)
254-256: Comment rationale for `#nosec G110` is incorrect.

The comment states "size bounded by client timeout" but HTTP client timeout bounds time, not size. A fast connection could download a very large file within the timeout. Consider using `io.LimitReader` to bound the extracted file size, or document the expected maximum archive size as an operational constraint.
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/download.go` around lines 254 - 256, The `#nosec` G110 justification is wrong—timeout bounds time not bytes—so remove that comment and enforce a size limit when extracting: introduce a constant like MaxExtractedFileSize and wrap the source reader rc with io.LimitReader(rc, MaxExtractedFileSize) before calling io.Copy(dst, ...), then detect if the copy hit the limit (bytes == MaxExtractedFileSize) and return a clear error; update the io.Copy(dst, rc) call site and any surrounding error handling in this download/extract code to use the new limited reader and fail if the extracted file exceeds the configured maximum, or alternatively document the operational maximum if opting not to enforce it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/rdsdiff/compare.go`:
- Around line 432-433: Replace the ignored returns from os.ReadFile by capturing
the error for both oldContent and newContent (e.g. oldContent, err :=
os.ReadFile(oldPath) and newContent, err := os.ReadFile(newPath)), and handle
errors instead of discarding them: if err != nil return fmt.Errorf("reading %s:
%w", oldPath, err) (and similarly for newPath) or log the error with the package
logger and return; update the surrounding function (the compare code that calls
os.ReadFile) to propagate the error up the stack so failures reading the files
aren’t silently ignored.
In `@pkg/rdsdiff/download.go`:
- Around line 343-348: The tar path detection uses OS-specific
filepath.Separator when splitting entries (the code around variables first and
topLevel), which breaks on Windows; change the split to use the tar-standard
forward slash (e.g. use strings.SplitN(name, "/", 2) or otherwise split on '/')
so first is derived from the tar path correctly and topLevel detection (the
topLevel variable logic) works on all platforms.
In `@pkg/rdsdiff/storage.go`:
- Around line 56-57: The calls to filepath.Abs assigned to absWork and
absSession ignore errors; capture the returned errors (e.g., absWork, err :=
filepath.Abs(workDir) and absSession, err := filepath.Abs(sessionPath)) and
handle them rather than discarding—either return a wrapped error from the
current function (include context like "failed to get absolute path for
workDir/sessionPath") or log and fail appropriately; update the surrounding
function in pkg/rdsdiff/storage.go to check these err values before proceeding
to use filepath.Rel or other path operations.
---
Nitpick comments:
In `@pkg/rdsdiff/compare.go`:
- Around line 140-163: The current unmarshalMultiDocYAML function incorrectly
splits multi-document YAML by "\n---"; replace the ad-hoc splitting with a
proper YAML streaming decoder (use a yaml Decoder on the input bytes and call
Decode in a loop) so it correctly handles leading document markers (---),
markers with trailing content, and other edge cases; update
unmarshalMultiDocYAML to iterate Decode into map[string]any until io.EOF and
append each non-nil map to docs, falling back only if no documents were
produced.
In `@pkg/rdsdiff/download.go`:
- Around line 254-256: The `#nosec` G110 justification is wrong—timeout bounds
time not bytes—so remove that comment and enforce a size limit when extracting:
introduce a constant like MaxExtractedFileSize and wrap the source reader rc
with io.LimitReader(rc, MaxExtractedFileSize) before calling io.Copy(dst, ...),
then detect if the copy hit the limit (bytes == MaxExtractedFileSize) and return
a clear error; update the io.Copy(dst, rc) call site and any surrounding error
handling in this download/extract code to use the new limited reader and fail if
the extracted file exceeds the configured maximum, or alternatively document the
operational maximum if opting not to enforce it.
In `@pkg/rdsdiff/policygen.go`:
- Around line 41-46: The external command creation in the policy generation path
uses exec.Command without a timeout (see policyGeneratorPath, policyPath, cmd,
crsPathDir and policyFile) which can hang; replace exec.Command with
exec.CommandContext using a context.WithTimeout (pick a sensible timeout), defer
cancel the context, then run cmd.CombinedOutput() (or Start/Wait) so the child
is killed when the context expires and return an error that includes the
timeout/cancellation details and the command output.
In `@pkg/rdsdiff/storage.go`:
- Around line 49-54: Replace the brittle cleaned != sessionID check and the
OS-only separator test with a simpler, robust validation: first trim whitespace
(sessionID = strings.TrimSpace(sessionID)); then reject if
filepath.IsAbs(sessionID) or strings.Contains(sessionID, ".."); and reject if
sessionID contains any path separator by checking both '/' and
filepath.Separator (e.g., strings.ContainsAny(sessionID,
"/"+string(filepath.Separator))). Remove the reliance on filepath.Clean
comparison so valid but normalized IDs aren't incorrectly rejected.
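The storage.go validation proposed above can be sketched as a standalone check; `validateSessionID` and `errInvalidSessionID` are illustrative names, not the PR's actual identifiers:

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errInvalidSessionID = errors.New("invalid session ID")

// validateSessionID sketches the simplified rule the comment suggests:
// reject empty IDs, absolute paths, "..", and any path separator,
// without comparing against filepath.Clean output.
func validateSessionID(sessionID string) error {
	sessionID = strings.TrimSpace(sessionID)
	if sessionID == "" ||
		filepath.IsAbs(sessionID) ||
		strings.Contains(sessionID, "..") ||
		strings.ContainsAny(sessionID, "/"+string(filepath.Separator)) {
		return errInvalidSessionID
	}
	return nil
}

func main() {
	for _, id := range []string{"session-123", "../etc", "a/b", "/abs"} {
		fmt.Printf("%-14q -> %v\n", id, validateSessionID(id))
	}
}
```

Checking both `'/'` and `filepath.Separator` makes the rejection platform-independent, which is the gap the original OS-only check left open.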
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 791f272e-b11d-4437-8677-02096735b34d
📒 Files selected for processing (7)
- pkg/rdsdiff/compare.go
- pkg/rdsdiff/compare_test.go
- pkg/rdsdiff/constants.go
- pkg/rdsdiff/download.go
- pkg/rdsdiff/download_test.go
- pkg/rdsdiff/policygen.go
- pkg/rdsdiff/storage.go
Actionable comments posted: 2
♻️ Duplicate comments (1)
pkg/rdsdiff/download.go (1)
337-350: ⚠️ Potential issue | 🟠 Major

Tar root detection can still return the wrong directory.

`filepath.Clean(header.Name)` normalizes tar paths to OS separators before Line 346 splits on `/`, so the Windows case from the earlier review is still broken. Also, once a second top-level entry is seen, setting `topLevel = ""` is not sticky: a later entry can overwrite it and make `extractTarGz` return a bogus root.

Suggested fix
```diff
+import "path"
+
-	name := strings.TrimPrefix(filepath.Clean(header.Name), ".")
+	name := strings.TrimPrefix(path.Clean(header.Name), ".")
 	name = strings.TrimPrefix(name, "/")
@@
-	var topLevel string
+	var topLevel string
+	multipleTopLevels := false
@@
-	first := strings.SplitN(name, "/", 2)[0]
-	if topLevel == "" {
-		topLevel = first
-	} else if topLevel != first {
-		topLevel = ""
-	}
+	first := strings.SplitN(name, "/", 2)[0]
+	if topLevel == "" {
+		topLevel = first
+	} else if topLevel != first {
+		multipleTopLevels = true
+	}
@@
-	if topLevel != "" {
+	if topLevel != "" && !multipleTopLevels {
 		return filepath.Join(destDir, topLevel), nil
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/download.go` around lines 337 - 350, The tar root detection is broken because filepath.Clean normalizes to OS separators before splitting on '/' and because resetting topLevel to "" on a conflict can be overwritten later; change the normalization to use forward-slash-aware cleaning (e.g. use path.Clean or filepath.ToSlash on header.Name and then split on '/') so splitting uses '/' consistently across platforms, and make the "ambiguous" state sticky by tracking a boolean (e.g. topLevelAmbiguous) or only allowing topLevel to be set when it hasn't been marked ambiguous; update the logic around header.Name, filepath.Clean, safeJoinPath, the topLevel variable and the extractTarGz root-detection block so once a conflict sets ambiguity it cannot be later overwritten.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/rdsdiff/compare.go`:
- Around line 305-317: The YAML unmarshal logic in ComputeDiff currently
replaces failed parses with empty maps which can hide errors and produce a false
"no diff"; update ComputeDiff in pkg/rdsdiff/compare.go so that if either
sigsyaml.Unmarshal on oldData or newData fails (capture the error returned by
each call), you do not proceed with normalized map-based diffing but instead
fall back to computing a raw text diff of the original oldData/newData byte
slices (or their string forms) and return that result (preserving any error from
the diff operation) so callers see a meaningful diff when YAML normalization is
impossible; reference the existing local vars oldObj, newObj and the
sigsyaml.Unmarshal calls to implement the detection and fall-back behavior.
In `@pkg/rdsdiff/download.go`:
- Around line 257-259: Reject oversized archive entries instead of truncating or
streaming them unbounded: in the ZIP extraction path check f.UncompressedSize64
against MaxExtractedFileSize and return an error if it exceeds the limit (do not
rely on io.LimitReader + io.Copy to truncate), and in the TAR extraction path
(tar.TypeReg case) check header.Size before reading from rc and return an error
if header.Size > MaxExtractedFileSize; update the error messages where
io.Copy(dst, io.LimitReader(rc, MaxExtractedFileSize)) is used to reflect the
pre-check and ensure no partial writes occur to dst when sizes are exceeded.
---
Duplicate comments:
In `@pkg/rdsdiff/download.go`:
- Around line 337-350: The tar root detection is broken because filepath.Clean
normalizes to OS separators before splitting on '/' and because resetting
topLevel to "" on a conflict can be overwritten later; change the normalization
to use forward-slash-aware cleaning (e.g. use path.Clean or filepath.ToSlash on
header.Name and then split on '/') so splitting uses '/' consistently across
platforms, and make the "ambiguous" state sticky by tracking a boolean (e.g.
topLevelAmbiguous) or only allowing topLevel to be set when it hasn't been
marked ambiguous; update the logic around header.Name, filepath.Clean,
safeJoinPath, the topLevel variable and the extractTarGz root-detection block so
once a conflict sets ambiguity it cannot be later overwritten.
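The sticky top-level detection described above can be sketched against a plain list of entry names; `topLevelDir` is an illustrative helper over tar entry paths, not the PR's `extractTarGz`:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// topLevelDir sketches sticky tar-root detection: entries are normalized
// with path.Clean (always '/'-separated, unlike filepath.Clean on Windows),
// and once two distinct top-level names are seen the result stays
// ambiguous ("") no matter what comes later.
func topLevelDir(entries []string) string {
	top := ""
	ambiguous := false
	for _, name := range entries {
		name = strings.TrimPrefix(path.Clean(name), "/")
		if name == "" || name == "." {
			continue
		}
		first := strings.SplitN(name, "/", 2)[0]
		switch {
		case top == "":
			top = first
		case top != first:
			ambiguous = true // sticky: never reset by later entries
		}
	}
	if ambiguous {
		return ""
	}
	return top
}

func main() {
	fmt.Println(topLevelDir([]string{"repo-1.0/a.yaml", "repo-1.0/sub/b.yaml"}))
	fmt.Println(topLevelDir([]string{"a/x", "b/y", "a/z"}))
}
```

The key difference from the flagged code is the boolean flag: with the original reset-to-empty logic, the input `a/x, b/y, a/z` would incorrectly report `a` as the root.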
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 2a90bf8d-f12c-4c39-845e-2d5f5e16efb8
📒 Files selected for processing (4)
- pkg/rdsdiff/compare.go
- pkg/rdsdiff/download.go
- pkg/rdsdiff/policygen.go
- pkg/rdsdiff/storage.go
✅ Files skipped from review due to trivial changes (1)
- pkg/rdsdiff/storage.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/rdsdiff/policygen.go
```go
var oldObj, newObj map[string]any
if err := sigsyaml.Unmarshal(oldData, &oldObj); err != nil {
	oldObj = nil
}
if err := sigsyaml.Unmarshal(newData, &newObj); err != nil {
	newObj = nil
}
if oldObj == nil {
	oldObj = make(map[string]any)
}
if newObj == nil {
	newObj = make(map[string]any)
}
```
🧩 Analysis chain
🏁 Script executed:
# First, let's find the file and examine the compute.go to understand the function context
find . -name "compare.go" -path "*/rdsdiff/*" | head -20

Repository: openshift/kube-compare
Length of output: 90
🏁 Script executed:
# Get the file structure and size
wc -l pkg/rdsdiff/compare.go 2>/dev/null || find . -name "compare.go" -path "*rdsdiff*" -exec wc -l {} \;

Repository: openshift/kube-compare
Length of output: 92
🏁 Script executed:
# Read the function containing lines 305-317 to understand context
sed -n '280,330p' pkg/rdsdiff/compare.go

Repository: openshift/kube-compare
Length of output: 1512
🏁 Script executed:
# Find the computeDiff function signature and return type
rg "func.*computeDiff|func.*ComputeDiff" pkg/rdsdiff/ -A 5

Repository: openshift/kube-compare
Length of output: 1859
🏁 Script executed:
# Check how unmarshal failures are actually handled - look at the full function
sed -n '1,350p' pkg/rdsdiff/compare.go | tail -100

Repository: openshift/kube-compare
Length of output: 2564
Fall back to raw diff when YAML parsing fails to prevent false "no diff" results.
If both sigsyaml.Unmarshal calls fail, both sides become empty maps and ComputeDiff returns an empty diff, falsely indicating equality for malformed YAML files. Since ComputeDiff is exported and returns (string, error) without signaling parse errors, callers cannot detect this silent failure. Fall back to diffing the raw file contents when YAML normalization is impossible.
Suggested fix
var oldObj, newObj map[string]any
- if err := sigsyaml.Unmarshal(oldData, &oldObj); err != nil {
- oldObj = nil
- }
- if err := sigsyaml.Unmarshal(newData, &newObj); err != nil {
- newObj = nil
+ oldErr := sigsyaml.Unmarshal(oldData, &oldObj)
+ newErr := sigsyaml.Unmarshal(newData, &newObj)
+ if oldErr != nil || newErr != nil {
+ return unifiedDiff(string(oldData), string(newData), filepath.Base(oldPath), filepath.Base(newPath)), nil
}
if oldObj == nil {
oldObj = make(map[string]any)
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/compare.go` around lines 305 - 317, The YAML unmarshal logic in
ComputeDiff currently replaces failed parses with empty maps which can hide
errors and produce a false "no diff"; update ComputeDiff in
pkg/rdsdiff/compare.go so that if either sigsyaml.Unmarshal on oldData or
newData fails (capture the error returned by each call), you do not proceed with
normalized map-based diffing but instead fall back to computing a raw text diff
of the original oldData/newData byte slices (or their string forms) and return
that result (preserving any error from the diff operation) so callers see a
meaningful diff when YAML normalization is impossible; reference the existing
local vars oldObj, newObj and the sigsyaml.Unmarshal calls to implement the
detection and fall-back behavior.
```go
if _, err = io.Copy(dst, io.LimitReader(rc, MaxExtractedFileSize)); err != nil { // #nosec G110 -- size bounded by MaxExtractedFileSize
	return fmt.Errorf("copy %s: %w", f.Name, err)
}
```
🧩 Analysis chain
🏁 Script executed:
# First, find and read the download.go file to see the current implementation
find . -name "download.go" -path "*/rdsdiff/*" | head -5

Repository: openshift/kube-compare
Length of output: 91
🏁 Script executed:
# Get the file size to decide how to read it
wc -l pkg/rdsdiff/download.go

Repository: openshift/kube-compare
Length of output: 93
🏁 Script executed:
# Read the relevant sections of the file
sed -n '250,275p' pkg/rdsdiff/download.go

Repository: openshift/kube-compare
Length of output: 979
🏁 Script executed:
# Also read the tar extraction section around line 366-369
sed -n '360,380p' pkg/rdsdiff/download.go

Repository: openshift/kube-compare
Length of output: 633
🏁 Script executed:
# Search for MaxExtractedFileSize definition
rg "MaxExtractedFileSize\s*=" pkg/rdsdiff/

Repository: openshift/kube-compare
Length of output: 136
🏁 Script executed:
# Find the extractZipFile function
ast-grep --pattern $'func extractZipFile($_, $_) $_ {
$$$
}'

Repository: openshift/kube-compare
Length of output: 1470
🏁 Script executed:
# Find tar extraction function and search for tar.TypeReg handling
ast-grep --pattern $'case tar.TypeReg:
$$$'

Repository: openshift/kube-compare
Length of output: 48
🏁 Script executed:
# Search for the tar extraction context to see the full picture around line 366-369
sed -n '340,380p' pkg/rdsdiff/download.go

Repository: openshift/kube-compare
Length of output: 1129
🏁 Script executed:
# Look for the tar.TypeReg case handling
sed -n '355,375p' pkg/rdsdiff/download.go

Repository: openshift/kube-compare
Length of output: 762
The extraction size cap is not actually enforced.
On line 257, io.LimitReader truncates oversized zip entries and io.Copy still succeeds, so the library can diff partial files as if extraction was successful. On line 366, tar entries have no equivalent cap at all, so a crafted tarball can still fill the session volume. Reject entries larger than MaxExtractedFileSize instead of truncating or streaming them unbounded.
For zip files, pre-check f.UncompressedSize64 and return an error if it exceeds MaxExtractedFileSize. For tar files, check header.Size in the tar.TypeReg case before streaming.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/download.go` around lines 257 - 259, Reject oversized archive
entries instead of truncating or streaming them unbounded: in the ZIP extraction
path check f.UncompressedSize64 against MaxExtractedFileSize and return an error
if it exceeds the limit (do not rely on io.LimitReader + io.Copy to truncate),
and in the TAR extraction path (tar.TypeReg case) check header.Size before
reading from rc and return an error if header.Size > MaxExtractedFileSize;
update the error messages where io.Copy(dst, io.LimitReader(rc,
MaxExtractedFileSize)) is used to reflect the pre-check and ensure no partial
writes occur to dst when sizes are exceeded.
a083dd6 to f2c4478 (Compare)
Actionable comments posted: 3
♻️ Duplicate comments (2)
pkg/rdsdiff/download.go (1)
247-259: ⚠️ Potential issue | 🟠 Major

Reject oversized archive members instead of truncating them.
Line 257 and Line 366 use `io.LimitReader`, which turns an oversized ZIP/TAR member into a partial write that still looks successful. That means truncated YAML can enter the diff pipeline as if extraction succeeded. Pre-check `f.UncompressedSize64` and `header.Size`, and fail before creating the output file.

Proposed fix
```diff
 func extractZipFile(f *zip.File, destDirClean string) error {
+	if f.UncompressedSize64 > uint64(MaxExtractedFileSize) {
+		return fmt.Errorf("zip entry %s exceeds max size %d bytes", f.Name, MaxExtractedFileSize)
+	}
 	dstPath, err := safeJoinPath(destDirClean, f.Name)
 	if err != nil {
 		return err
 	}
@@
 	if err != nil {
 		return fmt.Errorf("create %s: %w", dstPath, err)
 	}
 	defer dst.Close()
-	if _, err = io.Copy(dst, io.LimitReader(rc, MaxExtractedFileSize)); err != nil { // #nosec G110 -- size bounded by MaxExtractedFileSize
+	if _, err = io.Copy(dst, rc); err != nil {
 		return fmt.Errorf("copy %s: %w", f.Name, err)
 	}
 	return nil
 }
@@
 	case tar.TypeReg:
+		if header.Size > MaxExtractedFileSize {
+			return "", fmt.Errorf("tar entry %s exceeds max size %d bytes", name, MaxExtractedFileSize)
+		}
 		if err := os.MkdirAll(filepath.Dir(dstPath), 0o750); err != nil {
 			return "", fmt.Errorf("mkdir parent of %s: %w", dstPath, err)
 		}
 		outFile, err := os.Create(dstPath) // #nosec G304 -- dstPath from safeJoinPath
 		if err != nil {
 			return "", fmt.Errorf("create %s: %w", dstPath, err)
 		}
-		if _, err = io.Copy(outFile, io.LimitReader(tr, MaxExtractedFileSize)); err != nil { // #nosec G110 -- size bounded by MaxExtractedFileSize
+		if _, err = io.Copy(outFile, tr); err != nil {
 			outFile.Close()
 			return "", fmt.Errorf("copy %s: %w", name, err)
 		}
```

Also applies to: 358-369
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/download.go` around lines 247 - 259: Before creating dstPath and copying bytes, validate the archive member size against MaxExtractedFileSize and return an error if it exceeds the limit instead of using io.LimitReader; specifically, check f.UncompressedSize64 for ZIP entries and header.Size for TAR entries (the places that currently call io.LimitReader when copying to dst in pkg/rdsdiff/download.go), and fail early prior to os.Create(dstPath) so truncated writes cannot occur.

pkg/rdsdiff/compare.go (1)
305-317: ⚠️ Potential issue | 🟠 Major

Fall back to a raw diff when YAML normalization cannot parse one side.
If both unmarshals fail, `oldObj` and `newObj` become empty maps and `ComputeDiff` returns an empty diff for two different invalid files. When either unmarshal fails, return a raw text diff of `oldData` vs `newData` instead of normalizing empty objects.

Proposed fix
```diff
 var oldObj, newObj map[string]any
-if err := sigsyaml.Unmarshal(oldData, &oldObj); err != nil {
-	oldObj = nil
-}
-if err := sigsyaml.Unmarshal(newData, &newObj); err != nil {
-	newObj = nil
-}
+oldErr := sigsyaml.Unmarshal(oldData, &oldObj)
+newErr := sigsyaml.Unmarshal(newData, &newObj)
+if oldErr != nil || newErr != nil {
+	return unifiedDiff(string(oldData), string(newData), filepath.Base(oldPath), filepath.Base(newPath)), nil
+}
 if oldObj == nil {
 	oldObj = make(map[string]any)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/compare.go` around lines 305 - 317, The YAML unmarshal currently silences errors and replaces failed parses with empty maps (oldObj/newObj), causing ComputeDiff to report no changes for two invalid files; instead, capture the unmarshal errors from sigsyaml.Unmarshal for oldData and newData (e.g., errOld, errNew) and if either err is non-nil, bypass normalization and return a raw text diff of oldData vs newData (use the existing raw diff helper or create one) rather than proceeding with empty maps and calling ComputeDiff. Ensure you reference and update the logic around oldObj, newObj and sigsyaml.Unmarshal in compare.go so the raw-diff path is taken whenever an unmarshal fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/rdsdiff/compare.go`:
- Around line 140-166: Change unmarshalMultiDocYAML to return ([]map[string]any,
error) and stop silently skipping parse failures: inside unmarshalMultiDocYAML
(function name) return a descriptive error when sigsyaml.Unmarshal fails for any
non-empty document part instead of continue; keep the existing single-document
fallback but return its error if that also fails. Update the caller(s) such as
ExtractCRs to handle the new error return (propagate or wrap and return) so
malformed YAML causes a visible error rather than producing incomplete results.
In `@pkg/rdsdiff/download.go`:
- Around line 381-397: ValidateConfigurationRoot currently only checks existence
of sourceCRS and crsPath; update it to verify they are directories. After
calling os.Stat for sourceCRS and crsPath (variables sourceCRS and crsPath in
function ValidateConfigurationRoot), inspect the returned FileInfo.IsDir() and
return a clear error (mentioning SourceCRSPath or CRSPath and the full path) if
either is not a directory; keep existing error handling for os.IsNotExist and
other stat errors.
- Around line 118-129: The code currently calls io.ReadAll(resp.Body) after
client.Get(archiveURL), which can OOM on large/untrusted archives; replace
buffering with a bounded stream or disk-backed approach: either wrap resp.Body
with io.LimitReader using a defined maxArchiveSize constant and return an error
if the limit is hit, or stream the response directly to a temporary file via
os.CreateTemp and io.Copy (then use that file for extraction), and remove the
io.ReadAll usage; apply the same change to the second occurrence where
io.ReadAll(resp.Body) is used so both download paths (the resp, client.Get and
resp.Body usages) enforce the total archive-size cap before
buffering/extraction.
---
Duplicate comments:
In `@pkg/rdsdiff/compare.go`:
- Around line 305-317: The YAML unmarshal currently silences errors and replaces
failed parses with empty maps (oldObj/newObj), causing ComputeDiff to report no
changes for two invalid files; instead, capture the unmarshal errors from
sigsyaml.Unmarshal for oldData and newData (e.g., errOld, errNew) and if either
err is non-nil, bypass normalization and return a raw text diff of oldData vs
newData (use the existing raw diff helper or create one) rather than proceeding
with empty maps and calling ComputeDiff. Ensure you reference and update the
logic around oldObj, newObj and sigsyaml.Unmarshal in compare.go so the raw-diff
path is taken whenever an unmarshal fails.
In `@pkg/rdsdiff/download.go`:
- Around line 247-259: Before creating dstPath and copying bytes, validate the
archive member size against MaxExtractedFileSize and return an error if it
exceeds the limit instead of using io.LimitReader; specifically, check
f.UncompressedSize64 for ZIP entries and header.Size for TAR entries (the places
that currently call io.LimitReader when copying to dst in
pkg/rdsdiff/download.go), and fail early prior to os.Create(dstPath) so
truncated writes cannot occur.
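The early size check can be sketched like this, with `maxExtractedFileSize` standing in for the package's `MaxExtractedFileSize` constant:

```go
package main

import "fmt"

// maxExtractedFileSize stands in for the package's MaxExtractedFileSize constant.
const maxExtractedFileSize uint64 = 100 << 20 // 100 MiB

// checkMemberSize rejects an archive member before any bytes are written,
// using the size declared in the header (f.UncompressedSize64 for zip
// entries, header.Size for tar entries).
func checkMemberSize(name string, declared uint64) error {
	if declared > maxExtractedFileSize {
		return fmt.Errorf("archive member %s declares %d bytes, exceeding the %d-byte limit",
			name, declared, maxExtractedFileSize)
	}
	return nil
}

func main() {
	fmt.Println(checkMemberSize("ok.yaml", 1024))
	fmt.Println(checkMemberSize("huge.bin", 1<<40))
}
```

Note that zip headers can understate the real uncompressed size, so keeping a limited copy as a second line of defense is still reasonable for zip entries; for tar, the reader yields at most `header.Size` bytes, so the up-front check is authoritative.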
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 98d02ff0-432e-461c-9168-e5086d07375a
📒 Files selected for processing (7)
- pkg/rdsdiff/compare.go
- pkg/rdsdiff/compare_test.go
- pkg/rdsdiff/constants.go
- pkg/rdsdiff/download.go
- pkg/rdsdiff/download_test.go
- pkg/rdsdiff/policygen.go
- pkg/rdsdiff/storage.go
✅ Files skipped from review due to trivial changes (2)
- pkg/rdsdiff/constants.go
- pkg/rdsdiff/download_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/rdsdiff/storage.go
- pkg/rdsdiff/policygen.go
```go
func unmarshalMultiDocYAML(data []byte) []map[string]any {
	var docs []map[string]any
	content := string(data)
	// Strip leading document marker so splitting on "\n---" works for all cases
	content = strings.TrimPrefix(content, "---\n")
	content = strings.TrimPrefix(content, "---\r\n")
	parts := strings.Split(content, "\n---")
	for _, part := range parts {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		var doc map[string]any
		if err := sigsyaml.Unmarshal([]byte(part), &doc); err != nil {
			continue
		}
		if doc != nil {
			docs = append(docs, doc)
		}
	}
	if len(docs) == 0 {
		var single map[string]any
		if err := sigsyaml.Unmarshal(data, &single); err == nil && single != nil {
			docs = []map[string]any{single}
		}
	}
	return docs
}
```
🧩 Analysis chain
🏁 Script executed:
# Locate the file and examine its structure
find . -name "compare.go" -path "*/rdsdiff/*" | head -20
Repository: openshift/kube-compare
Length of output: 90
🏁 Script executed:
# Get an overview of the file
wc -l pkg/rdsdiff/compare.go
Repository: openshift/kube-compare
Length of output: 92
🏁 Script executed:
# Examine the unmarshalMultiDocYAML function and surrounding context
sed -n '130,180p' pkg/rdsdiff/compare.go
Repository: openshift/kube-compare
Length of output: 1342
🏁 Script executed:
# Search for callers of unmarshalMultiDocYAML
rg "unmarshalMultiDocYAML" --type go -B 2 -A 2
Repository: openshift/kube-compare
Length of output: 549
🏁 Script executed:
# Examine ExtractCRs function to understand the flow
rg "ExtractCRs" --type go -B 2 -A 10
Repository: openshift/kube-compare
Length of output: 10005
🏁 Script executed:
# Check the computeDiff function mentioned in the scratchpad
rg "computeDiff" --type go -B 2 -A 15
Repository: openshift/kube-compare
Length of output: 3198
Return an error from unmarshalMultiDocYAML instead of silently skipping malformed documents.
When YAML parsing fails for a document, the function silently skips it with continue, causing ExtractCRs to succeed with incomplete results and produce false comparisons. Change the function signature to ([]map[string]any, error) and return an error when a document cannot be unmarshaled, then handle this error appropriately in the caller.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/compare.go` around lines 140 - 166, Change unmarshalMultiDocYAML
to return ([]map[string]any, error) and stop silently skipping parse failures:
inside unmarshalMultiDocYAML (function name) return a descriptive error when
sigsyaml.Unmarshal fails for any non-empty document part instead of continue;
keep the existing single-document fallback but return its error if that also
fails. Update the caller(s) such as ExtractCRs to handle the new error return
(propagate or wrap and return) so malformed YAML causes a visible error rather
than producing incomplete results.
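A runnable sketch of the propagate-instead-of-skip pattern the prompt asks for; `json.Unmarshal` stands in for `sigsyaml.Unmarshal` so the example needs only the standard library, and `unmarshalMultiDoc` is an illustrative name:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// unmarshalMultiDoc mirrors the multi-document splitting in compare.go but
// propagates decode failures instead of skipping them.
func unmarshalMultiDoc(data []byte) ([]map[string]any, error) {
	var docs []map[string]any
	content := strings.TrimPrefix(string(data), "---\n")
	for i, part := range strings.Split(content, "\n---") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue // blank separators are fine to skip
		}
		var doc map[string]any
		if err := json.Unmarshal([]byte(part), &doc); err != nil {
			// Return instead of continue so malformed input is visible.
			return nil, fmt.Errorf("unmarshal document %d: %w", i, err)
		}
		docs = append(docs, doc)
	}
	return docs, nil
}

func main() {
	good := []byte(`{"kind":"ConfigMap"}` + "\n---\n" + `{"kind":"Policy"}`)
	docs, err := unmarshalMultiDoc(good)
	fmt.Println(len(docs), err)

	bad := []byte(`{"kind":"ConfigMap"}` + "\n---\n" + `{broken`)
	_, err = unmarshalMultiDoc(bad)
	fmt.Println(err != nil)
}
```

Only genuinely empty document parts are skipped; any non-empty part that fails to decode aborts the whole extraction, which is what lets the caller surface the error.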
```go
	resp, err := client.Get(archiveURL) // nolint:noctx
	if err != nil {
		return "", fmt.Errorf("download %s: %w", archiveURL, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("download %s: status %s", archiveURL, resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("read archive: %w", err)
	}
```
Cap archive downloads before buffering them into memory.
Both download paths call io.ReadAll(resp.Body) on untrusted responses. A large archive can exhaust process memory before any per-file extraction limit applies. Please enforce a total archive-size cap before buffering, or stream to a temp file and extract from there.
Also applies to: 145-159
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/download.go` around lines 118 - 129, The code currently calls
io.ReadAll(resp.Body) after client.Get(archiveURL), which can OOM on
large/untrusted archives; replace buffering with a bounded stream or disk-backed
approach: either wrap resp.Body with io.LimitReader using a defined
maxArchiveSize constant and return an error if the limit is hit, or stream the
response directly to a temporary file via os.CreateTemp and io.Copy (then use
that file for extraction), and remove the io.ReadAll usage; apply the same
change to the second occurrence where io.ReadAll(resp.Body) is used so both
download paths (the resp, client.Get and resp.Body usages) enforce the total
archive-size cap before buffering/extraction.
```go
// ValidateConfigurationRoot checks that source-crs and CRSPath exist under root.
func ValidateConfigurationRoot(root string) error {
	sourceCRS := filepath.Join(root, SourceCRSPath)
	crsPath := filepath.Join(root, CRSPath)
	if _, err := os.Stat(sourceCRS); err != nil {
		if os.IsNotExist(err) {
			return fmt.Errorf("configuration root missing %s (path: %s)", SourceCRSPath, sourceCRS)
		}
		return fmt.Errorf("stat %s: %w", sourceCRS, err)
	}
	if _, err := os.Stat(crsPath); err != nil {
		if os.IsNotExist(err) {
			return fmt.Errorf("configuration root missing %s (path: %s)", CRSPath, crsPath)
		}
		return fmt.Errorf("stat %s: %w", crsPath, err)
	}
	return nil
}
```
Validate directory shape, not just path existence.
ValidateConfigurationRoot currently passes if source-crs or argocd/example/acmpolicygenerator is a regular file. The later workflow expects directories, so this should check info.IsDir() and fail here with a targeted validation error.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/download.go` around lines 381 - 397, ValidateConfigurationRoot
currently only checks existence of sourceCRS and crsPath; update it to verify
they are directories. After calling os.Stat for sourceCRS and crsPath (variables
sourceCRS and crsPath in function ValidateConfigurationRoot), inspect the
returned FileInfo.IsDir() and return a clear error (mentioning SourceCRSPath or
CRSPath and the full path) if either is not a directory; keep existing error
handling for os.IsNotExist and other stat errors.
f2c4478 to 1306b54 (Compare)
Actionable comments posted: 2
♻️ Duplicate comments (2)
pkg/rdsdiff/compare.go (2)
146-172: ⚠️ Potential issue | 🟠 Major — Don't silently skip malformed YAML documents.
The `continue` on Line 159 turns a bad policy file into a partial extraction, so the comparison can miss CRs without surfacing any error. `unmarshalMultiDocYAML` should return an error for any non-empty document that fails to decode, and `ExtractCRs` should propagate it.
Suggested fix

```diff
-func unmarshalMultiDocYAML(data []byte) []map[string]any {
+func unmarshalMultiDocYAML(data []byte) ([]map[string]any, error) {
 	var docs []map[string]any
 	content := string(data)
 	// Strip leading document marker so splitting on "\n---" works for all cases
 	content = strings.TrimPrefix(content, "---\n")
 	content = strings.TrimPrefix(content, "---\r\n")
 	parts := strings.Split(content, "\n---")
 	for _, part := range parts {
 		part = strings.TrimSpace(part)
 		if part == "" {
 			continue
 		}
 		var doc map[string]any
 		if err := sigsyaml.Unmarshal([]byte(part), &doc); err != nil {
-			continue
+			return nil, fmt.Errorf("unmarshal YAML document: %w", err)
 		}
 		if doc != nil {
 			docs = append(docs, doc)
 		}
 	}
 	if len(docs) == 0 {
 		var single map[string]any
-		if err := sigsyaml.Unmarshal(data, &single); err == nil && single != nil {
-			docs = []map[string]any{single}
+		if err := sigsyaml.Unmarshal(data, &single); err != nil {
+			return nil, fmt.Errorf("unmarshal YAML: %w", err)
+		}
+		if single != nil {
+			docs = []map[string]any{single}
 		}
 	}
-	return docs
+	return docs, nil
 }
```

```diff
-	docs := unmarshalMultiDocYAML(data)
+	docs, err := unmarshalMultiDocYAML(data)
+	if err != nil {
+		return fmt.Errorf("parse %s: %w", path, err)
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/compare.go` around lines 146 - 172, unmarshalMultiDocYAML currently swallows YAML unmarshal errors (the continue after sigsyaml.Unmarshal) which allows malformed documents to silently produce partial results; change unmarshalMultiDocYAML to return ([]map[string]any, error) and on any non-empty part that fails sigsyaml.Unmarshal return a descriptive error (including part/index) instead of continuing, and update the caller ExtractCRs to accept and propagate that error so malformed policy files surface as failures rather than partial extractions.
311-332: ⚠️ Potential issue | 🟠 Major — Fall back to raw diff when YAML normalization fails.
If either file is invalid YAML, Lines 312-323 collapse that side to an empty map. When both parses fail, `ComputeDiff` returns "", which is a false "no diff" result for two malformed but different files.
Suggested fix

```diff
 	var oldObj, newObj map[string]any
-	if err := sigsyaml.Unmarshal(oldData, &oldObj); err != nil {
-		oldObj = nil
-	}
-	if err := sigsyaml.Unmarshal(newData, &newObj); err != nil {
-		newObj = nil
-	}
+	oldErr := sigsyaml.Unmarshal(oldData, &oldObj)
+	newErr := sigsyaml.Unmarshal(newData, &newObj)
+	if oldErr != nil || newErr != nil {
+		return unifiedDiff(string(oldData), string(newData), filepath.Base(oldPath), filepath.Base(newPath)), nil
+	}
 	if oldObj == nil {
 		oldObj = make(map[string]any)
 	}
 	if newObj == nil {
 		newObj = make(map[string]any)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/compare.go` around lines 311 - 332, The current code drops a side to an empty map when sigsyaml.Unmarshal fails causing two malformed-but-different files to appear identical; change ComputeDiff (the block using sigsyaml.Unmarshal, normalizeYAML and unifiedDiff) to detect any unmarshal error and in that case skip YAML normalization and call unifiedDiff on the raw file contents (convert oldData/newData to strings) instead of using empty maps; keep using normalizeYAML/unifiedDiff only when both unmarshals succeed so that malformed YAML falls back to a raw diff rather than producing an empty no-diff result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/rdsdiff/download_test.go`:
- Around line 14-32: ParseGitHubTreeURL currently splits on slashes and treats
parts[3] as the ref which breaks when refs contain '/' — update
ParseGitHubTreeURL (pkg/rdsdiff/download.go) to parse the path with a regex that
allows the ref to contain slashes and an optional trailing path, e.g. match
^/([^/]+)/([^/]+)/tree/([^/]+(?:/[^/]+)*)(?:/(.*))?$ and assign Owner=group1,
Repo=group2, Branch=group3, Path=group4 (empty if missing); keep ArchiveURL()
and TopLevelDir() logic but base them on the captured Branch. Ensure tests
(TestParseGitHubTreeURL_* in pkg/rdsdiff/download_test.go) cover refs with
slashes.
- Around line 88-106: The tests for SessionDir should assert the created
directory is actually inside and created: in TestSessionDir_CreatesUnderWorkDir,
resolve absolute paths for workDir and sessionDir (use filepath.Abs) and assert
sessionDir has workDir as a prefix (not just contains), then call
os.Stat(sessionDir) and assert it exists and is a directory; in
TestSessionDir_UsesTempWhenEmpty, after calling SessionDir("", "req-2") defer
os.RemoveAll(sessionDir), perform os.Stat(sessionDir) and assert it exists and
is a directory to ensure the temp directory was created; reference the
SessionDir function and the two test names when making the changes.
---
Duplicate comments:
In `@pkg/rdsdiff/compare.go`:
- Around line 146-172: unmarshalMultiDocYAML currently swallows YAML unmarshal
errors (the continue after sigsyaml.Unmarshal) which allows malformed documents
to silently produce partial results; change unmarshalMultiDocYAML to return
([]map[string]any, error) and on any non-empty part that fails
sigsyaml.Unmarshal return a descriptive error (including part/index) instead of
continuing, and update the caller ExtractCRs to accept and propagate that error
so malformed policy files surface as failures rather than partial extractions.
- Around line 311-332: The current code drops a side to an empty map when
sigsyaml.Unmarshal fails causing two malformed-but-different files to appear
identical; change ComputeDiff (the block using sigsyaml.Unmarshal, normalizeYAML
and unifiedDiff) to detect any unmarshal error and in that case skip YAML
normalization and call unifiedDiff on the raw file contents (convert
oldData/newData to strings) instead of using empty maps; keep using
normalizeYAML/unifiedDiff only when both unmarshals succeed so that malformed
YAML falls back to a raw diff rather than producing an empty no-diff result.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 058e8cca-0b06-4e73-81ee-d48bea882604
📒 Files selected for processing (7)
- pkg/rdsdiff/compare.go
- pkg/rdsdiff/compare_test.go
- pkg/rdsdiff/constants.go
- pkg/rdsdiff/download.go
- pkg/rdsdiff/download_test.go
- pkg/rdsdiff/policygen.go
- pkg/rdsdiff/storage.go
✅ Files skipped from review due to trivial changes (3)
- pkg/rdsdiff/constants.go
- pkg/rdsdiff/storage.go
- pkg/rdsdiff/download.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/rdsdiff/compare_test.go
- pkg/rdsdiff/policygen.go
```go
func TestParseGitHubTreeURL_FullURL(t *testing.T) {
	u, err := ParseGitHubTreeURL("https://github.com/openshift-kni/telco-reference/tree/konflux-telco-core-rds-4-20/telco-ran/configuration")
	require.NoError(t, err)
	assert.Equal(t, "openshift-kni", u.Owner)
	assert.Equal(t, "telco-reference", u.Repo)
	assert.Equal(t, "konflux-telco-core-rds-4-20", u.Branch)
	assert.Equal(t, "telco-ran/configuration", u.Path)
	assert.Equal(t, "https://github.com/openshift-kni/telco-reference/archive/refs/heads/konflux-telco-core-rds-4-20.zip", u.ArchiveURL())
	assert.Equal(t, "telco-reference-konflux-telco-core-rds-4-20", u.TopLevelDir())
}

func TestParseGitHubTreeURL_BranchOnly(t *testing.T) {
	u, err := ParseGitHubTreeURL("https://github.com/org/repo/tree/main")
	require.NoError(t, err)
	assert.Equal(t, "org", u.Owner)
	assert.Equal(t, "repo", u.Repo)
	assert.Equal(t, "main", u.Branch)
	assert.Empty(t, u.Path)
}
```
Add coverage for refs that contain /.
ParseGitHubTreeURL in pkg/rdsdiff/download.go:57-88 currently takes parts[3] as the ref, so a valid tree URL like /tree/release/4.20/telco-ran/configuration is misparsed as Branch=release and Path=4.20/.... This suite only covers slash-free refs, so that bug stays invisible.
Suggested test to expose the bug
```diff
 func TestParseGitHubTreeURL_BranchOnly(t *testing.T) {
 	u, err := ParseGitHubTreeURL("https://github.com/org/repo/tree/main")
 	require.NoError(t, err)
 	assert.Equal(t, "org", u.Owner)
 	assert.Equal(t, "repo", u.Repo)
 	assert.Equal(t, "main", u.Branch)
 	assert.Empty(t, u.Path)
 }
+
+func TestParseGitHubTreeURL_BranchWithSlash(t *testing.T) {
+	u, err := ParseGitHubTreeURL("https://github.com/org/repo/tree/release/4.20/telco-ran/configuration")
+	require.NoError(t, err)
+	assert.Equal(t, "release/4.20", u.Branch)
+	assert.Equal(t, "telco-ran/configuration", u.Path)
+}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/download_test.go` around lines 14 - 32, ParseGitHubTreeURL
currently splits on slashes and treats parts[3] as the ref which breaks when
refs contain '/' — update ParseGitHubTreeURL (pkg/rdsdiff/download.go) to parse
the path with a regex that allows the ref to contain slashes and an optional
trailing path, e.g. match ^/([^/]+)/([^/]+)/tree/([^/]+(?:/[^/]+)*)(?:/(.*))?$
and assign Owner=group1, Repo=group2, Branch=group3, Path=group4 (empty if
missing); keep ArchiveURL() and TopLevelDir() logic but base them on the
captured Branch. Ensure tests (TestParseGitHubTreeURL_* in
pkg/rdsdiff/download_test.go) cover refs with slashes.
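One caveat worth noting: plain URL syntax cannot tell where a slashed ref ends and the in-repo path begins (GitHub resolves this against the repository's actual refs). The sketch below illustrates one way to handle that with a known-refs list plus the first-segment fallback; `splitTreeURL` and `splitRef` are hypothetical names, not functions from the PR:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitTreeURL extracts owner, repo, and everything after /tree/.
func splitTreeURL(raw string) (owner, repo, rest string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	parts := strings.SplitN(strings.Trim(u.Path, "/"), "/", 4)
	if len(parts) < 4 || parts[2] != "tree" {
		return "", "", "", fmt.Errorf("not a tree URL: %s", raw)
	}
	return parts[0], parts[1], parts[3], nil
}

// splitRef picks the longest known ref that prefixes rest; without such a
// list, the ref/path boundary in a slashed ref is ambiguous.
func splitRef(rest string, knownRefs []string) (ref, path string) {
	for _, r := range knownRefs {
		if (rest == r || strings.HasPrefix(rest, r+"/")) && len(r) > len(ref) {
			ref = r
			path = strings.TrimPrefix(strings.TrimPrefix(rest, r), "/")
		}
	}
	if ref == "" { // fall back to the first-segment heuristic
		ref, path, _ = strings.Cut(rest, "/")
	}
	return ref, path
}

func main() {
	_, _, rest, _ := splitTreeURL("https://github.com/org/repo/tree/release/4.20/telco-ran/configuration")
	ref, p := splitRef(rest, []string{"main", "release/4.20"})
	fmt.Println(ref, p) // release/4.20 telco-ran/configuration
}
```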
```go
func TestSessionDir_CreatesUnderWorkDir(t *testing.T) {
	workDir := t.TempDir()
	sessionDir, err := SessionDir(workDir, "req-1")
	require.NoError(t, err)
	assert.True(t, len(sessionDir) > len(workDir))
	assert.Contains(t, sessionDir, "rds-diff-req-1")

	info, err := os.Stat(sessionDir)
	require.NoError(t, err)
	assert.True(t, info.IsDir())
}

func TestSessionDir_UsesTempWhenEmpty(t *testing.T) {
	sessionDir, err := SessionDir("", "req-2")
	require.NoError(t, err)
	defer os.RemoveAll(sessionDir)
	assert.NotEmpty(t, sessionDir)
	assert.Contains(t, sessionDir, "rds-diff-req-2")
}
```
Strengthen the SessionDir assertions.
These tests mostly validate string shape. They would still pass if SessionDir returned a sibling path outside workDir, and the empty-workDir case never verifies that the directory was actually created.
Suggested assertion tightening
```diff
 func TestSessionDir_CreatesUnderWorkDir(t *testing.T) {
 	workDir := t.TempDir()
 	sessionDir, err := SessionDir(workDir, "req-1")
 	require.NoError(t, err)
-	assert.True(t, len(sessionDir) > len(workDir))
 	assert.Contains(t, sessionDir, "rds-diff-req-1")
+	rel, err := filepath.Rel(workDir, sessionDir)
+	require.NoError(t, err)
+	assert.NotContains(t, rel, "..")
 	info, err := os.Stat(sessionDir)
 	require.NoError(t, err)
 	assert.True(t, info.IsDir())
 }

 func TestSessionDir_UsesTempWhenEmpty(t *testing.T) {
 	sessionDir, err := SessionDir("", "req-2")
 	require.NoError(t, err)
 	defer os.RemoveAll(sessionDir)
 	assert.NotEmpty(t, sessionDir)
 	assert.Contains(t, sessionDir, "rds-diff-req-2")
+	info, err := os.Stat(sessionDir)
+	require.NoError(t, err)
+	assert.True(t, info.IsDir())
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/download_test.go` around lines 88 - 106, The tests for SessionDir
should assert the created directory is actually inside and created: in
TestSessionDir_CreatesUnderWorkDir, resolve absolute paths for workDir and
sessionDir (use filepath.Abs) and assert sessionDir has workDir as a prefix (not
just contains), then call os.Stat(sessionDir) and assert it exists and is a
directory; in TestSessionDir_UsesTempWhenEmpty, after calling SessionDir("",
"req-2") defer os.RemoveAll(sessionDir), perform os.Stat(sessionDir) and assert
it exists and is a directory to ensure the temp directory was created; reference
the SessionDir function and the two test names when making the changes.
Add a reusable library for comparing two versions of a telco RDS (Reference Design Specification). Given two downloadable RDS sources (GitHub tree URLs or direct archive links), the library handles the full workflow: download and extract archives, run the PolicyGenerator binary to produce policies, extract individual CRs from the generated ACM Policy documents, normalize and diff the CRs, and produce both a human-readable text report and a structured JSON comparison summary. This package is intended to be consumed by tools such as kube-compare-mcp, which can wrap it as a thin MCP tool handler without embedding domain logic. New package: pkg/rdsdiff/ Source files: - constants.go: telco-ran path constants (CRSPath, SourceCRSPath) and the SNO policy file list used by PolicyGenerator. - download.go: GitHub tree URL parsing (ParseGitHubTreeURL), archive download and extraction for both zip and tar.gz formats, with zip-slip prevention and top-level directory detection. - policygen.go: PolicyGenerator binary wrapper (RunPolicyGen) that copies source-crs into the expected layout and runs the generator for each SNO policy file. - compare.go: CR extraction from ACM Policy documents, YAML normalization with sorted keys, key correlation across versions, line-level unified diff using diffmatchpatch, and report/JSON generation (Run, RunCompare, ExtractCRs). - storage.go: session directory management with path traversal protection (SessionDir, SessionID, ResolveSessionPath). Test files: - compare_test.go: 24 tests covering CR extraction, normalization, correlation, diff computation, and end-to-end report generation. - download_test.go: 7 tests covering URL parsing, validation, download error handling, and session directory creation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Talor Itzhak <titzhak@redhat.com>
1306b54 to 72c0b75 (Compare)
Actionable comments posted: 2
🧹 Nitpick comments (2)
pkg/rdsdiff/compare_test.go (1)
430-435: Tighten the `ListOfCRsForSNO` contract check.
`assert.Contains(..., "acm-common-ranGen.yaml")` plus `assert.Len(..., 4)` is a weak signal: it can miss wrong replacements for the other entries, and it will also fail on a harmless addition. Prefer asserting the exact expected set if this list is part of the contract, or only the specific invariant you care about if the list is expected to evolve.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/compare_test.go` around lines 430 - 435, TestConstants currently checks ListOfCRsForSNO with assert.Contains and assert.Len which is brittle; update the test to assert the exact expected entries (or use assert.ElementsMatch to check set equality) against ListOfCRsForSNO so any unexpected changes are caught. Modify the TestConstants function to replace the assert.Contains and assert.Len checks on ListOfCRsForSNO with a single assertion that compares ListOfCRsForSNO to the expected slice (or uses ElementsMatch if order is not important).
pkg/rdsdiff/download.go (1)
179-186: Consider preserving extraction errors in the fallback path.
When format detection fails and both `extractZip` and `extractTarGz` are attempted, their error messages are discarded. The final error only mentions Content-Type, which may not help diagnose malformed archives.
♻️ Suggested improvement (note: capture the errors outside the `if` statements so they stay in scope for the final message)

```diff
-	if effectiveRoot, err := extractZip(data, destDir); err == nil {
+	effectiveRoot, zipErr := extractZip(data, destDir)
+	if zipErr == nil {
 		return effectiveRoot, nil
 	}
-	if effectiveRoot, err := extractTarGz(data, destDir); err == nil {
+	effectiveRoot, tarErr := extractTarGz(data, destDir)
+	if tarErr == nil {
 		return effectiveRoot, nil
 	}
-	return "", fmt.Errorf("unsupported archive format (expected zip or tar.gz); Content-Type: %q", ct)
+	return "", fmt.Errorf("unsupported archive format (expected zip or tar.gz); Content-Type: %q; zip: %v; tar.gz: %v", ct, zipErr, tarErr)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/rdsdiff/download.go` around lines 179 - 186, The fallback path discards errors from extractZip and extractTarGz, losing useful diagnostic info; update the logic in the function containing the current snippet to capture both errors (from extractZip and extractTarGz), and include them in the returned error (e.g., wrap or concatenate the two errors along with the Content-Type) so callers see why each extractor failed; reference extractZip and extractTarGz to locate the failing calls and modify the final fmt.Errorf to include those error details.
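A standard-library sketch of keeping both extractor failures via `errors.Join`; `tryZip`, `tryTarGz`, and `extractAny` are stand-ins for the real `extractZip`/`extractTarGz` and their caller:

```go
package main

import (
	"errors"
	"fmt"
)

// tryZip and tryTarGz stand in for extractZip/extractTarGz in download.go;
// here they always fail so the combined error path is exercised.
func tryZip(data []byte) (string, error)   { return "", errors.New("zip: not a valid zip archive") }
func tryTarGz(data []byte) (string, error) { return "", errors.New("tar.gz: gzip: invalid header") }

// extractAny keeps both failure reasons instead of discarding them
// in the fallback path.
func extractAny(data []byte, contentType string) (string, error) {
	root, zipErr := tryZip(data)
	if zipErr == nil {
		return root, nil
	}
	root, tarErr := tryTarGz(data)
	if tarErr == nil {
		return root, nil
	}
	return "", fmt.Errorf("unsupported archive format (Content-Type: %q): %w",
		contentType, errors.Join(zipErr, tarErr))
}

func main() {
	_, err := extractAny([]byte("junk"), "application/octet-stream")
	fmt.Println(err)
}
```

`errors.Join` (Go 1.20+) keeps both causes visible in `err.Error()` and reachable via `errors.Is`/`errors.As`, rather than flattening them into an opaque string.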
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/rdsdiff/compare_test.go`:
- Around line 411-428: The test TestRunCompare_ExtractsAndWritesArtifacts
currently only checks artifact existence; update it to assert that RunCompare
actually extracted per-CR outputs by verifying either that specific extracted CR
files exist under sessionDir (e.g. files named like
ConfigMap_openshift-monitoring_my-config.yaml and
ImageContentSourcePolicy_my-icsp.yaml) or by reading and unmarshalling
comparison.json and asserting it contains the expected CR keys (e.g.
"ConfigMap_openshift-monitoring_my-config" and
"ImageContentSourcePolicy_my-icsp"); ensure you reference RunCompare, sessionDir
and comparison.json when adding these assertions so the end-to-end extraction
path is covered.
In `@pkg/rdsdiff/storage.go`:
- Around line 16-30: The SessionDir function constructs a directory name using
the raw requestID which can include path separators or special entries and cause
nested or unexpected directories; update SessionDir to validate/sanitize
requestID (similar to ResolveSessionPath) by rejecting empty values, any path
separators ('/' or filepath.Separator, backslash), and path traversal tokens
like "..", returning an error if invalid, before formatting the session name and
creating the directory so only a single session folder is created.
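The sanitization described above can be sketched as a standalone check; `validateRequestID` is an illustrative name, not an identifier from the PR:

```go
package main

import (
	"fmt"
	"strings"
)

// validateRequestID rejects IDs that could escape the session root when
// joined into a directory name. The ".." check is deliberately strict and
// also rejects IDs that merely contain the token (e.g. "a..b").
func validateRequestID(id string) error {
	switch {
	case id == "":
		return fmt.Errorf("request ID must not be empty")
	case strings.ContainsAny(id, `/\`):
		return fmt.Errorf("request ID %q must not contain path separators", id)
	case strings.Contains(id, ".."):
		return fmt.Errorf("request ID %q must not contain traversal tokens", id)
	}
	return nil
}

func main() {
	for _, id := range []string{"req-1", "../etc", `a\b`, ""} {
		fmt.Printf("%q -> %v\n", id, validateRequestID(id))
	}
}
```

Rejecting both `/` and `\` matters because `filepath.Separator` differs by OS, while user input may contain either.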
---
Nitpick comments:
In `@pkg/rdsdiff/compare_test.go`:
- Around line 430-435: TestConstants currently checks ListOfCRsForSNO with
assert.Contains and assert.Len which is brittle; update the test to assert the
exact expected entries (or use assert.ElementsMatch to check set equality)
against ListOfCRsForSNO so any unexpected changes are caught. Modify the
TestConstants function to replace the assert.Contains and assert.Len checks on
ListOfCRsForSNO with a single assertion that compares ListOfCRsForSNO to the
expected slice (or uses ElementsMatch if order is not important).
In `@pkg/rdsdiff/download.go`:
- Around line 179-186: The fallback path discards errors from extractZip and
extractTarGz, losing useful diagnostic info; update the logic in the function
containing the current snippet to capture both errors (from extractZip and
extractTarGz), and include them in the returned error (e.g., wrap or concatenate
the two errors along with the Content-Type) so callers see why each extractor
failed; reference extractZip and extractTarGz to locate the failing calls and
modify the final fmt.Errorf to include those error details.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 70668fce-ec32-4cbe-9ad0-1f7d02fefa5d
📒 Files selected for processing (8)
- pkg/rdsdiff/compare.go
- pkg/rdsdiff/compare_test.go
- pkg/rdsdiff/constants.go
- pkg/rdsdiff/download.go
- pkg/rdsdiff/download_test.go
- pkg/rdsdiff/fileio.go
- pkg/rdsdiff/policygen.go
- pkg/rdsdiff/storage.go
✅ Files skipped from review due to trivial changes (3)
- pkg/rdsdiff/fileio.go
- pkg/rdsdiff/constants.go
- pkg/rdsdiff/download_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/rdsdiff/policygen.go
- pkg/rdsdiff/compare.go
```go
func TestRunCompare_ExtractsAndWritesArtifacts(t *testing.T) {
	root := t.TempDir()
	oldGen := filepath.Join(root, "old-gen")
	newGen := filepath.Join(root, "new-gen")
	require.NoError(t, os.MkdirAll(oldGen, 0o750))
	require.NoError(t, os.MkdirAll(newGen, 0o750))
	require.NoError(t, os.WriteFile(filepath.Join(oldGen, "p.yaml"), []byte(minimalPolicyYAML), 0o600))
	require.NoError(t, os.WriteFile(filepath.Join(newGen, "p.yaml"), []byte(minimalPolicyYAML), 0o600))

	sessionDir := filepath.Join(root, "session")
	require.NoError(t, os.MkdirAll(sessionDir, 0o750))

	summary, err := RunCompare(oldGen, newGen, sessionDir)
	require.NoError(t, err)
	assert.Contains(t, summary, "Comparison summary")
	assert.FileExists(t, filepath.Join(sessionDir, "diff-report.txt"))
	assert.FileExists(t, filepath.Join(sessionDir, "comparison.json"))
}
```
Assert the extracted CR outputs here, not just artifact existence.
This test can still pass if RunCompare skips the extraction step and compares the raw generated p.yaml files directly. Please assert at least one extracted per-CR file under sessionDir or inspect comparison.json for the expected CR keys (for example ConfigMap_openshift-monitoring_my-config and ImageContentSourcePolicy_my-icsp) so the end-to-end path is actually covered.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/compare_test.go` around lines 411 - 428, The test
TestRunCompare_ExtractsAndWritesArtifacts currently only checks artifact
existence; update it to assert that RunCompare actually extracted per-CR outputs
by verifying either that specific extracted CR files exist under sessionDir
(e.g. files named like ConfigMap_openshift-monitoring_my-config.yaml and
ImageContentSourcePolicy_my-icsp.yaml) or by reading and unmarshalling
comparison.json and asserting it contains the expected CR keys (e.g.
"ConfigMap_openshift-monitoring_my-config" and
"ImageContentSourcePolicy_my-icsp"); ensure you reference RunCompare, sessionDir
and comparison.json when adding these assertions so the end-to-end extraction
path is covered.
```go
func SessionDir(workDir, requestID string) (string, error) {
	if workDir == "" {
		workDir = os.TempDir()
	}
	workDir = filepath.Clean(workDir)
	if err := os.MkdirAll(workDir, 0o750); err != nil {
		return "", fmt.Errorf("mkdir work dir: %w", err)
	}
	name := fmt.Sprintf("rds-diff-%s-%d", requestID, time.Now().Unix())
	sessionPath := filepath.Join(workDir, name)
	if err := os.MkdirAll(sessionPath, 0o750); err != nil {
		return "", fmt.Errorf("mkdir session: %w", err)
	}
	return sessionPath, nil
}
```
Unsanitized requestID can create unexpected directory structures.
requestID is used directly in the session directory name without validation. If it contains path separators (e.g., foo/bar) or special characters, os.MkdirAll will create nested directories rather than a single session folder. This inconsistency with the strict validation in ResolveSessionPath could cause confusion or allow unintended directory layouts.
Consider applying similar validation (reject /, \, .., or empty values) to requestID before constructing the session name.
🛡️ Proposed fix

```diff
 func SessionDir(workDir, requestID string) (string, error) {
 	if workDir == "" {
 		workDir = os.TempDir()
 	}
 	workDir = filepath.Clean(workDir)
+	requestID = strings.TrimSpace(requestID)
+	if requestID == "" || strings.ContainsAny(requestID, `/\`) || strings.Contains(requestID, "..") {
+		return "", errors.New("invalid request ID")
+	}
 	if err := os.MkdirAll(workDir, 0o750); err != nil {
 		return "", fmt.Errorf("mkdir work dir: %w", err)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rdsdiff/storage.go` around lines 16 - 30, The SessionDir function
constructs a directory name using the raw requestID which can include path
separators or special entries and cause nested or unexpected directories; update
SessionDir to validate/sanitize requestID (similar to ResolveSessionPath) by
rejecting empty values, any path separators ('/' or filepath.Separator,
backslash), and path traversal tokens like "..", returning an error if invalid,
before formatting the session name and creating the directory so only a single
session folder is created.
@Tal-or: all tests passed! Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
For a new feature like this, can we add a markdown document to the docs/ showing how and why this will be used, maybe with examples?

Sure thing, I'll add that, Thanks!
The concern here is that we don't want to introduce anything telco-specific (or RDS-specific, since that is telco-specific), as this is intended to be a generic tool. If this is a generic feature, a document explaining its purpose, examples, and how it works with the existing features of this tool would be a great way to understand it.
Add a reusable library for comparing two versions of a telco RDS (Reference Design Specification). Given two downloadable RDS sources (GitHub tree URLs or direct archive links), the library handles the full workflow: download and extract archives, run the PolicyGenerator binary to produce policies, extract individual CRs from the generated ACM Policy documents, normalize and diff the CRs, and produce both a human-readable text report and a structured JSON comparison summary.
This package is intended to be consumed by tools such as kube-compare-mcp, which can wrap it as a thin MCP tool handler without embedding domain logic.
NOTE:
At the moment, the tool can perform diffs only for the telco-ran RDS.
Signed-off-by: Talor Itzhak titzhak@redhat.com
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com