💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b26f19a19
```go
_, bucketName, key, err = parseAzureURL(rawURL)
if err != nil {
	return "", "", "", fmt.Errorf("Azure: %w", err)
}
return "azblob://" + bucketName, bucketName, key, nil
```
Preserve Azure account when converting HTTPS blob URLs
The Azure branch in toCDKBucketURL discards the parsed storage account (`_, bucketName, key, err = parseAzureURL(...)`) and always opens azblob://<container>. For HTTPS inputs like https://myacct.blob.core.windows.net/container/blob, this drops the only account identifier from the URL, so HeadObject/Download now depend on external env config (AZURE_STORAGE_ACCOUNT) and can fail (or target the wrong account) even though the URL itself is fully qualified. The previous implementation accepted account-qualified HTTPS URLs without that extra requirement.
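A stdlib-only sketch of the fix this comment implies (all names here are illustrative, not the PR's actual `parseAzureURL`/`toCDKBucketURL`): for an account-qualified HTTPS blob URL, carry the account into the `azblob://` bucket URL as a `storage_account` query parameter instead of dropping it.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// toAzureBucketURL is a hypothetical helper: it extracts the storage account
// from https://<acct>.blob.core.windows.net/<container>/<key> and preserves it
// in the resulting azblob:// bucket URL rather than relying on the
// AZURE_STORAGE_ACCOUNT environment variable being set.
func toAzureBucketURL(rawURL string) (bucketURL, container, key string, err error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", "", "", fmt.Errorf("Azure: %w", err)
	}
	account, ok := strings.CutSuffix(u.Host, ".blob.core.windows.net")
	if !ok || account == "" {
		return "", "", "", fmt.Errorf("Azure: not an account-qualified blob URL: %s", rawURL)
	}
	container, key, ok = strings.Cut(strings.TrimPrefix(u.Path, "/"), "/")
	if !ok || key == "" {
		return "", "", "", fmt.Errorf("Azure: missing object key in %s", rawURL)
	}
	// Keep the account as a query parameter so the bucket URL stays
	// self-describing.
	q := url.Values{}
	q.Set("storage_account", account)
	return "azblob://" + container + "?" + q.Encode(), container, key, nil
}

func main() {
	b, c, k, err := toAzureBucketURL("https://myacct.blob.core.windows.net/container/path/to/blob")
	fmt.Println(b, c, k, err)
	// prints: azblob://container?storage_account=myacct container path/to/blob <nil>
}
```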
Pull request overview
This PR migrates cloud object metadata (“HEAD”) and object downloading to use the gocloud.dev/blob abstraction, aligning S3/GCS/Azure handling behind a single bucket API for the experimental branch.
Changes:
- Replaced provider-specific HEAD implementations with a unified `HeadObject` that opens buckets via `gocloud.dev/blob` and reads `Attributes`.
- Updated downloader object reading to use `gocloud.dev/blob` bucket readers instead of provider-specific HTTP clients.
- Refactored tests toward the new bucket-opening approach and added a failure-path test for bucket open errors.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| cloud/head.go | Implements `HeadObject` via `gocloud.dev/blob` and builds provider-specific bucket URLs. |
| cloud/downloader.go | Switches object reads to `gocloud.dev/blob` and ensures bucket lifecycle is tied to the reader. |
| cloud/head_test.go | Updates tests to call `HeadObject` and begins moving away from bespoke mock HTTP servers. |
```go
	"encoding/json"
	"encoding/pem"
	"fmt"
	"net/http"
```
The import list was reduced, but this test file still references identifiers that now have no import/definition (e.g., base64/json in TestSignAzureSharedKey/buildGCSServiceAccountJSON, and azureBlobAPIVersion/signAzureSharedKey which were removed from the package). As-is, the tests won’t compile; either remove those obsolete tests/helpers or restore the needed implementations/imports.
```go
func TestHeadGCS(t *testing.T) {
	// Generate a 2048-bit RSA key for the fake service account JWT signing.
	// The google auth library requires a valid RSA key; key size is not validated.
	privKey, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		t.Fatalf("generate RSA key: %v", err)
	}
	privPEM := string(pem.EncodeToMemory(&pem.Block{
		Type:  "RSA PRIVATE KEY",
		Bytes: x509.MarshalPKCS1PrivateKey(privKey),
	}))

	// Single mock server serves both the OAuth2 token endpoint (/token) and
	// the GCS JSON API (/storage/v1/b/{bucket}/o/{key}).
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		if r.URL.Path == "/token" {
			// Fake OAuth2 JWT bearer token response.
			json.NewEncoder(w).Encode(map[string]any{
				"access_token": "fake-gcs-access-token",
				"token_type":   "Bearer",
				"expires_in":   3600,
			})
			return
		}
		// GCS JSON API: return fake object metadata.
		json.NewEncoder(w).Encode(gcsObjectResponse{
			Bucket:      "gcs-bucket",
			Name:        "path/to/data.bin",
			Size:        "1024",
			ETag:        "gcs-etag-xyz",
			ContentType: "application/octet-stream",
			Updated:     time.Date(2026, 3, 24, 12, 0, 0, 0, time.UTC),
			Metadata:    map[string]string{"project": "calypr", "sha256": "cafebabe"},
		})
	}))
	defer srv.Close()

	// Write service account JSON to a temp file; point credentials at it.
	credFile, err := os.CreateTemp("", "gcs-test-creds-*.json")
	if err != nil {
		t.Fatalf("create temp cred file: %v", err)
	}
	t.Cleanup(func() { os.Remove(credFile.Name()) })
	if _, err := credFile.Write(buildGCSServiceAccountJSON(t, srv.URL+"/token", privPEM)); err != nil {
		t.Fatalf("write cred file: %v", err)
	}
	credFile.Close()

	t.Setenv(GCSCredentialsEnvVar, credFile.Name())
	const content = "gcs object content"
	injectBucket(t, "path/to/data.bin", []byte(content), "application/octet-stream", map[string]string{
		"project": "calypr",
		"sha256":  "cafebabe",
	})
```
injectBucket is called in these tests but is not defined anywhere in the package, so the test suite won’t compile. Add a helper (likely using memblob + temporarily overriding openBucket) or switch the tests back to the previous mock-server approach.
```go
	}
}

func TestDownloadToTempFromDRSObject_Errors(t *testing.t) {
```
Typo in the test signature: *testing.t should be *testing.T. The current code does not compile.
```diff
-func TestDownloadToTempFromDRSObject_Errors(t *testing.t) {
+func TestDownloadToTempFromDRSObject_Errors(t *testing.T) {
```
```go
_, bucketName, key, err = parseAzureURL(rawURL)
if err != nil {
	return "", "", "", fmt.Errorf("Azure: %w", err)
}
return "azblob://" + bucketName, bucketName, key, nil
```
For Azure HTTPS URLs, parseAzureURL can extract the storage account name, but toCDKBucketURL discards it and always returns azblob://<container>. This breaks cases where callers rely on the account embedded in the URL (without setting AZURE_STORAGE_ACCOUNT). Include the parsed account in the bucket URL (e.g., azblob://<container>?storage_account=<acct>) when available.
```diff
-	_, bucketName, key, err = parseAzureURL(rawURL)
-	if err != nil {
-		return "", "", "", fmt.Errorf("Azure: %w", err)
-	}
-	return "azblob://" + bucketName, bucketName, key, nil
+	account, bkt, k, parseErr := parseAzureURL(rawURL)
+	if parseErr != nil {
+		return "", "", "", fmt.Errorf("Azure: %w", parseErr)
+	}
+	bucketName, key = bkt, k
+	params := url.Values{}
+	if account != "" {
+		params.Set("storage_account", account)
+	}
+	bucketURL = "azblob://" + bucketName
+	if len(params) > 0 {
+		bucketURL += "?" + params.Encode()
+	}
+	return bucketURL, bucketName, key, nil
```
adds cloud-dev library to experimental branch