Skip to content

ci: Add emulator tests (and related changes) for stalled API calls to control-client#4566

Open
gargnitingoogle wants to merge 11 commits intomasterfrom
gargnitin/test-ctrl-client-stall-retries/v1
Open

ci: Add emulator tests (and related changes) for stalled API calls to control-client#4566
gargnitingoogle wants to merge 11 commits intomasterfrom
gargnitin/test-ctrl-client-stall-retries/v1

Conversation

@gargnitingoogle
Copy link
Copy Markdown
Contributor

@gargnitingoogle gargnitingoogle commented Apr 2, 2026

Description

Link to the issue in case of a bug fix.

b/493734599

Testing details

  1. Manual - Tested locally by creating a local docker image using feat: fix RenameFolder signature and implementation googleapis/storage-testbench#786 and using that at
    DEFAULT_IMAGE_NAME='gcr.io/cloud-devrel-public-resources/storage-testbench'
    .
  2. Unit tests - NA
  3. Integration tests - To be run as presubmit once the update storage-testbench docker image is uploaded after feat: fix RenameFolder signature and implementation googleapis/storage-testbench#786 is merged.

Any backward incompatible change? If so, please explain.

@gargnitingoogle gargnitingoogle changed the title CI: Add emulator tests (and related changes) for stalled API calls to control-client c: Add emulator tests (and related changes) for stalled API calls to control-client Apr 2, 2026
@gargnitingoogle gargnitingoogle changed the title c: Add emulator tests (and related changes) for stalled API calls to control-client ci: Add emulator tests (and related changes) for stalled API calls to control-client Apr 2, 2026
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/test-ctrl-client-stall-retries/v1 branch from 6c26d78 to e4ea18d Compare April 6, 2026 09:35
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.67%. Comparing base (9b7b672) to head (fc30497).

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #4566       +/-   ##
===========================================
+ Coverage        0   83.67%   +83.67%     
===========================================
  Files           0      164      +164     
  Lines           0    20213    +20213     
===========================================
+ Hits            0    16913    +16913     
- Misses          0     2663     +2663     
- Partials        0      637      +637     
Flag Coverage Δ
unittests 83.67% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/test-ctrl-client-stall-retries/v1 branch from e4ea18d to 99d5068 Compare April 8, 2026 09:15
- Added integration test suite to verify experimental stall retry logic for CreateFolder, GetFolder, DeleteFolder, RenameFolder, and GetStorageLayout.
- Refactored test setup to utilize individual suite structures per API method, enabling isolated proxy config dynamic loading.
- Validated that operations successfully complete within the 31s client timeout.
- Validated that GetStorageLayout correctly completes the mount under 40s despite an artificial 60s stall limit specifically targeting its initial call.
- Excluded ListFolders as GCSFuse internally relies on ListObjects instead of ListFolders for folder iteration.
- Cleaned up unused multi-method proxy configurations.
- Cleaned up setup_test.go by moving global mounting state into the test suite struct to guarantee test isolation.
- Updated to use non-deprecated metadata cache TTL flags.
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/test-ctrl-client-stall-retries/v1 branch from 3b9b247 to 77076f7 Compare April 9, 2026 06:25
Comment thread internal/storage/storage_handle.go
// Disable HNS to prevent gRPC Control Client initialization.
// Legacy emulator proxy servers run on HTTP/1.1, which causes the
// gRPC HTTP/2 dialer to crash the mount sequence.
"--enable-hns=false",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to keep the test same as before.

// Disable HNS to prevent gRPC Control Client initialization.
// Legacy emulator proxy servers run on HTTP/1.1, which causes the
// gRPC HTTP/2 dialer to crash the mount sequence.
"--enable-hns=false",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to keep the test same as before.

// Disable HNS to prevent gRPC Control Client initialization.
// Legacy emulator proxy servers run on HTTP/1.1, which causes the
// gRPC HTTP/2 dialer to crash the mount sequence.
"--enable-hns=false",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to keep the test same as before.

flagSets := [][]string{
{},
{"--chunk-transfer-timeout-secs=5"},
{"--enable-hns=false"},
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to keep the test same as before.

Comment thread tools/integration_tests/emulator_tests/emulator_tests.sh
if err := validateGRPCMetadata(md, validations); err != nil {
log.Printf("Metadata validation failed: %v", err)
return err
// Only validate headers for the main Storage API, not Control API
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is to enable grpc proxy_server to support metadata instructions such as stall.

@gargnitingoogle gargnitingoogle marked this pull request as ready for review April 9, 2026 06:39
@gargnitingoogle gargnitingoogle requested review from a team and meet2mky as code owners April 9, 2026 06:39
@github-actions github-actions bot added the remind-reviewers Auto remind reviewers in attention set for review post 24hrs of inactivity on PR. label Apr 9, 2026
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces robust testing capabilities for the control-client by implementing emulator tests that simulate stalled API calls. These changes ensure that the experimental retry logic for folder operations and storage layout retrieval functions correctly under adverse network conditions. Additionally, the infrastructure for the gRPC proxy server was enhanced to support instruction injection, and existing tests were updated to maintain stability during the transition to these new testing patterns.

Highlights

  • Emulator Test Infrastructure: Added comprehensive emulator tests to verify the retry logic for stalled API calls in the control-client, utilizing the updated storage-testbench.
  • Control Client Configuration: Updated the storage handle initialization to enable the control client even when using local endpoints, ensuring consistent behavior for testing.
  • Proxy Server Enhancements: Updated the gRPC proxy server to support injecting testbench instructions via metadata, allowing for the simulation of stalled API calls.
  • Test Suite Adjustments: Disabled Hierarchical Namespace (HNS) in existing emulator tests to prevent compatibility issues with the legacy HTTP/1.1 proxy server.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gargnitingoogle
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enables the creation of the storage control client when using a custom endpoint on localhost, which was previously restricted. It introduces a comprehensive suite of integration tests to verify the experimental retry logic for stalled gRPC control client operations (CreateFolder, GetFolder, DeleteFolder, RenameFolder, and GetStorageLayout) using a proxy server to induce delays. Additionally, the gRPC proxy server is updated to support instruction injection for the testbench, and existing emulator tests are updated to explicitly disable HNS where the legacy proxy server's HTTP/1.1 limitation would cause issues. I have no feedback to provide.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces integration tests for the GCSFuse control client, specifically focusing on handling stalled gRPC responses for folder operations and storage layout queries. It includes new test configurations, a test suite in control_client_stall_test.go, and updates to the emulator test script to support HNS buckets. Additionally, the gRPC proxy was updated to inject emulator instructions for the control API. Feedback includes identifying unused variables: forcedStallTime in the test base struct and TEST_TARGET in the shell script, both of which are defined but never utilized in the execution logic.

Comment thread tools/integration_tests/emulator_tests/emulator_tests.sh
- method: CreateFolder
retryInstruction: "stall-for-40s"
retryCount: 1
skipCount: 2
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is skipCcount 2here and not everywhere?


assert.NoError(c.T(), err)
// Ensure the CreateFolder operation took less than 32 seconds (30 seconds first stalled-call timeout + 2 second wiggle-room) because of the internal abortion and retry.
assert.True(c.T(), elapsedTime < 32*time.Second, "Elapsed time %v should be less than 32s", elapsedTime)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create a const for 32s, we can tell why? Also would be good to link the hardcoded initial stall timeout (30s).

@github-actions
Copy link
Copy Markdown

Hi @meet2mky, your feedback is needed to move this pull request forward. This automated reminder was triggered because there has been no activity for over 24 hours. Please provide your input when you have a moment. Thank you!

3 similar comments
@github-actions
Copy link
Copy Markdown

Hi @meet2mky, your feedback is needed to move this pull request forward. This automated reminder was triggered because there has been no activity for over 24 hours. Please provide your input when you have a moment. Thank you!

@github-actions
Copy link
Copy Markdown

Hi @meet2mky, your feedback is needed to move this pull request forward. This automated reminder was triggered because there has been no activity for over 24 hours. Please provide your input when you have a moment. Thank you!

@github-actions
Copy link
Copy Markdown

Hi @meet2mky, your feedback is needed to move this pull request forward. This automated reminder was triggered because there has been no activity for over 24 hours. Please provide your input when you have a moment. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

remind-reviewers Auto remind reviewers in attention set for review post 24hrs of inactivity on PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants