cdn_bench: multi-instance parser and scientific notation fix #577
Open
SamirFarhat17 wants to merge 3 commits into facebookresearch:v2-beta
…#575)
Summary: Added health checks to cdn_bench/run.sh so it fails fast with a clear message instead of silently misbehaving:
- verify_content_server() probes each content_server with curl after startup to confirm it's actually serving, not just listening
- Backend Reachability Check runs before starting proxies; if any backend is unreachable it prints the exact curl command to debug and aborts
- Fixed IPv6 host:port parsing (was using ${entry%:*} which strips everything after the first colon, now extracts port first then strips it)
Reviewed By: YifanYuan3
Differential Revision: D100220256
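The "extract port first, then strip it" idea from the IPv6 fix can be sketched in Python. This is only an analogy: run.sh is a shell script, and the function name `split_host_port` and the optional square brackets around the address are assumptions, not taken from the commit.

```python
def split_host_port(entry):
    """Split 'host:port' on the LAST colon so IPv6 addresses stay intact."""
    # rpartition splits at the final ':'; everything before it is the host.
    host, sep, port = entry.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError(f"expected host:port, got {entry!r}")
    # Assumption: bracketed IPv6 literals such as [::1]:8080 may appear.
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    return host, int(port)
```

Taking the port from the right-hand side means the colons inside an IPv6 address never take part in the split.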
…facebookresearch#576)
Summary: Follow-up improvements to cdn_bench run.sh for operational ergonomics.
- Graceful proxy shutdown: sends SIGINT instead of SIGTERM so proxygen flushes its metrics summary before exit, with a 2s grace period before SIGKILL
- Auto-terminate for server and proxy roles when -d (duration) is passed, so long-running roles exit automatically after the client finishes (server: +20s grace, proxy: +10s grace)
- Client-side proxy reachability check: verifies all proxy targets respond (HTTP/1.1 and h2) before sending traffic, aborts with diagnostic info if unreachable
- Proxy stderr tee: proxy_server stderr is now tee'd to both terminal and file so metrics are visible during the run
- Changed metrics_interval from 0 to 5 for periodic metrics output during the run
- Minor quoting fix for IPv6 host:port variable expansion
Differential Revision: D100630922
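The SIGINT-then-SIGKILL shutdown described in this commit can be sketched in Python as follows. The real logic lives in run.sh; the function name `stop_gracefully` and the stand-in child process are hypothetical, and only the 2s grace period comes from the commit message.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, grace_seconds=2.0):
    """Send SIGINT, wait up to grace_seconds, then fall back to SIGKILL."""
    if proc.poll() is not None:
        return proc.returncode  # already exited
    # SIGINT gives the process a chance to flush its summary before exiting.
    proc.send_signal(signal.SIGINT)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()


# Usage with a stand-in child process (not proxy_server itself).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_gracefully(child)
```

The grace period bounds how long the caller waits, so a process that ignores SIGINT still gets cleaned up.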
Summary:
Two fixes found while running multi-proxy CDN benchmarks on Gen10+/Gen11 edge hosts:
**cdn_bench.py parser — multi-instance support:**
- Section headers now match with startswith() instead of exact ==, so 'Client Results (instance 0, target ...)' is correctly detected
- Per-instance metrics are stored under indexed keys such as client_1_requests_sent and proxy_2_actual_rps
- Aggregate totals are accumulated across instances (client_requests_sent = sum of all instances)
- Aggregate success/error rates and instance counts are computed as well
```
"metrics": {
"client_instances": 0,
"exit_code": 0,
"protocol": "h1",
"proxy_1_actual_rps": 4000.0,
"proxy_1_avg_backend_latency_ms": 176.534,
"proxy_1_avg_latency_ms": 177.0,
"proxy_1_requests_failed": 0,
"proxy_1_requests_received": 280356,
"proxy_1_requests_succeeded": 280356,
"proxy_1_retries_attempted": 0,
"proxy_1_retries_succeeded": 0,
"proxy_1_success_rate_pct": 100.0,
"proxy_2_actual_rps": 4000.0,
"proxy_2_avg_backend_latency_ms": 175.723,
"proxy_2_avg_latency_ms": 176.0,
"proxy_2_requests_failed": 0,
"proxy_2_requests_received": 285273,
"proxy_2_requests_succeeded": 285273,
"proxy_2_retries_attempted": 0,
"proxy_2_retries_succeeded": 0,
"proxy_2_success_rate_pct": 100.0,
"proxy_actual_rps": 8000.0,
"proxy_instances": 2,
"proxy_requests_failed": 0,
"proxy_requests_received": 565629,
"proxy_requests_succeeded": 565629,
"proxy_success_rate_pct": 100.0
},
```
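A minimal sketch of the header matching and aggregation described above, not the actual cdn_bench.py code. The function name `parse_results`, the prefix table, and the rule for which keys are summed are assumptions for illustration. Key names are derived mechanically from the log labels, so some differ from the real parser's (for example `proxy_1_success_rate` rather than `proxy_1_success_rate_pct`).

```python
# Assumed mapping from section-header prefix to metric-name prefix.
SECTION_PREFIXES = {"Client Results": "client", "Proxy Results": "proxy"}


def parse_results(lines):
    """Collect per-instance metrics and aggregate totals from log lines."""
    metrics = {}
    counts = {role: 0 for role in SECTION_PREFIXES.values()}
    role = None
    for raw in lines:
        line = raw.strip()
        # startswith() tolerates suffixes like "(instance 0, port 8081)".
        header_role = next(
            (r for prefix, r in SECTION_PREFIXES.items() if line.startswith(prefix)),
            None,
        )
        if header_role is not None:
            role = header_role
            counts[role] += 1  # instance numbering in keys is 1-based
            continue
        if role is None or ":" not in line:
            continue
        label, _, value = line.partition(":")
        try:
            number = float(value.strip().rstrip("%"))
        except ValueError:
            continue  # not a numeric metric line
        key = label.strip().lower().replace(" ", "_")
        metrics[f"{role}_{counts[role]}_{key}"] = number
        # Assumption: request counters and RPS are summed across instances.
        if key.startswith("requests_") or key == "actual_rps":
            total_key = f"{role}_{key}"
            metrics[total_key] = metrics.get(total_key, 0.0) + number
    for role, count in counts.items():
        metrics[f"{role}_instances"] = count
        received = metrics.get(f"{role}_requests_received")
        succeeded = metrics.get(f"{role}_requests_succeeded")
        if received and succeeded is not None:
            metrics[f"{role}_success_rate_pct"] = 100.0 * succeeded / received
    return metrics
```

Computing the aggregate success rate from the summed counters, rather than averaging per-instance percentages, keeps it exact when instances handle different request volumes.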
**run.sh — scientific notation handling:**
- Proxy metrics regex updated to match scientific notation (e.g. 5e+03 was parsed as just 5)
- printf formatting for Success Rate (%.2f) and Actual RPS (%.1f) so 1e+02 displays as 100.00%
```
Proxy Results (instance 0, port 8081)
Requests Received: 276720
Requests Succeeded: 276719
Requests Failed: 1
Success Rate: 100.00%
Actual RPS: 4000.0
Avg Total Latency ms: 180
Avg Backend Latency ms: 180.095
Retries Attempted: 0
Retries Succeeded: 0
Proxy Role Complete
Cleaning up processes...
stderr:
Results Report:
{
"benchmark_args": [
"-m proxy",
"-B 2401:db00:f01b:301d:face:0:18f:0",
"-b 8080",
"-P 8081",
"-p h1"
],
"benchmark_desc": "Distributed CDN benchmark. Run server, proxy, and client roles on separate hosts. Supports multiple instances of each role for scaling.\n",
"benchmark_hooks": [
"perf: {'perfstat': {'interval': 1}, 'mpstat': {'interval': 1}, 'netstat': {'interval': 1}}",
"copymove: {'is_move': True, 'after': ['packages/cdn_bench/cdn_bench_run.log']}"
],
"benchmark_name": "cdn_bench",
"machines": [
{
"cpu_architecture": "x86_64",
"cpu_model": "INTEL(R) XEON(R) PLATINUM 8558P",
"hostname": "fnedge932.01.ams2.facebook.com",
"kernel_version": "6.4.3-0_fbk15_hardened_2630_gf27365f948db",
"mem_total_kib": "525701232 KiB",
"num_cpus_usable": 96,
"num_logical_cpus": "96",
"os_distro": "centos",
"os_release_name": "CentOS Stream 9",
"threads_per_core": "2"
}
],
"metadata": {
"L1d cache": "2.3 MiB (48 instances)",
"L1i cache": "1.5 MiB (48 instances)",
"L2 cache": "96 MiB (48 instances)",
"L3 cache": "260 MiB (1 instance)"
},
"metrics": {
"client_instances": 0,
"exit_code": 0,
"protocol": "h1",
"proxy_1_actual_rps": 4000.0,
"proxy_1_avg_backend_latency_ms": 180.095,
"proxy_1_avg_latency_ms": 180.0,
"proxy_1_requests_failed": 1,
"proxy_1_requests_received": 276720,
"proxy_1_requests_succeeded": 276719,
"proxy_1_retries_attempted": 0,
"proxy_1_retries_succeeded": 0,
"proxy_1_success_rate_pct": 100.0,
"proxy_actual_rps": 4000.0,
"proxy_instances": 1,
"proxy_requests_failed": 1,
"proxy_requests_received": 276720,
"proxy_requests_succeeded": 276719,
"proxy_success_rate_pct": 99.99963862387973
},
```
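The scientific-notation problem can be reproduced with a short Python sketch. run.sh itself is shell and its actual regex is not shown in this PR, so both patterns below (`NAIVE` and `NUMBER`) and the helper names are assumptions that only illustrate the failure mode and the fix.

```python
import re

# Assumed shape of the old pattern: digits only, so "5e+03" yields "5".
NAIVE = re.compile(r"\d+")
# Accepts an optional fraction and an optional exponent.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def extract_number(text):
    """Return the first number in text as a float, or None if absent."""
    match = NUMBER.search(text)
    return float(match.group(0)) if match else None


def format_metric(label, value):
    """Format a value the way the printf fix describes (%.2f and %.1f)."""
    if label == "Success Rate":
        return f"{label}: {value:.2f}%"
    if label == "Actual RPS":
        return f"{label}: {value:.1f}"
    return f"{label}: {value:g}"
```

With these helpers, `format_metric("Success Rate", extract_number("1e+02"))` gives `Success Rate: 100.00%`, matching the display described above.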
Reviewed By: YifanYuan3
Differential Revision: D100658456
@SamirFarhat17 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D100658456.
meta-codesync bot pushed a commit that referenced this pull request on Apr 13, 2026
Summary: Pull Request resolved: #577
Reviewed By: YifanYuan3
Differential Revision: D100658456
fbshipit-source-id: 466c3b3392f7f8e03dab8095ea03739db04a6828