fix(server-analytics): Prevent spiky underlay when stacking in small timeframes by balamurali27 · Pull Request #6602 · frappe/press

balamurali27 · 2026-06-03T03:29:50Z

fix(server-charts): Widen rate() window to stop spikes to zero
The CPU/network/IOPS and MariaDB rate charts spiked down to zero on
arbitrary steps. The rate()/increase() lookback window was set to the
chart step (timegrain), which auto_timespan_timegrain can size below 2x
the 60s Prometheus scrape interval. When the window spans fewer than two
samples Prometheus returns no value for that step, and the chart renders
the gap as a spike to zero.

Decouple the rate window from the step with get_rate_interval(), which
mirrors Grafana's $__rate_interval and always spans several scrapes.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

test(server-charts): Cover rate-window sizing for analytics queries
Tests for get_rate_interval and the server analytics query builder: the
rate window always spans at least two scrape intervals, the CPU query
uses the widened window for a one-hour range, no rate chart uses a
sub-scrape window, and the Prometheus series has no gaps (missing steps
become None, not zero).

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

test(server-charts): Add Playwright e2e tests for server charts
Mock the analytics API and assert the CPU chart renders its line paths
with non-spiky data, requests analytics with a valid time range, and
shows the empty state when there is no data.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

chore(gitignore): Ignore .claude directory
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

The CPU/network/IOPS and MariaDB rate charts spiked down to zero on arbitrary steps. The rate()/increase() lookback window was set to the chart step (timegrain), which auto_timespan_timegrain can size below 2x the 60s Prometheus scrape interval. When the window spans fewer than two samples Prometheus returns no value for that step, and the chart renders the gap as a spike to zero. Decouple the rate window from the step with get_rate_interval(), which mirrors Grafana's $__rate_interval and always spans several scrapes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Tests for get_rate_interval and the server analytics query builder: the rate window always spans at least two scrape intervals, the CPU query uses the widened window for a one-hour range, no rate chart uses a sub-scrape window, and the Prometheus series has no gaps (missing steps become None, not zero). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Mock the analytics API and assert the CPU chart renders its line paths with non-spiky data, requests analytics with a valid time range, and shows the empty state when there is no data. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

greptile-apps · 2026-06-03T03:33:17Z

Greptile Summary

This PR fixes CPU/network/IOPS/MariaDB charts that spiked down to zero on arbitrary steps by decoupling the Prometheus rate()/increase() lookback window from the chart step. Previously, when auto_timespan_timegrain produced a step below 2× the 60 s scrape interval (e.g. 90 s for a 1-hour view), Prometheus could see only one sample in the window and return nothing, which the chart rendered as a zero spike.

get_rate_interval() in analytics.py mirrors Grafana's $__rate_interval: max(timegrain + scrape_interval, 4 × scrape_interval), ensuring the window always spans several scrapes regardless of the chart step.
All rate()/increase() queries in server.py (cpu, network, iops, database_commands_count, innodb_bp_miss_percent, innodb_avg_row_lock_time) are updated to use rate_interval instead of timegrain.
A comprehensive unit test suite (test_analytics.py) and Playwright e2e tests (server-charts.test.ts) verify the window sizing, query correctness, and chart rendering.

Confidence Score: 5/5

Safe to merge — the change is a targeted, well-tested decoupling of the Prometheus rate window from the chart step, with no schema, permission, or data-path regressions.

The get_rate_interval formula exactly mirrors Grafana's $__rate_interval and is verified by unit tests against every value auto_timespan_timegrain can produce. All affected PromQL queries are updated consistently, non-rate queries are untouched, and the Playwright e2e tests add end-to-end coverage. No new security surface, no ORM changes, no migrations.

No files require special attention.

Important Files Changed

Filename	Overview
press/api/analytics.py	Adds get_rate_interval() that mirrors Grafana's $__rate_interval: max(timegrain + scrape_interval, 4 × scrape_interval), plus a PROMETHEUS_SCRAPE_INTERVAL constant. Formula is correct; Final is already imported; zero-timegrain guard is present.
press/api/server.py	Replaces timegrain with rate_interval in all rate()/increase() PromQL queries (cpu, network, iops, database_commands_count, innodb_bp_miss_percent, innodb_avg_row_lock_time). Import updated accordingly; no unintended side-effects on non-rate queries.
press/api/tests/test_analytics.py	New test module covering get_rate_interval(), auto_timespan_timegrain(), per-chart query capture, and Prometheus series alignment. Window-size assertion and gap-as-None tests are both correct.
dashboard/tests-e2e/tests/dashboard/server-charts.test.ts	Playwright e2e tests: chart renders with data, start/end params are sent, empty-state shows on no data. Imports coverage.fixture which exists; mock helpers are well-structured.
dashboard/tests-e2e/mocks/server-charts/analytics.ts	Mock data helpers for e2e tests; cpuDataWithGaps is exported but never imported in the test file (dead export flagged in a previous review thread).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["analytics() endpoint\n(start, end, query)"] --> B["auto_timespan_timegrain()\n→ timegrain (chart step)"]
    B --> C["get_rate_interval(timegrain)\nmax(timegrain + 60s, 4 × 60s)"]
    C --> D{"timegrain ≤ 180 s?\n(e.g. 90 s for 1 h view)"}
    D -- "Yes (min floor kicks in)" --> E["rate_interval = 240 s\n(4 × scrape_interval)"]
    D -- "No (step is coarse)" --> F["rate_interval = timegrain + 60 s"]
    E --> G["PromQL: rate(...[240s])\nAlways ≥ 2 scrapes → no gaps"]
    F --> G
    G --> H["prometheus_query()\nstep = timegrain (chart resolution unchanged)"]
    H --> I["Chart renders without\nspikes to zero"]

_{Reviews (3): Last reviewed commit: "fix(site): Correct grammar" | Re-trigger Greptile}

greptile-apps · 2026-06-03T03:33:21Z

+	const data = cpuData()
+	for (const dataset of data.datasets) {
+		dataset.values = dataset.values.map((value, i) =>
+			i % 2 === 1 ? null : value,
+		)
+	}
+	return data
+}
+
+export function emptyData() {
+	return { datasets: [], labels: [] }
+}


Unused export cpuDataWithGaps

cpuDataWithGaps is exported but never imported or used in server-charts.test.ts. The comment says it represents the pre-fix shape ("spikes to zero") and was presumably intended to anchor a regression test that would assert charts with null gaps don't render zeros, but that test was never written. The export is dead code in its current state.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

greptile-apps · 2026-06-03T03:33:22Z

+	def test_rate_charts_never_use_a_sub_scrape_window(self):
+		start = datetime(2024, 1, 1, 0, 0, 0)
+		for chart in ("cpu", "network", "iops", "database_commands_count"):
+			query = self._capture_query(start, start + timedelta(minutes=10), chart)
+			window = int(query.split("[")[1].split("s]")[0])
+			self.assertGreaterEqual(
+				window,
+				2 * PROMETHEUS_SCRAPE_INTERVAL,
+				f"{chart} chart queried a {window}s window",
+			)


InnoDB rate charts are not covered by the window-size assertion

test_rate_charts_never_use_a_sub_scrape_window checks cpu, network, iops, and database_commands_count, but innodb_bp_miss_percent and innodb_avg_row_lock_time also use rate_interval in their queries and are not included. A future regression that accidentally re-introduces a narrow window on those queries would not be caught here.

codecov-commenter · 2026-06-03T03:37:22Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 50.84%. Comparing base (af89b0c) to head (88ede89).
⚠️ Report is 57 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #6602      +/-   ##
===========================================
+ Coverage    49.43%   50.84%   +1.41%     
===========================================
  Files          976      990      +14     
  Lines        81257    84133    +2876     
  Branches       379      701     +322     
===========================================
+ Hits         40166    42778    +2612     
- Misses       41063    41321     +258     
- Partials        28       34       +6

Flag	Coverage Δ
dashboard	`64.17% <ø> (+3.86%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Make the helper's call sites say what they run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

greptile-apps · 2026-06-03T03:44:07Z

Want your agent to iterate on Greptile's feedback? Try greploops.

balamurali27 and others added 4 commits June 2, 2026 23:28

chore(gitignore): Ignore .claude directory

8180506

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

balamurali27 requested review from siduck and ssiyad as code owners June 3, 2026 03:29

balamurali27 changed the title ~~chart fix~~ fix(server-analytics): Prevent spiky underlay when stacking in small timeframes Jun 3, 2026

greptile-apps Bot reviewed Jun 3, 2026

View reviewed changes

test(server-charts): Rename _run helper to _run_prometheus_query

1f20e7a

Make the helper's call sites say what they run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fix(site): Correct grammar

88ede89

balamurali27 requested a review from tanmoysrt as a code owner June 3, 2026 05:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(server-analytics): Prevent spiky underlay when stacking in small timeframes#6602

fix(server-analytics): Prevent spiky underlay when stacking in small timeframes#6602
balamurali27 wants to merge 6 commits into
developfrom
chart-fix

balamurali27 commented Jun 3, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 3, 2026 •

edited

Loading

Greptile Summary

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot Jun 3, 2026

Uh oh!

greptile-apps Bot Jun 3, 2026

Uh oh!

codecov-commenter commented Jun 3, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

balamurali27 commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

greptile-apps Bot commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

balamurali27 commented Jun 3, 2026 •

edited

Loading

greptile-apps Bot commented Jun 3, 2026 •

edited

Loading

codecov-commenter commented Jun 3, 2026 •

edited

Loading