Add stream topic label to realtime ingestion delay metrics #18175

Open
rsrkpatwari1234 wants to merge 22 commits into apache:master from rsrkpatwari1234:rsrkpatwari1234-issue-18099
rsrkpatwari1234 (Contributor) commented Apr 12, 2026

Problem

With multi-topic ingestion, ingestion delay gauges were keyed only by table and partition group id, which made it hard to tell which Kafka (or other stream) topic was lagging. Identifying the topic often required indirectly mapping partitions back to topics through the config (apache/pinot#18099).

Approach

Reuse the existing JMX → Prometheus naming pattern already used for other stream metrics:
tableNameWithType-topic-partitionGroupId, which matches server.yml rules so topic and partition are exported as labels.
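
As a rough illustration of the naming pattern above, the sketch below builds a tableNameWithType-topic-partitionGroupId key and splits it back into label values. The class name, method names, and the regular expression are all hypothetical and are not taken from the PR or from the actual server.yml rules; the regex only shows one way such a key could be parsed when the topic itself contains hyphens.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pinot: illustrates the composite key shape only.
final class TopicGaugeKeySketch {
  // Assumed pattern (NOT the real server.yml rule): the table type suffix anchors the
  // first group, trailing digits anchor the partition, the middle is treated as the topic.
  private static final Pattern KEY =
      Pattern.compile("^(.+_(?:REALTIME|OFFLINE))-(.+)-(\\d+)$");

  // Builds tableNameWithType-topic-partitionGroupId.
  static String compositeKey(String tableNameWithType, String topic, int partitionGroupId) {
    return tableNameWithType + "-" + topic + "-" + partitionGroupId;
  }

  // Returns {table, topic, partition} extracted from a composite key.
  static String[] labels(String key) {
    Matcher m = KEY.matcher(key);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a topic-aware gauge key: " + key);
    }
    return new String[] {m.group(1), m.group(2), m.group(3)};
  }

  public static void main(String[] args) {
    String key = compositeKey("orders_REALTIME", "orders-eu", 3);
    String[] parts = labels(key);
    System.out.println(key + " -> table=" + parts[0] + ", topic=" + parts[1]
        + ", partition=" + parts[2]);
  }
}
```

Anchoring on the table type suffix and the trailing digits is one design choice that keeps hyphenated topic names intact; a real exporter rule may differ.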

Changes

  • AbstractMetrics: setOrUpdatePartitionGaugeForStreamTopic / removePartitionGaugeForStreamTopic build the same composite key as stream-topic meters and register via setOrUpdateTableGauge / removeTableGauge.
  • IngestionDelayTracker: updateMetrics takes an optional streamTopicName; if non-blank, delay gauges use the topic-aware helpers; if blank, behavior stays on the legacy per-partition gauge names (backward compatible).
  • IngestionInfo: stores the topic used for registration so removals/shutdown clear the correct MBeans.
  • RealtimeTableDataManager / RealtimeSegmentDataManager: pass _streamConfig.getTopicName() into ingestion metric updates (including “delay zero” paths).
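
The behavior described in the list above can be sketched roughly as follows. This is a simplified stand-in and not the PR's code: the class, its maps, and the legacy table-partition gauge name are assumptions made for illustration. It shows the two ideas from the list: choosing a topic-aware name only when the topic is non-blank, and remembering the topic used at registration so removal clears the same gauge.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Pinot code. Not thread-safe; a real tracker would need to be.
final class IngestionDelaySketch {
  // Stand-in for the metrics registry: gauge name -> latest delay in ms.
  final Map<String, Long> gauges = new HashMap<>();
  // Topic each partition was registered with (null when none was supplied).
  final Map<Integer, String> topicByPartition = new HashMap<>();
  private final String tableNameWithType;

  IngestionDelaySketch(String tableNameWithType) {
    this.tableNameWithType = tableNameWithType;
  }

  private static boolean isBlank(String s) {
    return s == null || s.trim().isEmpty();
  }

  // Assumption: the legacy name is table-partition; the topic-aware name inserts the topic.
  String gaugeName(int partitionId, String topic) {
    return isBlank(topic)
        ? tableNameWithType + "-" + partitionId
        : tableNameWithType + "-" + topic + "-" + partitionId;
  }

  // Records the topic alongside the gauge so a later removal targets the same name.
  void updateMetrics(int partitionId, String topic, long delayMs) {
    topicByPartition.put(partitionId, topic);
    gauges.put(gaugeName(partitionId, topic), delayMs);
  }

  // Uses the stored topic, not a freshly supplied one, to find the gauge to remove.
  void removePartition(int partitionId) {
    String topic = topicByPartition.remove(partitionId);
    gauges.remove(gaugeName(partitionId, topic));
  }

  public static void main(String[] args) {
    IngestionDelaySketch tracker = new IngestionDelaySketch("orders_REALTIME");
    tracker.updateMetrics(0, "orders", 120L); // topic-aware name
    tracker.updateMetrics(1, null, 50L);      // falls back to the legacy name
    System.out.println(tracker.gauges.keySet());
    tracker.removePartition(0);
    System.out.println(tracker.gauges.keySet());
  }
}
```

Storing the topic at registration time mirrors the role the description gives to IngestionInfo: without it, a removal that no longer knows the topic would target the legacy name and leave the topic-aware gauge behind.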

Note: Series for REALTIME_INGESTION_DELAY_MS and END_TO_END_REALTIME_INGESTION_DELAY_MS will include a topic label when a non-blank topic is configured; dashboards or alerts that match only the old metric shape may need updates.

Fixes #18099

codecov-commenter commented Apr 12, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.33%. Comparing base (7e10a36) to head (7a87040).
⚠️ Report is 12 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| ...e/data/manager/realtime/IngestionDelayTracker.java | 21.05% | 13 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18175      +/-   ##
============================================
+ Coverage     63.18%   63.33%   +0.15%     
- Complexity     1616     1627      +11     
============================================
  Files          3214     3229      +15     
  Lines        195838   196724     +886     
  Branches      30251    30411     +160     
============================================
+ Hits         123734   124590     +856     
+ Misses        62236    62151      -85     
- Partials       9868     9983     +115     
| Flag | Coverage Δ |
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-11 | 63.27% <50.00%> (+0.15%) ⬆️ |
| java-21 | 63.29% <50.00%> (+0.13%) ⬆️ |
| temurin | 63.33% <50.00%> (+0.15%) ⬆️ |
| unittests | 63.32% <50.00%> (+0.15%) ⬆️ |
| unittests1 | 55.27% <43.33%> (-0.12%) ⬇️ |
| unittests2 | 34.99% <26.66%> (+0.20%) ⬆️ |


noob-se7en added the ingestion, metrics, and real-time labels Apr 13, 2026
IngestionInfo ingestionInfo = _ingestionInfoMap.get(partitionId);
@Nullable String streamTopicName = ingestionInfo != null ? ingestionInfo._streamTopicName : null;
if (StringUtils.isNotBlank(streamTopicName)) {
_serverMetrics.setOrUpdatePartitionGaugeForStreamTopic(_metricName, streamTopicName, partitionId,
Contributor

We should look into adding the label to the existing metric itself instead of creating a new metric.

Successfully merging this pull request may close these issues.

Ingestion delay metrics should output a tag for the topic name

3 participants