fix(prometheus): handle /api/v1/metadata responses without data field#5437
Open
lezzago wants to merge 1 commit into
9d25ff2 to 7848bd8
joshuali925
previously approved these changes
May 12, 2026
## [Unreleased]

### Fixed
- Fix PrometheusClientImpl to handle Prometheus responses missing the `data` field ([#5437](https://github.com/opensearch-project/sql/pull/5437))
ps48
previously approved these changes
May 12, 2026
Prometheus-compatible backends (notably Cortex) legitimately return
{"status":"success"} with no "data" key when there is no metric
metadata to report — which is always the case for OTLP-ingested
metrics, since OTLP carries no OpenMetrics HELP/TYPE comments.
PrometheusClientImpl.getAllMetrics(), queryRange(), query(), getLabels(),
getLabel(), getSeries(), queryExemplars(), and getAlerts() blindly
called response.getJSONObject("data")/getJSONArray("data"), throwing
JSONException and breaking OSD's Metrics Explore UI end-to-end for
Cortex-backed Prometheus datasources.
Guard each call site with a response.has("data") check, returning an
empty result for the no-metadata case.
Signed-off-by: Ashish Agrawal <ashisagr@amazon.com>
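The guard described in the commit message above can be illustrated with a minimal sketch. It is not the code from this PR: to stay runnable without dependencies it uses a plain `java.util.Map` as a stand-in for the parsed `org.json.JSONObject` response, and the class name `DataGuardSketch` and method name `dataOrEmpty` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the guard; not the actual PrometheusClientImpl code. */
final class DataGuardSketch {

    /**
     * Returns the "data" object of a parsed response, or a fresh mutable empty map
     * when the key is absent, as in {"status":"success"}.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Object> dataOrEmpty(Map<String, Object> response) {
        // containsKey plays the role of response.has("data") in this stand-in.
        if (!response.containsKey("data")) {
            return new HashMap<>();
        }
        return (Map<String, Object>) response.get("data");
    }

    public static void main(String[] args) {
        // Shape reported for Cortex: no "data" key at all.
        Map<String, Object> withoutData = new HashMap<>();
        withoutData.put("status", "success");

        // Shape reported for vanilla Prometheus: "data" present but empty.
        Map<String, Object> withEmptyData = new HashMap<>();
        withEmptyData.put("status", "success");
        withEmptyData.put("data", new HashMap<String, Object>());

        System.out.println(dataOrEmpty(withoutData).isEmpty());   // prints "true"
        System.out.println(dataOrEmpty(withEmptyData).isEmpty()); // prints "true"
    }
}
```

Both response shapes end up as the same empty result for the caller, which is the behavior the commit message asks for. The fallback is a new `HashMap` rather than a shared immutable constant, matching the PR's stated preference for mutable empty collections.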
7848bd8 to cdddbb7
joshuali925
approved these changes
May 12, 2026
ps48
approved these changes
May 12, 2026
Description
`PrometheusClientImpl` blindly called `response.getJSONObject("data")` / `response.getJSONArray("data")` on every Prometheus API response. The Prometheus HTTP API spec does not mandate that `data` be present, and Cortex (observed on v1.18.1) legitimately returns `{"status":"success"}` with no `data` key when the endpoint has nothing to report.

This is the default state for any Cortex backend fed via OTLP remote-write, since OTLP metrics carry no OpenMetrics `# HELP` / `# TYPE` comments for Cortex to surface. A vanilla Prometheus returns `{"status":"success","data":{}}` in the same state, so the spec allows both shapes.

The resulting `JSONException` propagates up through `PrometheusQueryHandler.getResources()` → `DirectQueryExecutorServiceImpl` and surfaces in OpenSearch Dashboards' Metrics Explore UI as "Unable to load metrics" for every metric against any Cortex-backed Prometheus datasource.

Stack trace observed on the live EKS deployment:
Fix
Guard each affected call site (`getAllMetrics`, `queryRange`, `query`, `getLabels`, `getLabel`, `getSeries`, `queryExemplars`, `getAlerts`) with a `response.has("data")` check, returning a type-appropriate empty value (`new JSONObject()`, `new JSONArray()`, `new ArrayList<>()`, `new HashMap<>()`) instead of throwing.

Mutable empty collections are intentional: the transport layer reflectively constructs response maps, and `Collections.emptyMap()` / `Collections.emptyList()` trigger `InaccessibleObjectException` ("module java.base does not 'opens java.util' to unnamed module") during XContent serialization.

`getRules` already routes through `normalizeRulesResponse()`, which handles the missing-`data` case, so no change was needed there. Alertmanager methods don't read a `data` field.

Testing
- `./gradlew :direct-query-core:test`: 254 tests, 0 failures (including 9 new tests, one per patched method, each feeding `{"status":"success"}` and asserting a safe empty return).
- `./gradlew :prometheus:test`: 104 tests, 0 failures.
- `GET /_plugins/_directquery/_resources/<ds>/api/v1/metadata` → HTTP 200 `{"data":{}}` (was 500).
- `POST /api/enhancements/resources` for `metric_metadata` → 200 `{"data":{},"type":"prometheus"}` (was 503).

Issues resolved
N/A — filed directly.
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.