feat: dynamic OpenAPI spec registry (replace bundled specs) by tvt286 · Pull Request #4 · vngcloud/greennode-mcp

tvt286 · 2026-04-18T03:53:46Z

Summary

MCP server now fetches OpenAPI specs from VNG Cloud's public docs portal at startup instead of bundling them in the wheel. New products published on docs.api.vngcloud.vn appear automatically on the user's next server restart — no code release needed.

Architecture

Provider pattern so the source can be swapped with a one-line change:

SpecProvider (Protocol)
  ├── RedoclyPortalProvider  ← active (scrapes docs portal)
  ├── LocalDirProvider       (dev/test via GRN_MCP_SPEC_DIR)
  ├── JsonRegistryProvider   (future — S3)
  └── OciRegistryProvider    (future — vCR)
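The provider pattern above can be sketched as follows. This is a minimal illustration, not the code in `registry/provider.py`: only the names `SpecProvider`, `LocalDirProvider`, and `GRN_MCP_SPEC_DIR` come from this PR, while the `fetch_specs` method, its return shape, and the `make_provider` factory are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Protocol


class SpecProvider(Protocol):
    """Anything that can return OpenAPI specs keyed by product name."""

    def fetch_specs(self) -> dict[str, dict]:  # hypothetical method name
        ...


class LocalDirProvider:
    """Reads every *.json file in a directory; the file stem is the product name."""

    def __init__(self, spec_dir: str) -> None:
        self.spec_dir = Path(spec_dir)

    def fetch_specs(self) -> dict[str, dict]:
        specs = {}
        for path in sorted(self.spec_dir.glob("*.json")):
            specs[path.stem] = json.loads(path.read_text(encoding="utf-8"))
        return specs


def make_provider(env: dict[str, str]) -> SpecProvider:
    """Hypothetical factory: the single place where the active source is chosen."""
    spec_dir = env.get("GRN_MCP_SPEC_DIR")
    if spec_dir:
        return LocalDirProvider(spec_dir)
    # The portal-scraping provider is deliberately left out of this sketch.
    raise NotImplementedError("portal provider omitted from this sketch")
```

Because callers depend only on the protocol, swapping the source means changing what the factory returns and nothing else.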

Specs are cached at ~/.greenode/mcp-specs/ with a TTL and HTTP conditional GET. Loading tolerates partial failure: one product failing doesn't take down the whole server.
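As a rough sketch of how a TTL combines with HTTP conditional GET: a fresh entry is served without any request, a stale one is revalidated with `If-None-Match` / `If-Modified-Since`, and a 304 response keeps the cached body. The `CacheEntry` type and the function names below are hypothetical and are not taken from `registry/cache.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    body: bytes
    fetched_at: float  # Unix timestamp of the last fetch or revalidation
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def is_fresh(entry: CacheEntry, now: float, ttl_seconds: float) -> bool:
    """Within the TTL the cached body is used without contacting the server."""
    return (now - entry.fetched_at) < ttl_seconds


def conditional_headers(entry: CacheEntry) -> dict[str, str]:
    """Validators to send when the TTL has expired."""
    headers = {}
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


def apply_response(entry: CacheEntry, status: int, body: bytes,
                   headers: dict[str, str], now: float) -> CacheEntry:
    """304 keeps the old body and restarts the TTL; anything else replaces the entry."""
    if status == 304:
        return CacheEntry(entry.body, now, entry.etag, entry.last_modified)
    return CacheEntry(body, now, headers.get("ETag"), headers.get("Last-Modified"))
```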

Breaking Changes

- specs/vks.json no longer bundled in wheel
- First run of v0.4.0 on a new machine requires network to docs.api.vngcloud.vn
- Rollback: uvx greenode-mcp-server@0.3.2

New CLI flags

- --refresh-specs: force re-download from registry
- --offline: skip registry fetch, use cache only
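For illustration, the two flags could be declared with `argparse` as below. The flag names and help texts come from this PR; the use of `argparse`, the `build_parser` helper, and the boolean `store_true` behavior are assumptions about how the server parses its command line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the two new flags."""
    parser = argparse.ArgumentParser(prog="greenode-mcp-server")
    parser.add_argument(
        "--refresh-specs",
        action="store_true",
        help="force re-download from registry",
    )
    parser.add_argument(
        "--offline",
        action="store_true",
        help="skip registry fetch, use cache only",
    )
    return parser
```

`build_parser().parse_args(["--offline"])` yields a namespace with `offline=True` and `refresh_specs=False`.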

Build-time URL config

Release workflow bakes DEFAULT_DOCS_PORTAL_URL at build time via GitHub Actions var DOCS_PORTAL_URL. End users cannot override at runtime.

Test Plan

- 124 unit tests passing
- Ruff clean across entire codebase
- Wheel contains registry/ + _build_info.py, does NOT contain specs/
- CLI shows --refresh-specs and --offline flags
- GRN_MCP_SPEC_DIR activates LocalDirProvider
- Smoke test vs real portal: 906 endpoints loaded across VKS, vServer, vLB, vDB, vMonitor, etc.
- Partial failure tolerance verified (1 bad product skipped, rest loaded)
- Manual: uvx greenode-mcp-server --help in Claude Code
- Manual: search_api / call_api work end-to-end with real IAM credentials

Docs Updated

- src/greenode-mcp-server/README.md: Spec Registry section + troubleshooting
- src/greenode-mcp-server/CHANGELOG.md: v0.4.0 entry with breaking changes
- README.md (root): updated repository structure
- CLAUDE.md: updated project overview, repo structure, key files table
- .github/workflows/release.yml: "Bake docs portal URL" step

🤖 Generated with Claude Code

MCP server now fetches specs from VNG Cloud's public docs portal at startup and caches them under ~/.greenode/mcp-specs/, so new products appear automatically without any server release.

- Add SpecProvider protocol + factory (registry/provider.py, factory.py)
- Add RedoclyPortalProvider scraping docs.api.vngcloud.vn (~906 endpoints across VKS, vServer, vLB, vDB, vMonitor, and more)
- Add LocalDirProvider for dev/test (GRN_MCP_SPEC_DIR env var)
- Add SpecCache with TTL + HTTP conditional GET (registry/cache.py)
- Add load_specs orchestrator with offline/refresh flags (registry/loader.py)
- Integrate into api_index.py via initialize_index()
- New CLI flags: --refresh-specs, --offline
- Bake DEFAULT_DOCS_PORTAL_URL at build time via CI (release.yml)
- Delete bundled specs/ directory
- Update README, CHANGELOG, CLAUDE.md

Breaking: first run requires network to docs.api.vngcloud.vn. Roll back with uvx greenode-mcp-server@0.3.2 if needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Cacheless providers (local-dir) now fetch directly without touching the on-disk cache. Previously, running with GRN_MCP_SPEC_DIR would pollute ~/.greenode/mcp-specs/ with whatever was in the local dir, then the next production run would read that stale data back.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

VNG Cloud APIs wrap list payloads under various keys (listData, data, results, records) besides 'items'. Previously _format_response only recognized 'items'; other responses fell back to _format_object, which dumps every field of every item, blowing LLM context on large lists (e.g. security groups returned as 11.9k-token raw JSON).

Also cap list output at 30 rows with a footer suggesting pagination.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
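The wrapper-key handling and row cap described in this commit could look roughly like the sketch below. The key names and the 30-row cap come from the commit message; the function names, the lookup order, and the footer wording are assumptions and do not reproduce `_format_response`.

```python
from typing import Any, Optional

# Keys named in the commit message; the lookup order here is an assumption.
LIST_WRAPPER_KEYS = ("items", "listData", "data", "results", "records")
MAX_LIST_ROWS = 30


def extract_list(payload: Any) -> Optional[list]:
    """Return the list inside a response, or None if the payload is not list-like."""
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        for key in LIST_WRAPPER_KEYS:
            value = payload.get(key)
            if isinstance(value, list):
                return value
    return None


def cap_rows(rows: list, limit: int = MAX_LIST_ROWS) -> tuple[list, Optional[str]]:
    """Keep the first `limit` rows and produce a footer when rows were dropped."""
    if len(rows) <= limit:
        return rows, None
    footer = f"Showing {limit} of {len(rows)} rows; use pagination to see more."
    return rows[:limit], footer
```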

Previously search was strict AND-match: "flavor" in product=vks returned empty even though vServer has flavor endpoints. Now falls back:

- Tier 1: AND match, scoped by product
- Tier 2: AND match, all products (when scoped gives 0)
- Tier 3: OR match, scoped by product
- Tier 4: OR match, all products

Also:

- Simple stemming: "clusters" matches "cluster" (trailing -s stripped for words >4 chars)
- Relevance ranking: summary (+3) > path (+2) > description (+1) so strongest matches surface first
- Entry.format() now prefixes [product] so AI sees which product each result belongs to; important when fallback returns cross-product matches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
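A compact sketch of the four-tier fallback with stemming and ranking follows. The tier order, the stemming rule, and the +3/+2/+1 weights come from the commit message. The `Entry` fields, whole-word matching, and summing weights across fields and terms are assumptions, so the real `api_index.py` may score differently.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    product: str
    path: str
    summary: str
    description: str = ""


def stem(word: str) -> str:
    """Strip a trailing 's' from words longer than 4 characters."""
    word = word.lower()
    return word[:-1] if len(word) > 4 and word.endswith("s") else word


def _tokens(text: str) -> set[str]:
    return {stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())}


def score(entry: Entry, terms: list[str], require_all: bool) -> int:
    """Sum field weights per matching term; 0 means no match for this tier."""
    fields = ((_tokens(entry.summary), 3), (_tokens(entry.path), 2),
              (_tokens(entry.description), 1))
    total, matched = 0, 0
    for term in terms:
        weight = sum(w for toks, w in fields if term in toks)
        if weight:
            matched += 1
            total += weight
    if require_all and matched < len(terms):
        return 0
    return total


def search(entries: list[Entry], query: str, product: Optional[str] = None) -> list[Entry]:
    terms = [stem(t) for t in query.split()]
    scoped = [e for e in entries if product is None or e.product == product]
    # Tier 1: AND scoped, Tier 2: AND all, Tier 3: OR scoped, Tier 4: OR all.
    for pool, require_all in ((scoped, True), (entries, True), (scoped, False), (entries, False)):
        hits = [(score(e, terms, require_all), e) for e in pool]
        hits = [(s, e) for s, e in hits if s > 0]
        if hits:
            hits.sort(key=lambda hit: -hit[0])  # strongest match first
            return [e for _, e in hits]
    return []
```

With one VKS entry for `/v1/clusters` and one vServer entry for `/v1/flavors`, a query for "flavor" scoped to `vks` finds nothing in tier 1 and returns the vServer entry from tier 2.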

Spawns greenode-mcp-server as stdio subprocess and verifies the basic MCP handshake (initialize → tools/list → tools/call search_api) works end-to-end. Uses GRN_MCP_SPEC_DIR fixture to avoid docs portal dep.

Catches regressions in:

- Protocol-level serialization (tool schemas, JSON-RPC framing)
- Tool registration (all 8 tools must be exposed)
- Startup sequence (initialize_index must complete before tools/call)

Runs after unit tests + ruff in the PR workflow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Previously `{"status": "DELETING"}` rendered as `**status**: DELETING`; markdown bold on a lone key is noisy. Now single-field dicts render as plain `status: DELETING`. Multi-field responses keep bold for key emphasis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

### Bug 1: "Invalid kube-config: expected key current-context"

VKS GET /v1/clusters/{id}/kubeconfig returns a JSON object ClusterKubeConfigDto:

  {"kubeConfig": "<yaml string>", "status": "ACTIVE", ...}

k8s_client_cache was calling get_raw() and treating the whole JSON as the kubeconfig YAML. yaml.safe_load happily parsed it (JSON is valid YAML) into a dict with keys like 'kubeConfig', 'status', 'expirationAt', and then the kubernetes library choked on the missing 'current-context'.

Fix: use get() to parse JSON, check status, extract kubeConfig field, yaml.safe_load THAT string. Clear errors for NONE/CREATING/ERROR so the caller knows to request a kubeconfig first.

### Bug 2: list_k8s_resources / manage_k8s_resource require api_version

For common built-in kinds (Pod, Deployment, Service, PVC, ...) the api_version is well-known. Requiring users to guess it blocks the happy path.

Now api_version is optional: if unset, look up the kind in a built-in COMMON_API_VERSIONS map (31 kinds). Custom resources still need explicit api_version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
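Both fixes can be illustrated with a short stdlib-only sketch. The envelope fields `kubeConfig` and `status`, the `ACTIVE` status, and the `COMMON_API_VERSIONS` name come from the commit message. The function names, the exception type, treating every non-`ACTIVE` status as an error, and the five-kind subset of the map are assumptions. The final `yaml.safe_load` step on the extracted string is omitted so the example needs no third-party package.

```python
import json
from typing import Optional


class KubeconfigNotReady(Exception):
    """Raised when the envelope does not carry a usable kubeconfig."""


def extract_kubeconfig(response_text: str) -> str:
    """Unwrap the JSON envelope and return only the kubeconfig YAML string."""
    payload = json.loads(response_text)
    status = payload.get("status")
    if status != "ACTIVE":  # assumption: only ACTIVE envelopes are usable
        raise KubeconfigNotReady(
            f"kubeconfig status is {status!r}; request a kubeconfig first"
        )
    kubeconfig = payload.get("kubeConfig")
    if not kubeconfig:
        raise KubeconfigNotReady("response has no kubeConfig field")
    return kubeconfig


# Illustrative subset; the commit message says the real map covers 31 kinds.
COMMON_API_VERSIONS = {
    "Pod": "v1",
    "Service": "v1",
    "PersistentVolumeClaim": "v1",
    "Deployment": "apps/v1",
    "Ingress": "networking.k8s.io/v1",
}


def resolve_api_version(kind: str, api_version: Optional[str] = None) -> str:
    """An explicit api_version wins; otherwise fall back to the built-in map."""
    if api_version:
        return api_version
    try:
        return COMMON_API_VERSIONS[kind]
    except KeyError:
        raise ValueError(f"api_version is required for kind {kind!r}") from None
```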

Previously list_k8s_resources returned only name/namespace/labels/annotations, so AI had to call manage_k8s_resource per-item to see if pods were Running, PVCs Bound, deployments rolled out, etc. Slow and token-heavy for large namespaces.

Now each summary carries a compact status_summary string:

- Pod: "Running (ready 2/2, restarts 0)"
- Deployment/StatefulSet/ReplicaSet: "3/3 ready"
- DaemonSet: "3/3 ready"
- Service: "LoadBalancer 10.0.0.1 → 1.2.3.4"
- PersistentVolumeClaim: "Bound (10Gi)"
- Node: "Ready (v1.29.0)"
- Job: "active=0 succeeded=1 failed=0"
- Ingress: "1.2.3.4" or "no address"
- Unknown kind: falls back to status.phase

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
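A sketch of how a few of these summaries might be derived is shown below. The output formats mirror the list above, and the fields read (`containerStatuses`, `readyReplicas`, `spec.replicas`, `status.capacity.storage`, `status.phase`) are standard Kubernetes object fields. Taking plain dicts as input, the function name, and the placeholder strings for missing data are assumptions; the remaining kinds are left out for brevity.

```python
def status_summary(resource: dict) -> str:
    """Build a one-line status string from a Kubernetes object given as a dict."""
    kind = resource.get("kind", "")
    spec = resource.get("spec") or {}
    status = resource.get("status") or {}

    if kind == "Pod":
        containers = status.get("containerStatuses") or []
        ready = sum(1 for c in containers if c.get("ready"))
        restarts = sum(c.get("restartCount", 0) for c in containers)
        phase = status.get("phase", "Unknown")
        return f"{phase} (ready {ready}/{len(containers)}, restarts {restarts})"

    if kind in ("Deployment", "StatefulSet", "ReplicaSet"):
        return f"{status.get('readyReplicas', 0)}/{spec.get('replicas', 0)} ready"

    if kind == "PersistentVolumeClaim":
        capacity = (status.get("capacity") or {}).get("storage", "?")
        return f"{status.get('phase', 'Unknown')} ({capacity})"

    # Unknown kinds fall back to status.phase, as in the list above.
    return str(status.get("phase", "unknown"))
```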

…vents

Per the design spec, only Secret reads should require the sensitive-data flag. get_pod_logs and get_k8s_events had been pulled onto the same gate by scope creep, blocking common debug workflows like "what's in the logs?" and "why is this pod pending?".

- Pod logs and events are routine debug reads, similar to listing resources, so they get no stricter guard than list_k8s_resources itself.
- Apps should not log secrets; if they do, that's an app-layer bug.
- Secrets remain guarded (manage_k8s_resource read on kind=Secret).

Docs (README, CLAUDE.md, --allow-sensitive-data-access help text) already scoped the flag to Secrets, so no doc changes are needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…veat

Output previously showed `namespace: null` when listing cross-namespace, which can read as "default namespace" or "unknown". Now shows `namespace: "all namespaces"` explicitly when scope is cluster-wide.

Added docstring notes:

- Explicit hint that leaving namespace empty lists all namespaces
- Warning that `status.phase != Running` misses CrashLoopBackOff pods (they keep phase=Running). Point AI at status_summary for reliable unhealthy-pod detection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Previously apply_yaml required yaml_path (absolute server-local path) and a namespace. That breaks when:

- MCP client and server run on different machines (common case: user's YAML isn't on the server)
- The manifest already declares its own namespaces (no need to override)

Now:

- yaml_content (inline string) or yaml_path: pick one
- namespace optional; defaults to "default" for resources without one

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

VKS previously used 0-based pagination (page=0 = first page). VNG Cloud team is standardizing to 1-based across all products, so VKS will be updated to match.

- CLAUDE.md: drop VKS-specific 0-based note; state the cross-product 1-based convention
- call_api tool description: add pagination hint so the AI uses page=1 and knows what to do when the API returns 400 Page/size invalid

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

AI queries often use cloud-generic terms (VPC, instance, firewall) but VNG Cloud specs use product-specific names (network, server, secgroup). Stemming only handled plurals; synonyms need explicit mapping.

Added small VNG-specific synonym map:

- vpc → network
- instance → server
- firewall → secgroup, security
- pvc → persistentvolumeclaim
- k8s → kubernetes
- lb ↔ loadbalancer

AND semantics preserved: each query term must match some variant of itself; synonyms expand what counts as "match" for a single term, not what counts as a separate term.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
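The "AND semantics preserved" point can be made concrete with a small sketch: every query term must still match, but a term is satisfied by any of its variants. The synonym pairs come from the commit message; the function names and whole-word matching against the text are assumptions.

```python
import re

# Pairs from the commit message; "lb" and "loadbalancer" map both ways.
SYNONYMS = {
    "vpc": {"network"},
    "instance": {"server"},
    "firewall": {"secgroup", "security"},
    "pvc": {"persistentvolumeclaim"},
    "k8s": {"kubernetes"},
    "lb": {"loadbalancer"},
    "loadbalancer": {"lb"},
}


def variants(term: str) -> set[str]:
    """The term itself plus any mapped synonyms."""
    return {term} | SYNONYMS.get(term, set())


def matches_all(query: str, text: str) -> bool:
    """AND match: each query term needs at least one variant present in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return all(variants(term) & words for term in query.lower().split())
```

Here `matches_all("vpc list", "List network")` is true because "vpc" is satisfied by "network", while `matches_all("vpc firewall", "List network")` is false because no variant of "firewall" appears.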

…ders

VNG Cloud APIs embed the account's project UUID in many URL paths. AI previously had to call /v1/projects manually, extract projectId, then substitute, adding an extra round-trip to every workflow.

New ProjectContext lazily fetches and caches the first project from vServer /v1/projects on startup. call_api now swaps {projectId} and {project_id} in paths transparently before making the request.

- project_context.py: async, thread-safe, in-memory cache
- api_caller.call_api: substitutes placeholder when project_context is provided; falls through unchanged otherwise
- server.py: wires a shared ProjectContext into call_api_tool
- Tool description tells the AI to leave placeholders in the path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

greenode-cli v1.3.x now saves project_id per profile in ~/.greenode/config and supports GRN_DEFAULT_PROJECT_ID env override. MCP reads from the same source, so users who ran 'grn configure' get zero-latency project_id resolution with no vServer /v1/projects call at all.

Resolution order:

1. GRN_DEFAULT_PROJECT_ID env var
2. ~/.greenode/config [profile] project_id field
3. ProjectContext API fetch (existing fallback)
4. Clear error directing user to 'grn configure'

Also fix a latent bug: config.py was reading non-default profiles from section "[<name>]", but greenode-cli writes them as "[profile <name>]" per AWS convention. Now aligned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
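The resolution order and the section-name fix might be sketched as follows. The env var name, the `project_id` key, the `[profile <name>]` section format, and the two placeholders come from the commit messages. The function names, the `[default]` section name for the default profile (borrowed from the AWS convention), and the error type are assumptions. Step 3, the API fetch, is omitted from this sketch.

```python
import configparser


def resolve_project_id(env: dict[str, str], config_text: str,
                       profile: str = "default") -> str:
    """Env var first, then the profile's config section, then a clear error."""
    from_env = env.get("GRN_DEFAULT_PROJECT_ID")
    if from_env:
        return from_env

    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Non-default profiles live under "[profile <name>]", not "[<name>]".
    section = "default" if profile == "default" else f"profile {profile}"
    if parser.has_option(section, "project_id"):
        return parser.get(section, "project_id")

    raise LookupError("project_id is not configured; run 'grn configure'")


def substitute_project_id(path: str, project_id: str) -> str:
    """Replace both placeholder spellings accepted in request paths."""
    return path.replace("{projectId}", project_id).replace("{project_id}", project_id)
```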

ProjectContext was added to auto-fetch project_id from vServer /v1/projects when not configured. Now that greenode-cli v1.3.x writes project_id to ~/.greenode/config (and supports GRN_DEFAULT_PROJECT_ID env override), the API fallback is redundant with the same logic in the CLI wizard.

Simpler is better:

- One source of truth (config) instead of two
- Clear error message directing users to `grn configure` beats silent background fetching; users learn the setup step faster
- Removes ~280 LoC (module + tests)
- Capability layer stays thin

Deleted:

- greennode/greenode_mcp_server/project_context.py
- tests/test_project_context.py

Modified:

- api_caller.call_api: drop project_context param, error when config lacks project_id
- server.py: remove ProjectContext wiring
- call_api tool description: reflect single-source behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The MCP server covers all GreenNode products, not just VKS. The config type shouldn't imply a single product. Mirror the earlier rename of VksClient to GreenodeClient.

- config.py: class name + "VKS configuration" docstrings → "GreenNode configuration"
- RegionEndpoints: "Endpoints for a single VKS region" → "Service endpoints for a single GreenNode region"
- auth.py, client.py, api_caller.py, k8s_handler.py: import + type annotations updated

No behavior change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Three tweaks mirroring patterns in Cloudflare's production MCP servers:

1. raw=True: when set, call_api returns the full JSON response instead of the 6-column markdown table. AI uses this when it needs fields the table hides (e.g. subnet counts, tag maps) or wants to transform the data itself.
2. MAX_RESPONSE_BYTES = 800,000: matches Cloudflare's graphql tool guard. Responses larger than this return an actionable error asking the caller to paginate rather than silently truncating.
3. MAX_LIST_ROWS 30 → 100: matches Cloudflare's logpush default. Large enough to cover most list operations without requiring pagination, small enough to stay well inside the size cap.

Deliberately NOT added (after reviewing Cloudflare's codebase):

- fields / projection param (no Cloudflare tool has one)
- response caching (Cloudflare relies on backend/CDN)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
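The size guard and the `raw` switch could be sketched like this. The `MAX_RESPONSE_BYTES` value and the `raw=True` behavior come from the commit message. The function names, the exception type, the error wording, and the pluggable `summarize` callback standing in for the markdown table renderer are assumptions.

```python
import json
from typing import Any, Callable

MAX_RESPONSE_BYTES = 800_000


class ResponseTooLarge(Exception):
    """Raised instead of silently truncating an oversized response."""


def check_response_size(body: bytes, limit: int = MAX_RESPONSE_BYTES) -> bytes:
    """Reject oversized bodies with a message that tells the caller what to do."""
    if len(body) > limit:
        raise ResponseTooLarge(
            f"Response is {len(body)} bytes (limit {limit}); "
            "narrow the query or paginate."
        )
    return body


def render(payload: Any, raw: bool = False,
           summarize: Callable[[Any], str] = str) -> str:
    """raw=True returns the full JSON; otherwise defer to the compact renderer."""
    if raw:
        return json.dumps(payload, indent=2)
    return summarize(payload)
```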

Claude Code and other LLM clients sometimes shorten long IDs with ellipsis (e.g. 'net-05934e2d...') when re-rendering tool output into tables for readability. That makes the value unusable: the user can't copy/paste it into the next call.

SERVER_INSTRUCTIONS now explicitly tells the LLM: never truncate IDs, UUIDs, names, or certificate data. Prefer vertical key/value layout if the table gets wide.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Full audit pass: every doc in the repo touched now reflects current behavior, with stale bits removed:

- docs/DEVELOPMENT.md: replaced every `vks-mcp-server` reference (was copy-pasted from the old VKS-only era), added --refresh-specs / --offline flags, GRN_DEFAULT_PROJECT_ID env var, spec registry section, build-time URL bake explanation, DOCS_PORTAL_URL GitHub variable
- src/greenode-mcp-server/README.md: documented call_api `raw=True` param, path placeholder resolution, full parameter table, profile behavior; dropped "pip install grncli" (CLI is Go now)
- README.md (root): GRN_DEFAULT_PROJECT_ID in credential setup; corrected sensitive-data claim (only Secret reads, not pod logs/events)
- CLAUDE.md: added API quirks (list wrapper keys, placeholders, COMMON_API_VERSIONS, kubeconfig envelope), expanded security rules (response size cap, row cap, pod logs/events ungated), updated test count (~175), linked MCP protocol smoke script
- CHANGELOG.md v0.4.0: fleshed out the stub section with every shipped feature + fix + breaking change

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


tytv2 and others added 8 commits April 18, 2026 10:53

github-advanced-security AI found potential problems Apr 18, 2026


Comment thread src/greenode-mcp-server/tests/test_status_summary.py Fixed

tytv2 and others added 13 commits April 18, 2026 15:56

Potential fix for pull request finding 'CodeQL / Incomplete URL substring sanitization' (0663b1a)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

tvt286 merged commit a97856e into main Apr 19, 2026
2 of 6 checks passed

tvt286 merged 21 commits into main from feat/dynamic-spec-registry

2 participants
