feat: dynamic OpenAPI spec registry (replace bundled specs) #4
Merged
MCP server now fetches specs from VNG Cloud's public docs portal at startup and caches them under ~/.greenode/mcp-specs/, so new products appear automatically without any server release.
- Add SpecProvider protocol + factory (registry/provider.py, factory.py)
- Add RedoclyPortalProvider scraping docs.api.vngcloud.vn (~906 endpoints across VKS, vServer, vLB, vDB, vMonitor, and more)
- Add LocalDirProvider for dev/test (GRN_MCP_SPEC_DIR env var)
- Add SpecCache with TTL + HTTP conditional GET (registry/cache.py)
- Add load_specs orchestrator with offline/refresh flags (registry/loader.py)
- Integrate into api_index.py via initialize_index()
- New CLI flags: --refresh-specs, --offline
- Bake DEFAULT_DOCS_PORTAL_URL at build time via CI (release.yml)
- Delete bundled specs/ directory
- Update README, CHANGELOG, CLAUDE.md
Breaking: first run requires network to docs.api.vngcloud.vn. Roll back with uvx greenode-mcp-server@0.3.2 if needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
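A minimal sketch of how a provider protocol plus factory could look, assuming a single `fetch_specs()` method that returns specs keyed by product name. The method name, the `make_provider` helper and the file-stem-as-product rule are illustrative assumptions, not the contents of registry/provider.py or factory.py.

```python
import json
from pathlib import Path
from typing import Protocol


class SpecProvider(Protocol):
    """Hypothetical interface: return OpenAPI specs keyed by product name."""

    def fetch_specs(self) -> dict[str, dict]: ...


class LocalDirProvider:
    """Reads every *.json file in a directory; the file stem is used as the product name."""

    def __init__(self, spec_dir: str) -> None:
        self.spec_dir = Path(spec_dir)

    def fetch_specs(self) -> dict[str, dict]:
        specs: dict[str, dict] = {}
        for path in sorted(self.spec_dir.glob("*.json")):
            specs[path.stem] = json.loads(path.read_text(encoding="utf-8"))
        return specs


def make_provider(env: dict[str, str]) -> SpecProvider:
    """Hypothetical factory: choose the local-dir provider when GRN_MCP_SPEC_DIR is set."""
    spec_dir = env.get("GRN_MCP_SPEC_DIR")
    if spec_dir:
        return LocalDirProvider(spec_dir)
    # The portal-backed provider is left out of this sketch.
    raise NotImplementedError("portal provider omitted from this sketch")
```

Callers depend only on the protocol, so a different source can be swapped in inside the factory without touching the code that consumes the specs.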
Cacheless providers (local-dir) now fetch directly without touching the on-disk cache. Previously running with GRN_MCP_SPEC_DIR would pollute ~/.greenode/mcp-specs/ with whatever was in the local dir, then the next production run would read that stale data back.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
VNG Cloud APIs wrap list payloads under various keys (listData, data, results, records) besides 'items'. Previously _format_response only recognized 'items'; other responses fell back to _format_object, which dumps every field of every item, blowing LLM context on large lists (e.g. security groups returned as 11.9k-token raw JSON).
Also cap list output at 30 rows with a footer suggesting pagination.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
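The wrapper-key detection and row cap described above can be sketched roughly as follows. The key list and the 30-row limit come from the commit message; the helper names `extract_list` and `cap_rows` and the footer wording are assumptions for illustration, not the actual `_format_response` code.

```python
# Wrapper keys named in the commit message, checked in this order.
LIST_WRAPPER_KEYS = ("items", "listData", "data", "results", "records")
MAX_LIST_ROWS = 30


def extract_list(payload):
    """Return the wrapped list if the payload looks like a list response, else None."""
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        for key in LIST_WRAPPER_KEYS:
            value = payload.get(key)
            if isinstance(value, list):
                return value
    return None


def cap_rows(rows, limit=MAX_LIST_ROWS):
    """Return at most `limit` rows plus a footer that is empty when nothing was cut."""
    shown = rows[:limit]
    footer = ""
    if len(rows) > limit:
        footer = f"Showing {limit} of {len(rows)} rows; use pagination parameters to see more."
    return shown, footer
```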
Previously search was strict AND-match: "flavor" in product=vks returned empty even though vServer has flavor endpoints. Now falls back:
- Tier 1: AND match, scoped by product
- Tier 2: AND match, all products (when scoped gives 0)
- Tier 3: OR match, scoped by product
- Tier 4: OR match, all products
Also:
- Simple stemming: "clusters" matches "cluster" (trailing -s stripped for words >4 chars)
- Relevance ranking: summary (+3) > path (+2) > description (+1) so strongest matches surface first
- Entry.format() now prefixes [product] so AI sees which product each result belongs to, which is important when fallback returns cross-product matches
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
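The four-tier fallback, stemming rule and ranking weights above could be sketched like this. The `Entry` fields and function names are assumptions for illustration, and substring matching stands in for whatever tokenization the real index uses.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    product: str
    summary: str
    path: str
    description: str = ""


def stem(word):
    """Strip a trailing 's' from words longer than 4 characters."""
    word = word.lower()
    return word[:-1] if len(word) > 4 and word.endswith("s") else word


def score(entry, term):
    """Weight matches: summary (+3) > path (+2) > description (+1)."""
    t = stem(term)
    return (3 * (t in entry.summary.lower())
            + 2 * (t in entry.path.lower())
            + 1 * (t in entry.description.lower()))


def search(entries, query, product=None):
    terms = query.split()
    if not terms:
        return []
    # Tier order: AND scoped, AND all products, OR scoped, OR all products.
    tiers = [(all, product), (all, None), (any, product), (any, None)]
    for combine, scope in tiers:
        hits = []
        for entry in entries:
            if scope is not None and entry.product != scope:
                continue
            scores = [score(entry, t) for t in terms]
            if combine(s > 0 for s in scores):
                hits.append((sum(scores), entry))
        if hits:
            # Stable sort keeps index order among equally scored entries.
            hits.sort(key=lambda pair: pair[0], reverse=True)
            return [entry for _, entry in hits]
    return []
```

Returning as soon as a tier yields results keeps scoped, strict matches ahead of looser ones, so the cross-product fallback only appears when the scoped search is empty.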
Spawns greenode-mcp-server as stdio subprocess and verifies the basic MCP handshake (initialize → tools/list → tools/call search_api) works end-to-end. Uses GRN_MCP_SPEC_DIR fixture to avoid docs portal dep.
Catches regressions in:
- Protocol-level serialization (tool schemas, JSON-RPC framing)
- Tool registration (all 8 tools must be exposed)
- Startup sequence (initialize_index must complete before tools/call)
Runs after unit tests + ruff in the PR workflow.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously `{"status": "DELETING"}` rendered as `**status**: DELETING`
— markdown bold on a lone key is noisy. Now single-field dicts render
as plain `status: DELETING`. Multi-field responses keep bold for key
emphasis.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
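The rendering rule in this commit can be illustrated with a short sketch. The function name `format_object` is a stand-in chosen here; the real formatter may differ in details such as nested values.

```python
def format_object(obj: dict) -> str:
    """Render a flat dict as markdown lines; a lone field is rendered without bold."""
    if len(obj) == 1:
        (key, value), = obj.items()
        return f"{key}: {value}"
    return "\n".join(f"**{key}**: {value}" for key, value in obj.items())
```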
### Bug 1: "Invalid kube-config: expected key current-context"
VKS GET /v1/clusters/{id}/kubeconfig returns a JSON object
ClusterKubeConfigDto:
{"kubeConfig": "<yaml string>", "status": "ACTIVE", ...}
k8s_client_cache was calling get_raw() and treating the whole JSON as
the kubeconfig YAML. yaml.safe_load happily parsed it (JSON is valid
YAML) into a dict with keys like 'kubeConfig', 'status', 'expirationAt'
— then the kubernetes library choked on the missing 'current-context'.
Fix: use get() to parse JSON, check status, extract kubeConfig field,
yaml.safe_load THAT string. Clear errors for NONE/CREATING/ERROR so the
caller knows to request a kubeconfig first.
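A minimal sketch of the envelope handling, using only the standard library. It assumes ACTIVE is the only status that carries a usable kubeconfig (the commit names NONE/CREATING/ERROR as failing states); the function and exception names are hypothetical. The returned string is what would then be passed to `yaml.safe_load`.

```python
import json

USABLE_STATUS = "ACTIVE"  # assumption: the only status with a usable kubeconfig


class KubeconfigNotReady(Exception):
    """Raised when the envelope does not carry a usable kubeconfig."""


def extract_kubeconfig_yaml(response_body: str) -> str:
    """Parse the JSON envelope and return only the embedded kubeconfig YAML string."""
    envelope = json.loads(response_body)
    status = envelope.get("status")
    if status != USABLE_STATUS:
        raise KubeconfigNotReady(
            f"kubeconfig status is {status!r}; request a kubeconfig first "
            f"and retry once it is {USABLE_STATUS}"
        )
    kubeconfig = envelope.get("kubeConfig")
    if not isinstance(kubeconfig, str) or not kubeconfig.strip():
        raise KubeconfigNotReady("response has no kubeConfig field")
    return kubeconfig
```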
### Bug 2: list_k8s_resources / manage_k8s_resource require api_version
For common built-in kinds (Pod, Deployment, Service, PVC, ...) the
api_version is well-known. Requiring users to guess it blocks the
happy path. Now api_version is optional: if unset, look up the kind
in a built-in COMMON_API_VERSIONS map (31 kinds). Custom resources
still need explicit api_version.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously list_k8s_resources returned only name/namespace/labels/annotations, so the AI had to call manage_k8s_resource per item to see if pods were Running, PVCs Bound, deployments rolled out, etc. Slow and token-heavy for large namespaces.
Now each summary carries a compact status_summary string:
- Pod: "Running (ready 2/2, restarts 0)"
- Deployment/StatefulSet/ReplicaSet: "3/3 ready"
- DaemonSet: "3/3 ready"
- Service: "LoadBalancer 10.0.0.1 → 1.2.3.4"
- PersistentVolumeClaim: "Bound (10Gi)"
- Node: "Ready (v1.29.0)"
- Job: "active=0 succeeded=1 failed=0"
- Ingress: "1.2.3.4" or "no address"
- Unknown kind: falls back to status.phase
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
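As a rough illustration of how such summaries can be derived, the sketch below covers Pod, the replica-based workloads and the status.phase fallback, working on plain dicts shaped like Kubernetes API objects. The function names are assumptions; only the output strings follow the formats listed above.

```python
def pod_summary(pod: dict) -> str:
    status = pod.get("status", {})
    containers = status.get("containerStatuses") or []
    ready = sum(1 for c in containers if c.get("ready"))
    restarts = sum(c.get("restartCount", 0) for c in containers)
    phase = status.get("phase", "Unknown")
    return f"{phase} (ready {ready}/{len(containers)}, restarts {restarts})"


def workload_summary(obj: dict) -> str:
    desired = obj.get("spec", {}).get("replicas", 0)
    ready = obj.get("status", {}).get("readyReplicas") or 0
    return f"{ready}/{desired} ready"


def status_summary(obj: dict) -> str:
    kind = obj.get("kind")
    if kind == "Pod":
        return pod_summary(obj)
    if kind in ("Deployment", "StatefulSet", "ReplicaSet"):
        return workload_summary(obj)
    # Unknown kinds fall back to status.phase (empty string when absent).
    return obj.get("status", {}).get("phase", "")
```

Including the ready count and restart total in the Pod string is what lets a crash-looping pod stand out even though its phase still reads Running.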
…vents
Per the design spec, only Secret reads should require the sensitive-data flag. get_pod_logs and get_k8s_events were scope-creeped onto the same gate, blocking common debug workflows like "what's in the logs?" and "why is this pod pending?".
- Pod logs and events are routine debug reads, similar to listing resources, so they get no stricter guard than list_k8s_resources itself.
- Apps should not log secrets; if they do, that's an app-layer bug.
- Secrets remain guarded (manage_k8s_resource read on kind=Secret).
Docs (README, CLAUDE.md, --allow-sensitive-data-access help text) already scoped the flag to Secrets, so no doc changes were needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…veat
Output previously showed `namespace: null` when listing cross-namespace, which can read as "default namespace" or "unknown". Now shows `namespace: "all namespaces"` explicitly when scope is cluster-wide.
Added docstring notes:
- Explicit hint that leaving namespace empty lists all namespaces
- Warning that `status.phase != Running` misses CrashLoopBackOff pods (they keep phase=Running). Point AI at status_summary for reliable unhealthy-pod detection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously apply_yaml required yaml_path (absolute server-local path) and a namespace. That breaks when:
- MCP client and server run on different machines (common case: the user's YAML isn't on the server)
- The manifest already declares its own namespaces (no need to override)
Now:
- yaml_content (inline string) or yaml_path: pick one
- namespace optional; defaults to "default" for resources without one
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
VKS previously used 0-based pagination (page=0 = first page). VNG Cloud team is standardizing to 1-based across all products so VKS will be updated to match.
- CLAUDE.md: drop VKS-specific 0-based note; state the cross-product 1-based convention
- call_api tool description: add pagination hint so the AI uses page=1 and knows what to do when the API returns 400 Page/size invalid
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AI queries often use cloud-generic terms (VPC, instance, firewall) but VNG Cloud specs use product-specific names (network, server, secgroup). Stemming only handled plurals; synonyms need explicit mapping.
Added small VNG-specific synonym map:
- vpc → network
- instance → server
- firewall → secgroup, security
- pvc → persistentvolumeclaim
- k8s → kubernetes
- lb ↔ loadbalancer
AND semantics preserved: each query term must match some variant of itself; synonyms expand what counts as "match" for a single term, not what counts as a separate term.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
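The AND-preserving expansion can be sketched as follows. The synonym pairs are the ones listed in the commit; the helper names and the plain substring matching are simplifying assumptions.

```python
# Synonym pairs from the commit message; lb and loadbalancer map both ways.
SYNONYMS = {
    "vpc": {"network"},
    "instance": {"server"},
    "firewall": {"secgroup", "security"},
    "pvc": {"persistentvolumeclaim"},
    "k8s": {"kubernetes"},
    "lb": {"loadbalancer"},
    "loadbalancer": {"lb"},
}


def variants(term: str) -> set[str]:
    """The term itself plus any mapped synonyms."""
    term = term.lower()
    return {term} | SYNONYMS.get(term, set())


def matches(text: str, query: str) -> bool:
    """AND across query terms, OR across the variants of each single term."""
    text = text.lower()
    return all(any(v in text for v in variants(t)) for t in query.split())
```

For example, the query "vpc delete" still fails against "List networks": "vpc" is satisfied through its synonym, but "delete" has no matching variant.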
…ders
VNG Cloud APIs embed the account's project UUID in many URL paths. AI
previously had to call /v1/projects manually, extract projectId, then
substitute — adding an extra round-trip to every workflow.
New ProjectContext lazily fetches and caches the first project from
vServer /v1/projects on startup. call_api now swaps {projectId} and
{project_id} in paths transparently before making the request.
- project_context.py: async, thread-safe, in-memory cache
- api_caller.call_api: substitutes placeholder when project_context is
provided; falls through unchanged otherwise
- server.py: wires a shared ProjectContext into call_api_tool
- Tool description tells the AI to leave placeholders in the path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
greenode-cli v1.3.x now saves project_id per profile in ~/.greenode/config and supports GRN_DEFAULT_PROJECT_ID env override. MCP reads from the same source, so users who ran 'grn configure' get zero-latency project_id resolution with no vServer /v1/projects call at all.
Resolution order:
1. GRN_DEFAULT_PROJECT_ID env var
2. ~/.greenode/config [profile] project_id field
3. ProjectContext API fetch (existing fallback)
4. Clear error directing user to 'grn configure'
Also fix a latent bug: config.py was reading non-default profiles from section "[<name>]", but greenode-cli writes them as "[profile <name>]" per AWS convention. Now aligned.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ProjectContext was added to auto-fetch project_id from vServer /v1/projects when not configured. Now that greenode-cli v1.3.x writes project_id to ~/.greenode/config (and supports GRN_DEFAULT_PROJECT_ID env override), the API fallback is redundant with the same logic in the CLI wizard.
Simpler is better:
- One source of truth (config) instead of two
- Clear error message directing users to `grn configure` beats silent background fetching; users learn the setup step faster
- Removes ~280 LoC (module + tests)
- Capability layer stays thin
Deleted:
- greennode/greenode_mcp_server/project_context.py
- tests/test_project_context.py
Modified:
- api_caller.call_api: drop project_context param, error when config lacks project_id
- server.py: remove ProjectContext wiring
- call_api tool description: reflect single-source behavior
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
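With the API fallback removed, resolution reduces to env var, then config file, then an error. A minimal sketch using the standard-library `configparser`; it assumes the default profile lives in a `[default]` section (following the AWS convention the earlier commit cites) and takes the config text as an argument to stay self-contained. The function names are hypothetical.

```python
import configparser


def section_name(profile: str) -> str:
    """Non-default profiles are stored as "[profile <name>]"."""
    return "default" if profile == "default" else f"profile {profile}"


def resolve_project_id(env: dict, config_text: str, profile: str = "default") -> str:
    # 1. Environment override wins.
    value = env.get("GRN_DEFAULT_PROJECT_ID")
    if value:
        return value
    # 2. project_id field of the selected profile in the config file.
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = section_name(profile)
    if parser.has_section(section):
        value = parser.get(section, "project_id", fallback=None)
        if value:
            return value
    # 3. No silent fetching: fail with an actionable message.
    raise LookupError("project_id is not configured; run 'grn configure'")
```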
The MCP server covers all GreenNode products, not just VKS. The config type shouldn't imply a single product. Mirror the earlier rename of VksClient to GreenodeClient.
- config.py: class name + "VKS configuration" docstrings → "GreenNode configuration"
- RegionEndpoints: "Endpoints for a single VKS region" → "Service endpoints for a single GreenNode region"
- auth.py, client.py, api_caller.py, k8s_handler.py: import + type annotations updated
No behavior change.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three tweaks mirroring patterns in Cloudflare's production MCP servers:
1. raw=True: when set, call_api returns the full JSON response instead of the 6-column markdown table. AI uses this when it needs fields the table hides (e.g. subnet counts, tag maps) or wants to transform the data itself.
2. MAX_RESPONSE_BYTES = 800,000: matches Cloudflare's graphql tool guard. Responses larger than this return an actionable error asking the caller to paginate rather than silently truncating.
3. MAX_LIST_ROWS 30 → 100: matches Cloudflare's logpush default. Large enough to cover most list operations without requiring pagination, small enough to stay well inside the size cap.
Deliberately NOT added (after reviewing Cloudflare's codebase):
- fields / projection param (no Cloudflare tool has one)
- response caching (Cloudflare relies on backend/CDN)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
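The size guard and the raw switch can be sketched together. The 800,000-byte limit is from the commit; `render_response`, `summarize` and the error wording are assumptions, and `summarize` is only a stand-in for the markdown table formatter.

```python
import json

MAX_RESPONSE_BYTES = 800_000  # size cap named in the commit message


class ResponseTooLargeError(Exception):
    """Raised instead of silently truncating an oversized response."""


def summarize(payload) -> str:
    # Stand-in for the table formatter: show only top-level keys or item count.
    if isinstance(payload, dict):
        return ", ".join(sorted(payload))
    if isinstance(payload, list):
        return f"{len(payload)} item(s)"
    return str(payload)


def render_response(body: bytes, raw: bool = False) -> str:
    """Reject oversized bodies, then return full JSON (raw) or a compact summary."""
    if len(body) > MAX_RESPONSE_BYTES:
        raise ResponseTooLargeError(
            f"Response is {len(body)} bytes (limit {MAX_RESPONSE_BYTES}); "
            "narrow the request or paginate with page/size parameters."
        )
    payload = json.loads(body)
    if raw:
        return json.dumps(payload, indent=2)
    return summarize(payload)
```

Checking the size before parsing means an oversized body is rejected without paying the cost of decoding it.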
Claude Code and other LLM clients sometimes shorten long IDs with ellipsis (e.g. 'net-05934e2d...') when re-rendering tool output into tables for readability. That makes the value unusable: the user can't copy/paste it into the next call.
SERVER_INSTRUCTIONS now explicitly tells the LLM: never truncate IDs, UUIDs, names, or certificate data. Prefer vertical key/value layout if the table gets wide.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Full audit pass: every doc in the repo touched now reflects current behavior, with stale bits removed:
- docs/DEVELOPMENT.md: replaced every `vks-mcp-server` reference (was copy-pasted from the old VKS-only era), added --refresh-specs / --offline flags, GRN_DEFAULT_PROJECT_ID env var, spec registry section, build-time URL bake explanation, DOCS_PORTAL_URL GitHub variable
- src/greenode-mcp-server/README.md: documented call_api `raw=True` param, path placeholder resolution, full parameter table, profile behavior; dropped "pip install grncli" (CLI is Go now)
- README.md (root): GRN_DEFAULT_PROJECT_ID in credential setup; corrected sensitive-data claim (only Secret reads, not pod logs/events)
- CLAUDE.md: added API quirks (list wrapper keys, placeholders, COMMON_API_VERSIONS, kubeconfig envelope), expanded security rules (response size cap, row cap, pod logs/events ungated), updated test count (~175), linked MCP protocol smoke script
- CHANGELOG.md v0.4.0: fleshed out the stub section with every shipped feature + fix + breaking change
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ring sanitization'
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Summary
MCP server now fetches OpenAPI specs from VNG Cloud's public docs portal at startup instead of bundling them in the wheel. New products published on `docs.api.vngcloud.vn` appear automatically on the user's next server restart, with no code release needed.
Architecture
Provider pattern so the source can be swapped with a one-line change.
Cache at `~/.greenode/mcp-specs/` with TTL + HTTP conditional GET. Partial-failure tolerant: one product failing doesn't take down the whole server.
Breaking Changes
- `specs/vks.json` no longer bundled in wheel
- First run requires network access to `docs.api.vngcloud.vn`
- Roll back with `uvx greenode-mcp-server@0.3.2` if needed
New CLI flags
- `--refresh-specs`: force re-download from registry
- `--offline`: skip registry fetch, use cache only
Build-time URL config
Release workflow bakes `DEFAULT_DOCS_PORTAL_URL` at build time via GitHub Actions var `DOCS_PORTAL_URL`. End users cannot override at runtime.
Test Plan
- `registry/` + `_build_info.py`, does NOT contain `specs/`
- `--refresh-specs` and `--offline` flags
- `GRN_MCP_SPEC_DIR` activates LocalDirProvider
- `uvx greenode-mcp-server --help` in Claude Code
Docs Updated
- `src/greenode-mcp-server/README.md`: Spec Registry section + troubleshooting
- `src/greenode-mcp-server/CHANGELOG.md`: v0.4.0 entry with breaking changes
- `README.md` (root): updated repository structure
- `CLAUDE.md`: updated project overview, repo structure, key files table
- `.github/workflows/release.yml`: "Bake docs portal URL" step
🤖 Generated with Claude Code