Skip to content

feat: v1.3 advanced features + WebSocket transport + batch command#12

Merged
rodaddy merged 6 commits intomainfrom
feat/v1.3-advanced-features
Mar 10, 2026
Merged

feat: v1.3 advanced features + WebSocket transport + batch command#12
rodaddy merged 6 commits intomainfrom
feat/v1.3-advanced-features

Conversation

@rodaddy
Copy link
Copy Markdown
Owner

@rodaddy rodaddy commented Mar 9, 2026

Summary

  • v1.3 Advanced Features (Phases 15-19): Schema caching with drift detection, access control with allow/block lists, gateway resilience with circuit breaker, output formats (table/yaml/csv/ndjson), skill auto-regeneration
  • WebSocket transport: Persistent connections via SDK's WebSocketClientTransport with stdio fallback
  • Batch command: mcp2cli batch reads NDJSON from stdin for multi-call execution (sequential or --parallel)
  • Security hardening: Path traversal protection, ReDoS mitigation, state validation, YAML key escaping, format exhaustiveness guard
  • Examples: Real generated skills for qmd/n8n/vaultwarden-secrets, example configs for all backend types
  • Test isolation: Fixed daemon tests leaking cache artifacts

Stats

Metric Value
Tests 703 pass, 0 fail
Assertions 1,606
Source files 81
TypeScript errors 0
New tests 252
Lines added ~6,400

Test plan

  • 703 unit/integration tests passing
  • 0 TypeScript errors (strict mode)
  • 4-agent verification swarm (code review, security, quality, coverage)
  • 11 real-world integration tests against live MCP servers (qmd, n8n, vaultwarden-secrets)
  • CI pipeline (typecheck + tests + build) -- triggered by this PR

🤖 Generated with Claude Code

rodaddy and others added 5 commits March 9, 2026 18:44
…ay resilience, output formats, skill auto-regen

Phase 15 - Schema Intelligence (ADV-01, ADV-02):
- Schema caching with 24hr TTL at ~/.cache/mcp2cli/schemas/
- SHA-256 drift detection with per-tool change alerts
- Cache clear/status commands, --fresh bypass flag

Phase 16 - Access Control & Discovery (ADV-05, INFRA-03):
- allowTools/blockTools glob patterns in services.json
- Cross-service tool search via `mcp2cli grep`
- TOOL_BLOCKED errors for policy-restricted tools

Phase 17 - Gateway Resilience (INFRA-01, INFRA-02):
- HTTP gateway with stdio fallback config
- Circuit breaker (5 failures, 60s cooldown, disk-persisted)

Phase 18 - Output Formats (ADV-03, ADV-04):
- --format table/yaml/csv/ndjson alongside default json
- Errors always JSON regardless of format flag

Phase 19 - Skill Auto-Regeneration (ADV-06):
- Drift-triggered skill file regeneration
- Manual section preservation via MANUAL:START/END markers
- --diff preview flag, access control integration

Security hardening from agent swarm review:
- Path traversal protection on cache/circuit-breaker file paths
- Input validation on schema command service/tool names
- ReDoS mitigation in glob patterns (collapse wildcards, length guard, cache)
- Circuit breaker state enum validation
- YAML key escaping for injection prevention
- Exhaustiveness guard on format switch

678 tests, 0 failures, 0 TypeScript errors across 79 source files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tion

Examples:
- examples/services-basic.json -- single service config
- examples/services-multi.json -- multi-service with access control
- examples/services-http-fallback.json -- HTTP gateway with stdio fallback
- examples/skills/qmd/ -- real generated skill for QMD (6 tools, 247 tokens)
- examples/skills/n8n/ -- real generated skill for n8n (20 tools, 467 tokens)
- examples/skills/vaultwarden-secrets/ -- real generated skill (11 tools, 328 tokens)

Test isolation:
- Fix 4 daemon test files leaking cache artifacts into ~/.cache/mcp2cli/
- All tests now use isolated MCP2CLI_CACHE_DIR via mkdtemp

README:
- Add description fields to all example config blocks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WebSocket transport:
- New "websocket" backend type in services.json config
- Uses SDK's built-in WebSocketClientTransport (persistent connection)
- Fallback to stdio with circuit breaker (same as HTTP)
- Wired into daemon pool, schema, service-help, tool-call, generate-skills

Batch command:
- `mcp2cli batch` reads NDJSON tool calls from stdin
- Executes sequentially by default, --parallel for concurrent execution
- Outputs NDJSON results with service.tool correlation
- Individual errors don't abort the batch

703 tests, 0 failures, 0 TypeScript errors across 81 source files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…solution failure

The barrel re-export from connection/index.ts caused a cascading module
evaluation failure in CI when the SDK's WebSocket module couldn't load.
Direct imports from websocket-transport.ts avoid this -- callers only
load the module when a websocket backend is actually configured.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…kflow

The claude-code-action expands 'sonnet' to 'claude-sonnet-4-6' internally,
which isn't a registered alias in our LiteLLM proxy. Using 'sonnet4.6'
and 'opus4.6' directly matches the LiteLLM model registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 10, 2026

PR Review in Progress

Tasks

  • Gather context
  • Review security-sensitive files (access control, path traversal, input handling)
  • Review resource management (WebSocket, circuit breaker, connection pool)
  • Review batch command (stdin handling)
  • Review cache storage (file system operations)
  • Review TypeScript strictness and error handling
  • Post findings

View job run

@rodaddy
Copy link
Copy Markdown
Owner Author

rodaddy commented Mar 10, 2026

@claude re-review

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 10, 2026

Claude finished @rodaddy's task —— View job


PR Review

Tasks

  • Gather context
  • Review security-sensitive files (access control, path traversal, input handling)
  • Review resource management (WebSocket, circuit breaker, connection pool)
  • Review batch command (stdin handling)
  • Review cache storage (file system operations)
  • Review TypeScript strictness and error handling
  • Post findings

Summary

Overall this is a well-structured PR with good security awareness (path traversal guards, atomic writes, ReDoS mitigation). Below are findings ranging from bugs to hardening opportunities.


Bugs / Logic Errors

1. Path traversal guard is insufficientsrc/cache/storage.ts:15, src/resilience/circuit-breaker.ts:33

The current check only blocks /, \, and ..:

if (service.includes('/') || service.includes('\\') || service.includes('..')) {

A service name like %2F (URL-encoded slash), a null byte (\0), or OS-level names like CON/NUL on Windows could still cause problems. More critically, since this is used in join(dir, ${service}.json), a name with no path separators but special shell characters (e.g., backtick, $()) won't affect the filesystem path itself — but you should use an allowlist instead of a blocklist:

if (!/^[a-zA-Z0-9_\-]+$/.test(service)) {
  throw new Error(`Invalid service name: "${service}"`);
}

2. Uncleaned timer in WebSocket transportsrc/connection/websocket-transport.ts:55-67

rejectAfter() creates a setTimeout that is never cleared when client.connect() wins the race. This leaks the timer for the full timeout duration (30 seconds by default). Fix:

function rejectAfter(ms: number, url: string): Promise<never> {
  let timer: ReturnType<typeof setTimeout>;
  return Object.assign(
    new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(...), ms);
    }),
    { cancel: () => clearTimeout(timer) }
  );
}

Or wrap the entire race in a helper that clears the timeout on resolution.

3. Health-check timer also leakssrc/daemon/pool.ts:317-325

The isHealthy() Promise.race has the same uncleaned timeout issue. The timeout setTimeout fires even after listTools() resolves, holding a reference until it fires.

4. parseBatchInput accepts unbounded inputsrc/cli/commands/batch.ts:122-128

readStdin() buffers the entire stdin into memory with no size limit. A malicious or misconfigured pipe could send gigabytes. Add a reasonable max (e.g. 10MB):

const MAX_BATCH_BYTES = 10 * 1024 * 1024;
let totalBytes = 0;
for await (const chunk of process.stdin) {
  totalBytes += chunk.length;
  if (totalBytes > MAX_BATCH_BYTES) throw new Error("Batch input too large");
  chunks.push(chunk as Buffer);
}

5. regexCache is a module-level unbounded Mapsrc/access/filter.ts:8

Patterns come from config, so this is low risk in practice, but if patterns change at runtime the cache grows without bound. Consider a small LRU cap or clear on config reload.

6. batch.ts parallel mode has no concurrency limitsrc/cli/commands/batch.ts:112-117

Promise.all(specs.map(executeSingle)) spawns all calls simultaneously. For large batches this could overload the daemon or downstream MCP servers. Consider chunking with p-limit or a simple semaphore.


Security Issues

7. Service name used in log output without sanitization

src/daemon/pool.ts:134 logs serviceName directly. While this comes from config (trusted), the service name also appears in error messages returned to callers (e.g. src/daemon/pool.ts:88). If service names could ever come from user input in future, ensure they're treated as data, not format strings.

8. batch.ts: service and tool fields from stdin appear in log/output without length limitsrc/cli/commands/batch.ts:40-50

parsed.service and parsed.tool from untrusted stdin flow directly into BatchResult that's printed to stdout. Very long values (e.g. 1MB service name) won't break anything today but would bloat output files. Consider capping field lengths.

9. YAML key escaping covers values but not all key edge cases

src/format/yaml.ts:43-48 uses needsQuoting() for keys via formatKey(), which is correct. However, the YAML_SPECIAL_CHARS regex at line 10 uses a character class that includes > but misses - (dash at start of line is a YAML list indicator). Keys starting with - would produce invalid YAML. E.g., key -foo would render as -foo: value which YAML parsers interpret as a list item, not a mapping key.


Resource Leaks

10. WebSocket connection leaked when pool rejects at capacitysrc/daemon/pool.ts:86-91

When this.connections.size >= this._maxSize, the code throws before connecting — that's fine. But if connectFn() succeeds and then the .then() callback throws for any other reason, the opened McpConnection is not closed. The .then() / rejection handler at line 131-136 rethrows without closing, so connections from successful sub-calls that encounter secondary failures may leak.

11. Drift hook writes cache even when drift check fails mid-way

In src/daemon/drift-hook.ts:103, writeCache(serviceName, liveSchemas) is called unconditionally at the end. If autoRegenerateSkills throws (it shouldn't, it's wrapped), the write still happens. More importantly: if the live tool list is fetched but the hash step fails for one tool (unlikely with crypto.subtle), the liveSchemas array would be incomplete, and the partial result would be cached. Consider only writing cache after all hashes are confirmed.


Error Handling

12. handleBatch doesn't set non-zero exit code on partial failure

src/cli/commands/batch.ts:134-180 — When some batch entries fail, the function logs errors inline as NDJSON lines but exits with code 0 (the default). Users piping batch output won't know some calls failed without parsing every output line. Consider process.exitCode = 1 if any entry had errors.

13. checkDriftOnConnect swallows all errors silentlysrc/daemon/drift-hook.ts:104-107

This is by design ("fire-and-forget"), but the catch only logs a warning. If autoRegenerateSkills writes partial files before failing, there's no cleanup. This is acceptable given the try/catch in autoRegenerateSkills itself, but worth noting.

14. clearCache ignores non-ENOENT errorssrc/cache/storage.ts:163

The catch in clearCache (line 163) swallows all readdir errors including permission errors, treating them the same as "directory doesn't exist." A permission error silently returns 0 cleared. This could confuse users running mcp2cli cache clear.


TypeScript Strictness

15. as Buffer cast in readStdinsrc/cli/commands/batch.ts:124

chunk from process.stdin async iteration is typed as any in Node types, so casting to Buffer is reasonable, but Buffer.concat expects Buffer[] and Uint8Array chunks from Bun may not be exactly Buffer. Use Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk).

16. as Record<string, unknown> in batch.ts — line 39

JSON.parse() returns any. The code does parsed.service and parsed.tool without first asserting parsed is an object. null is typeof "object", so parsed.service on a JSON null would throw Cannot read properties of null. A check like if (!parsed || typeof parsed !== 'object') before accessing fields would be more robust.

17. formatArrayCsv type narrowingsrc/format/csv.ts:59

if (typeof data[0] !== "object" || data[0] === null) {
  return formatPrimitiveCsv(data);
}
return formatArrayCsv(data as Record<string, unknown>[]);

Only the first element is checked; the rest of the array could contain mixed types (e.g. [{a:1}, "string"]). The cast to Record<string,unknown>[] at line 63 is not fully safe. Same pattern in src/format/table.ts:65.


Minor Observations

  • getCircuitBreakerDir: the comment says "go up one level for sibling" but this is fragile if MCP2CLI_CACHE_DIR doesn't end with schemas. If users point MCP2CLI_CACHE_DIR to a non-schemas path, the circuit breaker dir would land in the wrong place.
  • resolveState doesn't mutate state: correctly pure — but callers (recordFailure) call resolveState before modifying state.state, meaning the effectiveState reflects the old state. This is correct but subtle — a comment explaining the ordering would help.
  • parseExistingTools in diff.ts:44: the table detection using trimmed.startsWith("| Tool |") is fragile against case variations — | TOOL | would not match. Consider .toLowerCase() comparison.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 10, 2026

Claude finished @rodaddy's task —— View job


PR Review

Tasks

  • Gather context
  • Review security-sensitive files (access control, path traversal, input handling)
  • Review resource management (WebSocket, circuit breaker, connection pool)
  • Review batch command and stdin handling
  • Review cache storage and file system operations
  • Review TypeScript strictness and error handling
  • Review format output modules
  • Post findings

Summary

This is a well-structured PR with clear architecture and good security awareness throughout. The code is readable and consistently organized. Findings below range from confirmed bugs to hardening opportunities.


Bugs / Logic Errors

1. Path traversal guard uses blocklist instead of allowlistsrc/cache/storage.ts:15, src/resilience/circuit-breaker.ts:33

The current check blocks specific characters but a name like %2F, \0 (null byte), or platform-reserved names (e.g. CON, NUL on Windows) can bypass it. Since service names are joined into filesystem paths, an allowlist is strictly safer:

if (!/^[a-zA-Z0-9_\-]+$/.test(service)) {
  throw new Error(`Invalid service name: "${service}"`);
}

2. rejectAfter timer is never clearedsrc/connection/websocket-transport.ts:55-67

When client.connect() wins the Promise.race, the 30-second setTimeout inside rejectAfter remains scheduled and fires silently after the timeout elapses. In environments with long-running pools this holds the event loop open unnecessarily. The standard fix is to keep a timer handle and clear it in a .finally() or by wrapping both sides of the race:

async function withTimeout<T>(promise: Promise<T>, ms: number, url: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new ConnectionError(`timed out after ${ms}ms`, `url: ${url}`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

3. isHealthy() has the same uncleaned timersrc/daemon/pool.ts:317-325

The new Promise<never> in isHealthy creates a setTimeout that outlives the Promise.race. If listTools() resolves quickly, the timer still fires after _healthCheckTimeoutMs. Same fix applies.

4. parseBatchInput: no size guard on stdinsrc/cli/commands/batch.ts:122-128

readStdin() buffers all of stdin into memory. A misconfigured pipe or malicious input could send gigabytes before EOF. Add a size limit:

const MAX_BATCH_BYTES = 10 * 1024 * 1024; // 10MB
let totalBytes = 0;
for await (const chunk of process.stdin) {
  totalBytes += (chunk as Buffer).length;
  if (totalBytes > MAX_BATCH_BYTES) {
    throw new Error("Batch input exceeds 10MB limit");
  }
  chunks.push(chunk as Buffer);
}

5. batch.ts parallel mode: no concurrency capsrc/cli/commands/batch.ts:112-117

Promise.all(specs.map(executeSingle)) fires all calls simultaneously. For large batches (hundreds of specs) this can overwhelm the daemon's connection pool. Consider a lightweight semaphore or chunking (e.g. p-limit if already a dependency, or a simple manual queue).

6. parseBatchInput: parsed could be nullsrc/cli/commands/batch.ts:39-43

JSON.parse can return null for the input "null". The check !parsed.service will throw TypeError: Cannot read properties of null before the meaningful validation error is returned. Guard first:

if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
  results.push({ error: `Expected a JSON object on line ${i + 1}`, line: i + 1 });
  continue;
}

Security Issues

7. clearCache silently swallows permission errorssrc/cache/storage.ts:163

The catch block treats all readdir errors (including EACCES, EPERM) as "directory doesn't exist" and returns 0. A user running mcp2cli cache clear who has a permissions issue will see no cleared files and no error — silent failure at a security-relevant boundary. Consider re-throwing errors that are not ENOENT:

} catch (err) {
  if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
  return 0;
}

8. YAML_SPECIAL_CHARS misses leading dashsrc/format/yaml.ts:10

The regex const YAML_SPECIAL_CHARS = /[:#\[\]{}&*!|>'"%@\,?\]/does not include-. A key or value starting with -(e.g.-foo) will not be quoted, but YAML parsers interpret a line starting with - as a sequence item. This can produce invalid YAML when output is fed to downstream tools. A minimal addition: include-in the character class or add a specific start-of-string check inneedsQuoting`:

if (value.startsWith("-") || value.startsWith(".")) return true;

9. regexCache is an unbounded module-level Mapsrc/access/filter.ts:8

In a long-running process where configs are dynamically reloaded, this cache grows without bound. Since service configs are trusted this is low severity, but worth noting if the daemon is ever embedded in a server process.

10. getCircuitBreakerDir: fragile path derivationsrc/resilience/circuit-breaker.ts:57-59

The comment says "go up one level for sibling," but dirname(cacheBase) assumes MCP2CLI_CACHE_DIR always ends with a path component named schemas. If a user sets MCP2CLI_CACHE_DIR=/tmp/mcp-cache, the circuit breaker dir becomes /tmp/circuit-breaker — potentially a surprising location. The two cache systems should either share the same root resolution logic or this assumption should be documented clearly.


Resource / Connection Leaks

11. connectPromise rejection path doesn't close partially-opened connectionssrc/daemon/pool.ts:131-136

The rejection handler for connectPromise deletes the service from pending and rethrows, but if connectFn() succeeded internally (e.g. an HTTP connection opened) and then the .then() callback threw, the opened McpConnection would not be closed. This path is unlikely but worth hardening with a finally clause.

12. drift-hook.ts: writeCache called even when partial hashing failssrc/daemon/drift-hook.ts:103

writeCache(serviceName, liveSchemas) is called unconditionally after the drift comparison. If hashToolSchema rejected for one tool (unlikely given crypto.subtle is robust, but possible under memory pressure), liveSchemas would be a partial array and that partial result would be persisted as the authoritative cache. Consider only writing if all schemas were successfully hashed.


TypeScript / Type Safety

13. as Buffer cast in readStdinsrc/cli/commands/batch.ts:124-125

Node's async iteration over process.stdin yields Buffer | string (or Uint8Array in Bun). The cast chunk as Buffer skips the runtime check. Use Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk as Uint8Array) to be safe across runtimes.

14. Unsafe first-element check for array type discriminationsrc/format/csv.ts:59, src/format/table.ts:65

if (typeof data[0] !== "object" || data[0] === null) {
  return formatPrimitiveCsv(data);
}
return formatArrayCsv(data as Record<string, unknown>[]);

Only data[0] is checked before casting the entire array. Mixed arrays like [{a: 1}, "string", 42] will pass this check and then fail at runtime when formatArrayCsv tries to call Object.keys on a string. If mixed arrays are plausible inputs, use data.every(item => typeof item === "object" && item !== null).

15. as Record<string, unknown> cast on data in YAMLsrc/format/yaml.ts:92

serializeObject(data as Record<string, unknown>, indent) — the check typeof data === "object" passes for Date, RegExp, Map, and Set, all of which have typeof === "object" but won't serialize correctly as plain records. Not a bug for the documented use case (JSON-like data), but worth noting if the formatter is ever used more broadly.


Minor Observations

  • parseExistingTools in diff.ts:44-46: table detection checks trimmed.startsWith("| Tool |") || trimmed.startsWith("| tool |") but still misses | TOOL | or other mixed-case variants. Given this is user-generated content (SKILL.md), a .toLowerCase() comparison would be more robust.

  • batch.ts exits with code 0 on partial failure: When some batch entries fail, results are emitted as NDJSON with success: false but the process exits normally. Callers piping batch output may not parse every line. Consider setting process.exitCode = 1 if any entry fails.

  • resolveState comment note (src/resilience/circuit-breaker.ts:182-183): resolveState is called on the state before state.state is mutated. This means effectiveState reflects the old persisted state, not the incremented failure count's perspective. The logic is correct (you want to know if you were in half-open when the failure came in), but it reads as potentially backwards without the comment. The existing comment is good — no change needed, just flagging.

  • autoRegenerateSkills uses empty inputSchemasrc/generation/auto-regen.ts:96: inputSchema: {} as object is a placeholder. If downstream code ever tries to use these schema objects for more than display, this will silently produce empty schemas. Worth a TODO comment.


@rodaddy rodaddy merged commit d825ae4 into main Mar 10, 2026
3 checks passed
@rodaddy rodaddy deleted the feat/v1.3-advanced-features branch March 10, 2026 00:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant