provider update fails for legacy records with type > 64 chars #1347

@latenighthackathon

Summary

openshell provider update <name> --credential KEY fails with provider.type exceeds maximum length (N > MAX_PROVIDER_TYPE_LEN) whenever the existing provider record's stored type field exceeds the current MAX_PROVIDER_TYPE_LEN (64 bytes), even though the caller is only mutating credentials. The provider becomes impossible to update: the only recovery is provider delete followed by recreate, which silently drops any provider.config entries on the record that the caller is unaware of.

Reported by @KodeDaemon in the NVIDIA Developer Discord, surfaced when running nemoclaw update to upgrade from 0.0.38 to 0.0.39.

Reproduction

nemoclaw update runs the maintained installer, which post-install calls openshell provider update inference --credential ... against the existing inference provider. With a stored 79-character type value on that record, the gRPC call fails and the installer exits non-zero:

$ nemoclaw update
  Running maintained NemoClaw installer...
  ...
Error:   × status: InvalidArgument, message: "provider.type exceeds maximum length
  │ (79 > 64)", details: [], metadata: MetadataMap { headers: {"content-type":
  │ "application/grpc", "date": "Tue, 12 May 2026 20:44:53 GMT"} }
  Installer failed with exit 1.

openshell provider list confirms the offending row carries an oversized type value:

$ openshell provider list
NAME                  TYPE                                                                              CREDENTIAL_KEYS  CONFIG_KEYS
inference             xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  1                2

The 79-character type value was written to the gateway store before the current validator rejected it (or via a write path that bypassed normalization, see "Root cause" below). Every subsequent provider update against that name now fails, regardless of whether the caller touches type.

Root cause

crates/openshell-server/src/grpc/provider.rs:153-163 rebuilds the merged Provider from existing.r#type and runs the full validate_provider_fields check over it:

let updated = Provider {
    metadata: existing.metadata,
    r#type: existing.r#type,                   // legacy value, untouched
    credentials: merge_map(existing.credentials, provider.credentials),
    config: merge_map(existing.config, provider.config),
};
validate_provider_fields(&updated)?;           // re-checks immutable type length

The CLI's provider update sends r#type: "" (crates/openshell-cli/src/run.rs:3692) so the existing value is preserved without modification, but the validator still measures it. validate_provider_fields was added in #145 and split into the current module in #777; nothing in the update handler distinguishes between fields the caller is mutating and fields carried forward from the existing record.
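A minimal, self-contained reduction of the failing path (the Provider struct is trimmed to the one relevant field, and the exact limit logic inside validate_provider_fields is an assumption based on the error message):

```rust
// Hypothetical reduction: the update handler carries existing.r#type forward
// into the merged record, then re-runs the full field validator over it.
const MAX_PROVIDER_TYPE_LEN: usize = 64;

struct Provider {
    r#type: String,
}

fn validate_provider_fields(p: &Provider) -> Result<(), String> {
    if p.r#type.len() > MAX_PROVIDER_TYPE_LEN {
        return Err(format!(
            "provider.type exceeds maximum length ({} > {})",
            p.r#type.len(),
            MAX_PROVIDER_TYPE_LEN
        ));
    }
    Ok(())
}

fn main() {
    // Legacy record: a 79-byte type written before the validator existed.
    let existing = Provider { r#type: "x".repeat(79) };

    // The caller sends r#type: "" (mutating only credentials), but the merge
    // preserves the legacy value, and the validator re-measures it anyway.
    let updated = Provider { r#type: existing.r#type };
    let err = validate_provider_fields(&updated).unwrap_err();
    assert_eq!(err, "provider.type exceeds maximum length (79 > 64)");
    println!("{err}");
}
```

This reproduces the InvalidArgument message from the transcript above: the failure depends only on the stored value, not on anything in the update request.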

Two paths put oversized type values into the store:

  1. Records that predate the field-length validator (fix(server): add field-level size limits to sandbox and provider creation #145).
  2. Write paths that bypass normalize_provider_type. The TUI's spawn_create_provider at crates/openshell-tui/src/lib.rs:1563 forwards the form's provider_type field directly to the CreateProviderRequest without going through the CLI's normalize_provider_type allowlist.
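For path 2, a sketch of what an allowlist normalizer in the spirit of the CLI's normalize_provider_type might look like; the specific allowlist entries and the trim/lowercase behavior here are assumptions, not the actual implementation:

```rust
// Hypothetical allowlist normalizer. Any write path that skips a step like
// this (e.g. the TUI's spawn_create_provider forwarding the raw form field)
// can persist an arbitrary-length type string.
fn normalize_provider_type(raw: &str) -> Result<String, String> {
    // Assumed allowlist entries, for illustration only.
    const ALLOWED: &[&str] = &["inference", "embedding"];

    let normalized = raw.trim().to_ascii_lowercase();
    if ALLOWED.contains(&normalized.as_str()) {
        Ok(normalized)
    } else {
        Err(format!("unknown provider type: {raw:?}"))
    }
}

fn main() {
    // Normalized input passes; an arbitrary 79-byte string is rejected
    // before it can ever reach the store.
    assert_eq!(normalize_provider_type(" Inference ").unwrap(), "inference");
    assert!(normalize_provider_type(&"x".repeat(79)).is_err());
    println!("ok");
}
```

Routing the TUI's CreateProviderRequest through the same normalizer would close the second write path, independent of the validator fix below.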

Workaround

openshell provider list                 # find the row with the oversized TYPE column
openshell provider delete <name>        # removes the record (and its provider.config entries)
nemoclaw update                         # re-run the installer; it recreates the provider with a normalized type

Users lose any provider.config entries on the deleted record that NemoClaw does not re-supply on recreate.

Suggested fix

Split the validator into a create-time variant (full check including immutable name/type) and an update-time variant that validates only the fields the caller is mutating (credentials, config). Immutable fields carried forward from existing were valid at creation time, so the update path should trust them rather than re-measure them against current limits.
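A sketch of the suggested split, assuming illustrative names and an assumed credential-key limit (only the shape of the create/update division comes from this issue):

```rust
// Hypothetical create-time vs update-time validator split.
use std::collections::HashMap;

const MAX_PROVIDER_TYPE_LEN: usize = 64;
const MAX_CREDENTIAL_KEY_LEN: usize = 64; // assumed limit

struct Provider {
    r#type: String,
    credentials: HashMap<String, String>,
}

// Create path: full check, including the immutable type field.
fn validate_provider_create(p: &Provider) -> Result<(), String> {
    if p.r#type.len() > MAX_PROVIDER_TYPE_LEN {
        return Err(format!(
            "provider.type exceeds maximum length ({} > {})",
            p.r#type.len(),
            MAX_PROVIDER_TYPE_LEN
        ));
    }
    validate_provider_update(p)
}

// Update path: validate only caller-mutable fields; trust the carried-forward
// immutable type, which was valid under the limits at creation time.
fn validate_provider_update(p: &Provider) -> Result<(), String> {
    for key in p.credentials.keys() {
        if key.len() > MAX_CREDENTIAL_KEY_LEN {
            return Err(format!("credential key too long: {}", key.len()));
        }
    }
    Ok(())
}

fn main() {
    // Legacy record with an oversized type: update succeeds, create would not.
    let legacy = Provider {
        r#type: "x".repeat(79),
        credentials: HashMap::from([("KEY".to_string(), "v".to_string())]),
    };
    assert!(validate_provider_update(&legacy).is_ok());
    assert!(validate_provider_create(&legacy).is_err());
    println!("update ok, create rejected");
}
```

With this split, a credentials-only update against the legacy record goes through, while new records are still held to the current limits.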

Fix proposed in #1350.

Metadata

Labels: state:triage-needed (Opened without agent diagnostics and needs triage)
Assignees: none
Milestone: none