feat: support JSON-native union inputs#3048
Conversation
Add support for union input types such as `str | float` and `str | float | None`. Unions are restricted to JSON-native members (str, int, float, bool, dict/Any, list[T], None) so request validation happens at the HTTP edge against the OpenAPI schema. Unions involving Path, File, Secret, custom coders, and BaseModel are rejected at build/schema-generation time, and output unions remain unsupported. - pkg/schema: add a recursive InputType model and resolver, emit OpenAPI anyOf for union inputs, and keep multi-variant nullable unions required when no default is supplied - pkg/predict: parse numeric CLI values for unions that accept a number (schemaAcceptsNumber), and fall back to a string member when a numeric parse fails for number-first unions like `float | str` (schemaAcceptsString) - python/cog/_adt: add deterministic union normalisation with strict per-member compatibility (bool not int/float, scalars not dict/Any, list unions validate elements) and anyOf json_type emission - tests: Go unit tests, Python regression tests, and end-to-end txtar integration tests for HTTP, CLI, and list unions - docs: document union inputs and nullable semantics
|
I'm Bonk, and I've done a quick review of your PR. Adds support for JSON-native union input types with recursive Issues found (2 related bugs in CLI union parsing):
Both bugs stem from the Suggested fixes: Add a func schemaAcceptsFloat(s *openapi3.Schema) bool {
if s == nil {
return false
}
if s.Type != nil && s.Type.Is("number") {
return true
}
for _, ref := range s.AnyOf {
if ref.Value != nil && schemaAcceptsFloat(ref.Value) {
return true
}
}
for _, ref := range s.AllOf {
if ref.Value != nil && schemaAcceptsFloat(ref.Value) {
return true
}
}
return false
}In the case propertySchema.Type.Is("integer"):
value, err := strconv.ParseInt(val, 10, 32)
if err != nil {
// See the number case above: fall back to a string
// member for unions such as `int | str`.
if schemaAcceptsString(originalSchema) {
break
}
// Also fall back to a float member for unions such as `int | float`.
if schemaAcceptsFloat(originalSchema) {
if value, err := strconv.ParseFloat(val, 32); err == nil {
float := float32(value)
input[key] = Input{Float: &float}
continue
}
}
return input, err
}
valueInt := int32(value)
input[key] = Input{Int: &valueInt}
continueIn the case schemaAcceptsNumber(originalSchema):
// Union input (anyOf) that includes a numeric member, e.g.
// `str | float`. Parse numeric-looking values as numbers so
// the runtime receives the intended type; otherwise fall
// through to the string member below.
if value, err := strconv.ParseInt(val, 10, 32); err == nil {
valueInt := int32(value)
input[key] = Input{Int: &valueInt}
continue
}
if schemaAcceptsFloat(originalSchema) {
if value, err := strconv.ParseFloat(val, 32); err == nil {
float := float32(value)
input[key] = Input{Float: &float}
continue
}
}Please also add unit tests for |
|
Posted a review comment to PR #3048 identifying 2 related CLI parsing bugs for union inputs:
Both stem from |
Address two related bugs in CLI `-i` parsing of union inputs: - `int | float` resolves to the integer member first, so a fractional value like `1.5` failed ParseInt and errored instead of falling back to the float member. - `str | int` resolves to the string member, then the schemaAcceptsNumber branch parsed `1.5` as a float even though the union only accepts an integer, sending an invalid float. Add a schemaAcceptsFloat helper that matches number members but not integer-only members, and gate float parsing behind it in both the integer branch (with a float fallback) and the schemaAcceptsNumber branch. Add unit tests for `int | float` and `str | int` unions.
|
@markphelps Bonk workflow was cancelled. View workflow run · To retry, trigger Bonk again. |
| | `list[T]` | `{"type": "array", "items": {...}}` | | | ||
| | `Optional[T]` | Type T + not in `required` | Input fields only | | ||
| | `A \| B` / `Union[A, B]` | `{"anyOf": [A, B]}` | Input-only, JSON-native unions only | | ||
| | `A \| None` / `Optional[A]` | Type A + `nullable: true` | Input fields may still be required if no default is supplied | |
There was a problem hiding this comment.
so it's always better to include the default = None if an input filed is optional right? even if we have Optional[A] defined, otherwise the schema might not pick them up as optional?
I don't recall what I did with optional fields previously but I always had default=something, guess it's not a problem with me but just asking
There was a problem hiding this comment.
Good question, and you've actually hit on a real subtlety.
Short answer: Optional[T] / T | None does not need default=None. What it needs is some = Input(...) assignment. The Input(...) is what makes the field omittable; the None default is implicit for optionals.
BUT doing value: Optional[str] (bare, no Input) generates the openAPI schema correctly (not required, nullable: true), but it fails at runtime with no argument with TypeError: missing 1 required positional argument.. so that is actually a preexisting bug that we've been papering over with =Input() which we should fix!
But for now will update the docs so we are consistent everywhere and look into what it will take to actually fix or atleast make it more clear as to why it doesnt work as expected with a better error message or something
The previous table conflated plain single-type optionals with multi-variant nullable unions. A plain Optional[T] / T | None is never placed in required, while a multi-variant union like A | B | None stays required unless a default is supplied. Split these into separate rows and add a runtime caveat: an optional needs a Python-level default (via = Input(...) or default=None) so an omitted value resolves to None; a bare Optional[T] annotation raises TypeError when omitted.
|
LGTM |
The union-inputs feature (#3048) added input union support, but the "Type limitations" section still claimed only Optional[T] was supported, contradicting the new Union section. Scope the limitation correctly: output unions remain unsupported, JSON-native input unions are supported, and input unions of Path/File/Secret/custom-coder/BaseModel members fail at build. Also add the missing Union table-of-contents entry and regenerate llms.txt.
* test: add fuzzing for input type resolution and OpenAPI generation Add fuzz coverage for the schema-gen codepaths exercised by union inputs, which previously had no fuzzing and no validity oracle: - FuzzResolveInputType: feeds arbitrary TypeAnnotation trees through ResolveInputType, then validates that any successfully resolved input type generates an OpenAPI document the build-time validator accepts. - FuzzInputTypeJSONSchema: builds arbitrary InputType trees directly (reaching shapes the resolver never produces) and validates the generated OpenAPI document. Both use an assertValidOpenAPI oracle (the same kin-openapi validator as writeAndValidateSchema), so a union shape that resolves cleanly but emits an invalid schema (e.g. an unsupported `type: null` branch) fails the fuzzer rather than surfacing as a confusing user build error. Make the test:fuzz task auto-discover every Fuzz* target via `go test -list` instead of hardcoding names, so new fuzz tests are picked up automatically. This also surfaced targets the hardcoded list missed (pkg/config's three targets and two parser helpers): 11 total vs the 4 previously run. Manage jq (used for discovery) as a mise tool and simplify the CI fuzz-go job to call the task. * docs: correct union input type support in python.md The union-inputs feature (#3048) added input union support, but the "Type limitations" section still claimed only Optional[T] was supported, contradicting the new Union section. Scope the limitation correctly: output unions remain unsupported, JSON-native input unions are supported, and input unions of Path/File/Secret/custom-coder/BaseModel members fail at build. Also add the missing Union table-of-contents entry and regenerate llms.txt. * test: use testify require in assertValidOpenAPI Replace raw t.Fatalf with require.NoError per the project testing conventions (AGENTS.md).
* origin/main: Bump version to 0.21.0-rc.3 (#3050) test: fuzz schema generation for union inputs + docs fixup (#3049) feat: support JSON-native union inputs (#3048) ci: pin mise version in release workflows (#3046) chore: regen lockfile Bump version to 0.21.0-rc.2 (#3045) fix: mount weights in cog serve like cog run does (#3044)
What
Adds support for JSON-native union input types, e.g.:
Unions are restricted to JSON-native members (
str,int,float,bool,dict/Any,list[T],None) so request validation happens at the HTTP edge against the OpenAPI schema. Unions involvingPath,File,Secret, custom coders, andBaseModelare rejected at build/schema-generation time. Output unions remain unsupported.How
pkg/schema— recursiveInputTypemodel + resolver; emits OpenAPIanyOffor union inputs. Multi-variant nullable unions stayrequiredwhen no default is supplied; non-union optionals are unchanged.pkg/predict— CLI-iparsing inspectsanyOfmembers: numeric values parse as numbers for unions that accept one (schemaAcceptsNumber), and a failed numeric parse falls back to the string member for number-first unions likefloat | str(schemaAcceptsString).python/cog/_adt— deterministic union normalisation with strict per-member compatibility (bool is not int/float, scalars don't pass throughdict/Any, list unions validate each element);anyOfjson_typeemission.architecture/02-schema.mdanddocs/python.mddocument supported members,anyOf, and nullable semantics.Behavioral notes
-i value=1forstr | floatstays anint(consistent with how a plainfloatfield already receives a Pythonintfor-i num=10); only fractional values become floats.nullable: truelets an omitted optional fall back to its default. An explicit JSONnullis still validated against the field type and rejected at the HTTP edge — identical to every other optional field today.Testing
pkg/schema/...,pkg/predict/...python/tests/test_adt.py(union ordering, bool/int/float overlap, list-element validation, optional/requiredNonehandling, empty list, mixed scalar+list)union_input_http,union_input_cli(both member orderings),union_input_list_httpAll passing locally;
lint:goclean;ruffclean.