feat(mcp): resolve local-element declarations on top-level miss #8
Merged
A real agent calling ooxml_attributes for w:cs, w:rtl, w:lang, w:dir,
w:bdo - all elements that show up in real .docx files - got Not found.
Reason: those elements are declared inline inside EG_RPrBase /
EG_ContentRunContent groups in the WML XSD, not as top-level
xsd:elements, so the global-only lookup misses them. The agent had to
fall back to prose search and reconstruct an answer the schema graph
actually has.
ooxml_attributes / ooxml_children / ooxml_element now fall back through
local-element declarations when the top-level lookup misses:
- Search xsd_symbols for local elements with the same local name in
the same namespace (parent_symbol_id IS NOT NULL).
- If exactly one local declaration exists, or all matching declarations
share the same type_ref, follow that type and return the report. The
header surfaces "resolved via local element in <kind> <name>, type X"
so the agent understands the indirection.
- If multiple declarations have different type_refs (the tblGrid case),
return a disambiguation list - never a guess.
- For ooxml_element specifically, return a local-element report
("## Local element: X") rather than pretending it's a global Element.
Same-namespace local resolution suppresses the cross-vocab did-you-mean
that PR 1 introduced. The trace that motivated this work showed w:cs
surfacing r:cs (a relationship attribute) as a suggestion - technically
true but unhelpful. Local resolution wins; cross-vocab stays as the
last-resort fallback only when no same-namespace local exists.
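
The lookup order described above can be sketched as follows. This is a minimal illustration, not the PR's dispatcher: `SchemaIndex`, `LookupResult`, and `lookupWithFallback` are hypothetical names, and the index is a plain in-memory structure rather than the real database.

```typescript
// Hypothetical in-memory index; the real implementation queries the DB.
interface SchemaIndex {
  topLevel: Set<string>; // qualified names declared as top-level elements
  locals: Map<string, string[]>; // qualified name -> names of declaring parents
  otherVocab: Map<string, string[]>; // local name -> matches in other namespaces
}

type LookupResult =
  | { via: "top-level"; name: string }
  | { via: "local"; name: string; parents: string[] }
  | { via: "cross-vocab"; suggestions: string[] }
  | { via: "not-found" };

function lookupWithFallback(index: SchemaIndex, qname: string): LookupResult {
  // 1. A top-level declaration always wins.
  if (index.topLevel.has(qname)) return { via: "top-level", name: qname };

  // 2. Same-namespace local declarations come next and suppress step 3.
  const parents = index.locals.get(qname);
  if (parents !== undefined && parents.length > 0) {
    return { via: "local", name: qname, parents };
  }

  // 3. Cross-vocab did-you-mean is the last resort.
  const localName = qname.slice(qname.indexOf(":") + 1);
  const suggestions = index.otherVocab.get(localName) ?? [];
  if (suggestions.length > 0) return { via: "cross-vocab", suggestions };

  return { via: "not-found" };
}
```

With an index where `w:cs` has a local declaration and `cs` also exists in another vocabulary, the sketch returns the local result and never reaches the cross-vocab suggestion.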
No DB migration: parent_symbol_id and type_ref were already populated
during ingest. This is a query/dispatch enhancement.
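
As a rough picture of the query side, the sketch below does the local-element search over in-memory rows. Only `parent_symbol_id` and `type_ref` are column names taken from this PR; the other fields, the `XsdSymbolRow` shape, and the function name `findLocalElementsSketch` are assumptions made for illustration, and the real helper runs against the `xsd_symbols` table instead.

```typescript
// Hypothetical row shape: only parent_symbol_id and type_ref are named in
// this PR; the remaining fields are assumptions for the sketch.
interface XsdSymbolRow {
  id: number;
  kind: string; // e.g. "element", "group", "complexType"
  localName: string;
  namespace: string;
  parent_symbol_id: number | null; // null for top-level declarations
  type_ref: string | null; // null when the element declares its type inline
}

interface LocalElementHit {
  parentKind: string;
  parentName: string;
  typeRef: string | null;
}

// In-memory equivalent of "local elements with the same local name in the
// same namespace (parent_symbol_id IS NOT NULL)".
function findLocalElementsSketch(
  rows: XsdSymbolRow[],
  localName: string,
  namespace: string,
): LocalElementHit[] {
  const byId = new Map<number, XsdSymbolRow>();
  for (const row of rows) byId.set(row.id, row);

  const hits: LocalElementHit[] = [];
  for (const row of rows) {
    if (row.kind !== "element") continue;
    if (row.localName !== localName || row.namespace !== namespace) continue;
    if (row.parent_symbol_id === null) continue; // top-level: not a local hit
    const parent = byId.get(row.parent_symbol_id);
    if (parent === undefined) continue; // dangling parent id: skip
    hits.push({
      parentKind: parent.kind,
      parentName: parent.localName,
      typeRef: row.type_ref,
    });
  }
  return hits;
}
```

Each hit keeps its own `typeRef`, so a caller can still tell apart declarations that disagree on type.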
Tests cover both layers:
- Helper level (fixture XSDs): findLocalElementsInNamespace returns
scoped results, excludes top-level, reports parent kind and name,
preserves per-parent type_refs for the ambiguous case.
- Dispatch level: a new fixture (EG_LocalCase containing local_para
typed CT_Para) exercises the single-match resolution end to end,
asserting "resolved via local element" headers and the expected
attributes. The ambiguous-shared fixture confirms disambiguation
without guessing.
- Real-cache acceptance (gated on the full Transitional bundle):
w:cs / w:rtl resolve to CT_OnOff.val; w:lang to CT_Language with
val, eastAsia, bidi; w:dir / w:bdo to their respective types' val
attribute; w:cs no longer leads with the cross-vocab r:cs hint.
- ooxml_element on an ambiguous local name (e.g. fixture w:shared, declared
in CT_OuterA as ST_Jc AND CT_OuterB as xsd:string) previously called
formatLocalElementReport, which promoted locals[0] as the canonical type
and listed the rest under "also declared". That implied a primary
answer where none exists. ooxml_element now goes through resolveLocalElement
like the other dispatchers and follows the same single/ambiguous policy:
ambiguous cases use formatLocalElementAmbiguous (no primary), and
formatLocalElementReport is reached only when the caller has proved that
the locals share one type_ref.
- resolveLocalElement filtered null type_refs out before checking for a
single value. A mix like [null, CT_X] would collapse to {CT_X} and
resolve incorrectly, because the inline-typed declaration has its own
content model that the type symbol can't represent. It now requires every
local hit to share the same non-null type_ref before resolving as single;
anything else (mixed nulls, multiple non-null values) goes to
ambiguous.
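
The stricter rule can be sketched like this. The names `LocalHit`, `Resolution`, and `resolveLocalSketch` are hypothetical and the shapes are assumptions; the point is only the check that every hit shares one non-null `type_ref`, with nulls compared rather than filtered out.

```typescript
interface LocalHit {
  parentName: string;
  typeRef: string | null; // null for an inline-typed declaration
}

type Resolution =
  | { kind: "none" }
  | { kind: "single"; typeRef: string; hits: LocalHit[] }
  | { kind: "ambiguous"; hits: LocalHit[] };

function resolveLocalSketch(hits: LocalHit[]): Resolution {
  const head = hits[0];
  if (head === undefined) return { kind: "none" };

  // Nulls are NOT filtered out first: [null, CT_X] must not collapse to CT_X.
  const shared = head.typeRef;
  if (shared !== null && hits.every((h) => h.typeRef === shared)) {
    return { kind: "single", typeRef: shared, hits };
  }

  // Mixed nulls, or several distinct non-null values: never guess.
  return { kind: "ambiguous", hits };
}
```

Note that, as written, the sketch also sends a lone declaration with a null `type_ref` down the ambiguous path, since there is no type symbol to follow; that is this sketch's reading of the rule, not something the commit spells out.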
formatLocalElementReport signature changes to take the resolved (first,
locals) explicitly, with a comment documenting the "all share one
type_ref" invariant. "Also declared in N other contexts" now drops the
per-context type_ref (it's the same as the primary) and says "with the
same type" to make the invariant visible to readers.
Tests cover the ooxml_element regression: w:shared (the existing
fixture for type-disagreement) now exercises the ambiguous path,
verifying the single-resolution heading does not appear and no
"Also declared in" footer leaks the first hit as canonical.
A real agent asking `ooxml_attributes w:cs` got Not found, but `w:cs` shows up in real `.docx` files. Reason: `cs`, `rtl`, `lang`, `dir`, `bdo` are declared inline inside `EG_RPrBase` / `EG_ContentRunContent` in the WML XSD, not as top-level `xsd:element`s, so the global-only lookup misses them. The agent had to pivot to prose search and reconstruct a structural answer the database actually has.

`ooxml_attributes`, `ooxml_children`, and `ooxml_element` now fall back through local-element declarations when the top-level lookup misses. Conservative scoping: top-level still wins, fallback only on miss, never guesses.

- `findLocalElementsInNamespace(localName, namespace, profile)` returns local-element rows with parent kind/name and `type_ref`.
- If exactly one local declaration exists, or all matching declarations share the same `type_ref`, follow that type and return the attribute/children report with a "resolved via local element in `<kind>` `<name>`, type `X`" header.
- If multiple declarations have different `type_ref`s (the `tblGrid`-style case), return a disambiguation list. No guess.
- For `ooxml_element` specifically, return a `## Local element: X` report rather than pretending it's a global Element.

Same-namespace local resolution suppresses the cross-vocab did-you-mean that PR 1 introduced. The trace that motivated this PR showed `w:cs` surfacing `r:cs` (a relationship attribute) as a suggestion: technically true but unhelpful.

No DB migration. `parent_symbol_id` and `type_ref` were already populated.

Acceptance criteria (all covered by tests, green locally against the real Transitional bundle):

- `ooxml_attributes w:cs` / `w:rtl` → `CT_OnOff.val`
- `ooxml_attributes w:lang` → `CT_Language.{val, eastAsia, bidi}`
- `ooxml_attributes w:dir` → `CT_DirContentRun.val`
- `ooxml_attributes w:bdo` → `CT_BdoContentRun.val`

Verified: 88 pass / 0 fail / 0 skip. Format / lint / typecheck / build all clean.