Skip to content

feat(document-api): customXml.parts.* — list/get/create/patch/remove (SD-3105)#3245

Open
caio-pizzol wants to merge 6 commits into
mainfrom
caio-pizzol/SD-3105-custom-xml-parts-api
Open

feat(document-api): customXml.parts.* — list/get/create/patch/remove (SD-3105)#3245
caio-pizzol wants to merge 6 commits into
mainfrom
caio-pizzol/SD-3105-custom-xml-parts-api

Conversation

@caio-pizzol
Copy link
Copy Markdown
Contributor

Adds a public API for the OOXML Custom XML Data Storage Part (ECMA-376 §15.2.5, §15.2.6, §22.5). Customers who today reach into the converter to anchor structured payloads to a document (citation metadata, license data, AI review records) now have an official path: editor.doc.customXml.parts.list/get/create/patch/remove.

  • API shape is the OOXML-aligned naming we landed on after extensive design review: customXml.parts.*, object inputs, target: { id } | { partName }, list returns summaries (no content), get returns the full record, partName fallback for foreign parts that ship a Storage Part without a matching Properties Part.
  • Implementation coordinates the five OOXML package files on every write (storage part, props part, item rels, document rel, content-types inherited) and tombstones removed paths via converter.removedCustomXmlPaths so the exporter nulls original-zip entries when the customer removes a previously-imported part.
  • Lifecycle goes through executeOutOfBandMutation, the same primitive citation sources use — expectedRevision, dryRun, dirty marking, GUID promotion, revision increment all wired. REVISION_MISMATCH and other lifecycle errors propagate; only content-parsing errors are caught and mapped to INVALID_INPUT.
  • partName targets are restricted to actual storage parts (customXml/itemN.xml) so the low-level API can't read or mutate unrelated package files like word/document.xml.
  • Storage-to-props pairing uses customXml/_rels/itemN.xml.rels per spec, with the index-name heuristic only as a fallback for foreign parts without a rels file.

Must stay the same: ECMA-376 alignment on names and shapes — customXml.parts (not metadata), id = itemID GUID, partName = package path, rootNamespace vs schemaRefs kept distinct.

Review: check that the executeOutOfBandMutation wiring matches the citation-sources adapter pattern, and that the tombstone strategy doesn't leak to non-customXml paths. Ignore the displacedByCustomXml.js typecheck warnings — pre-existing on main.

Verified: pnpm --filter @superdoc/super-editor exec vitest run src/editors/v1/document-api-adapters → 3401/3401 pass; pnpm --filter @superdoc/document-api test → 1428/1428 pass; round-trip integration test exports a created part, reimports through the canonical loader, and asserts itemID/content/schemaRefs survive.

Locks in the public API surface for Custom XML Data Storage Parts:
- Types + validators (customXml/customXml.types.ts, customXml.ts)
- 5 operation definitions in operation-definitions.ts
- Registry entries in operation-registry.ts
- Dispatch entries in invoke.ts
- DocumentApi.customXml + adapter slot in index.ts
- Re-exports in package barrel

Adapter implementation, plan-engine wrapper, OOXML package writer,
and tests are pending. Two known typecheck failures left:
reference-doc-map.ts (needs customXml group entry) and schemas.ts
(needs JSON schemas for 5 ops).
…105)

Completes the read path through the Document API and closes the
remaining contract-layer wiring gaps.

Contract:
- reference-doc-map.ts: customXml group title/description/page entry
- schemas.ts: 5 op JSON schemas + customXmlPartTargetSchema helper
- 30 validator tests passing (target xor id/partName, content
  well-formedness smell-test, schemaRefs string[] check, patch
  requires at-least-one)

Read adapter:
- super-converter/custom-xml-parts.js: discovery, parsing, serialization
  helpers (listCustomXmlStoragePartNames, parsePropsPart, readCustomXmlPart)
- plan-engine/custom-xml-wrappers.ts: list/get adapter routing through
  buildDiscoveryItem/Result, filters by rootNamespace and schemaRef,
  partName-targeting fallback for foreign parts without Properties Parts
- assemble-adapters.ts: customXml plugged in alongside bookmarks
- 10 integration tests passing (list empty, list with filter, get by
  id, get by partName fallback, get unknown id returns null)

Write adapter:
- create/patch/remove return CAPABILITY_UNAVAILABLE pending Phase B
  (OOXML package file coordination)
…s (SD-3105)

create / patch / remove now implement the full OOXML package
coordination instead of returning CAPABILITY_UNAVAILABLE.

Write adapter (super-converter/custom-xml-parts.js):
- createCustomXmlPart: generates fresh GUID itemID, allocates next
  free index, writes Storage Part + Properties Part + item rels +
  document-level relationship (5-file coordination)
- patchCustomXmlPart: resolves by id or partName, replaces content
  and/or schemaRefs, preserves itemID. Creates a Properties Part
  on the fly when patching schemaRefs into a foreign part that
  doesn't have one yet.
- removeCustomXmlPart: deletes storage, props, item rels, and the
  document-level relationship pointing at the part.

Adapter wrappers (plan-engine/custom-xml-wrappers.ts):
- All three writers call rejectTrackedMode (matches the contract
  declaration of supportsTrackedMode: false).
- Errors map cleanly to INVALID_INPUT / TARGET_NOT_FOUND.
- supportsDryRun set to false for v1; dry-run support can come later.

Conformance:
- contract-conformance.test.ts: throw/apply vectors registered for
  all three customXml.parts mutating ops.
- contract.test.ts: customXml added to the validGroups list.

Coverage:
- 16 integration tests passing (read + write + round-trip).
- 1195/1195 conformance tests passing.
- 3392/3392 across the full document-api-adapters suite.
- 1428/1428 across the document-api package suite.

Round-trip test exports a created part to DOCX, reimports through
the canonical loader, and verifies the itemID GUID, rootNamespace,
schemaRefs, and content all survive. The 5-file package
coordination is empirically OOXML-faithful.
…tombstones (SD-3105)

All six findings from the SD-3105 review:

#1 (High) partName scoping
- resolveTargetPartName and readCustomXmlPart now require the path to
  match customXml/itemN.xml. Targets like word/document.xml are rejected
  cleanly instead of letting through.

#2 (High) bypassed mutation lifecycle
- create/patch/remove now route through executeOutOfBandMutation, the
  same shared primitive citation sources use. Each call gets:
    * expectedRevision check
    * dryRun preview path
    * dirty marking + GUID promotion
    * revision increment on actual change
- supportsDryRun: true on all three ops with real dry-run validation
  (well-formedness for create/patch, target resolution for patch/remove).

#3 (High) deletion didn't persist for imported DOCX parts
- removeCustomXmlPart now stamps the removed paths into a
  converter.removedCustomXmlPaths set. Editor.ts export loop emits
  updatedDocs[path] = null for each entry, so original-zip entries are
  tombstoned instead of being copied through.

#4 (Medium) props parts paired by filename
- findPropsPartFor now reads customXml/_rels/itemN.xml.rels and follows
  the Type=customXmlProps relationship. Falls back to the index-name
  heuristic only when no rels file exists. Foreign docs with non-
  matching names now resolve correctly.

#5 (Medium) contract metadata lied about failures
- possibleFailureCodes updated to actual codes: ['INVALID_INPUT'],
  ['TARGET_NOT_FOUND', 'INVALID_INPUT'], ['TARGET_NOT_FOUND'].

#6 (Medium) JSON schemas didn't match runtime
- get output now { oneOf: [{ type: 'object' }, { type: 'null' }] }.
- patch input encodes 'at least one of content or schemaRefs' via
  anyOf, with additionalProperties: false.
- content fields gain minLength: 1.

Coverage update:
- Two new integration tests assert #1 (partName rejection on
  word/document.xml etc) and #4 (foreign-name props resolved via rels).
- failureCase and dryRun vectors added for all three customXml.parts
  mutating ops in the conformance suite.

Verified:
- @superdoc/super-editor: 3398/3398 across 123 files
- @superdoc/document-api: 1428/1428 across 51 files
… (SD-3105)

Two correctness fixes from the second review pass:

#1 Broad catch was swallowing REVISION_MISMATCH
  customXml.parts.create and patch wrapped the entire
  executeOutOfBandMutation call in a try/catch that converted
  everything to INVALID_INPUT. Lifecycle errors from
  checkRevision (REVISION_MISMATCH) and any future PlanError
  propagation were being eaten.

  Replaced the outer try/catch with a scoped safeValidate helper
  that only catches content-parsing errors from createCustomXmlPart
  / patchCustomXmlPart, returning them as structured INVALID_INPUT
  outcomes. The executeOutOfBandMutation call itself now lets
  revision and other lifecycle errors bubble.

  Also reordered patch: target resolution runs FIRST, so a missing
  target reports TARGET_NOT_FOUND instead of (potentially)
  INVALID_INPUT if patch happened to throw.

#2 Tombstone could null a newly-created part on the same index
  remove → create on a recycled index (customXml/item1.xml) had
  this sequence: remove writes 'customXml/item1.xml' to the
  tombstone set; create reuses index 1 because convertedXml has
  no item1.xml; export serializes the new part, then the tombstone
  loop runs and overwrites updatedDocs['customXml/item1.xml'] with
  null, deleting the freshly-created part from the exported zip.

  Fix: createCustomXmlPart now removes its written paths
  (partName, propsPartName, itemRelsPath) from
  converter.removedCustomXmlPaths whenever a converter is passed.
  The new integration test exercises the exact remove → create →
  export → reimport sequence and asserts the new part survives
  with its fresh id and content.

Coverage:
- 19/19 integration tests passing (incl. the new tombstone test).
- 3401/3401 super-editor document-api-adapters tests.
- 1428/1428 document-api package tests.
@caio-pizzol caio-pizzol requested a review from a team as a code owner May 12, 2026 12:43
@linear
Copy link
Copy Markdown

linear Bot commented May 12, 2026

SD-3105

@github-actions
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fb03693e92

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +529 to +531
const removedPaths = [partName, propsPartName, itemRelsPath].filter(
(path) => typeof path === 'string' && path.length > 0,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove stale custom XML content type overrides

When removing an imported customXml part that had a /customXml/itemPropsN.xml Override, these tombstones delete only the storage, props, and rels entries. I checked the export path in DocxZipper.updateContentTypes: it adds itemProps overrides when present and only prunes stale comment overrides, so the original [Content_Types].xml override is copied through for a part that no longer exists. That leaves an invalid DOCX after customXml.parts.remove; include the content-types override in reconciliation or teach export to prune stale customXml itemProps overrides.

Useful? React with 👍 / 👎.

…ntent-types pruning (SD-3105)

Three review findings — each verified by a failing test first, then fixed:

#1 findPropsPartFor used an ad-hoc regex that only handled bare names
   and `/` prefixes. Valid OPC Target forms like `./itemPropsN.xml`
   and `../customXml/itemPropsN.xml` (per RFC 3986 §5.2.4 and ECMA-376
   §9.1.4) silently fell through to the index-name fallback or missed
   the props entirely. Route through resolveOpcTargetPath with
   baseDir='customXml' (the source part's directory). Two new tests
   assert resolution for `./` and `../customXml/` Target forms.

#2 removeCustomXmlPart on the bibliography part left
   converter.bibliographyPart populated. On the next export,
   syncBibliographyPartToPackage(convertedXml, bibliographyPart) wrote
   the cached sources back into convertedXml — resurrecting the
   supposedly-removed part in the in-memory state (the tombstone still
   nulled the exported zip entry, but the editor's own view of the
   document silently re-grew the part). patchCustomXmlPart on the
   bibliography part had the worse variant: cached sources overwrote
   the customer's new content. Both now call
   invalidateConverterCachesForPath, which clears
   converter.bibliographyPart when its partPath matches the part we
   touched. New test exercises remove + exportDocx and asserts the
   convertedXml entry stays gone.

#3 DocxZipper.updateContentTypes only pruned stale Override entries
   for comment parts. After customXml.parts.remove, the original DOCX's
   `<Override PartName="/customXml/itemPropsN.xml" .../>` survived
   in the exported [Content_Types].xml, pointing at a non-existent
   part. The operation's cleanup contract claimed otherwise. Extended
   the stale-override pruning to also cover customXml/itemPropsN.xml
   paths absent from the final zip (i.e. tombstoned via
   updatedDocs[path] = null).

Also clears the now-stale top-of-file docblock on the integration test
that claimed write-side was `CAPABILITY_UNAVAILABLE`-stubbed; the
file actually contains a full write-side suite including round-trip
and bibliography-cache tests.

Verified:
- 22/22 customXml integration tests
- 3404/3404 super-editor document-api-adapters
- 1428/1428 document-api package
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant