Skip to content

fix(super-editor): stop injecting bCs/iCs/highlight defaults on round-trip (SD-2912)#3237

Open
tupizz wants to merge 1 commit into
mainfrom
tadeu/sd-2912-pgmar-roundtrip-drift
Open

fix(super-editor): stop injecting bCs/iCs/highlight defaults on round-trip (SD-2912)#3237
tupizz wants to merge 1 commit into
mainfrom
tadeu/sd-2912-pgmar-roundtrip-drift

Conversation

@tupizz
Copy link
Copy Markdown
Contributor

@tupizz tupizz commented May 11, 2026

Summary

The SD-2912 customer fixture round-trips to a document.xml ~37 KB larger than the source. 2466 redundant XML elements are injected into every run: 822 each of <w:bCs/>, <w:iCs/> and <w:highlight w:val="none"/> — none of which existed in the source rPr. Word treats explicit-off and absence as identical so the rendering doesn't change, but every consumer reading the XML sees thousands of "intentional" toggles that weren't real.

Why

Two paths in styles.js::decodeRPrFromMarks were emitting derived defaults that pollute the output:

  • runProperties.boldCs = runProperties.bold (and italic→italicCs). In OOXML these are independent properties (ECMA-376 §17.3.2.1, §17.3.2.16). Auto-propagating the latin value to the complex-script companion means every bold or italic mark — including default value: false from the style cascade — emits a complex-script element.
  • runProperties.highlight = { 'w:val': 'none' } when the highlight mark's color was 'transparent'. That mark is synthesized from <w:shd val="clear" fill="auto"/> shading on import (see encodeMarksFromRPr's shading case). The shading itself round-trips on its own <w:shd> element, so the synthesized highlight was pure noise.

To keep documents that genuinely carry <w:bCs/> or <w:iCs/> lossless (the auto-propagation was the only mechanism preserving them through the mark layer), boldCs and italicCs are removed from RUN_PROPERTIES_DERIVED_FROM_MARKS in calculateInlineRunPropertiesPlugin. They now go through the "preserve existing runProperties" branch in getInlineRunProperties, surviving appendTransactions via the run's stored runProperties.

Test plan

  • New TDD-style unit tests in styles.test.js covering: bold-off no longer propagates bCs, italic-off no longer propagates iCs, transparent highlight no longer emits.
  • Updated three pre-existing tests that locked in the old propagation:
    • styles.test.js > decodeRPrFromMarks > should decode bold, italic, and strike marks
    • styles.test.js > marks encoding/decoding round-trip > should correctly round-trip basic properties
    • syncParagraphRunProperties.test.js (cursor-mark fallback)
    • getMarksFromSelection.test.js (selection formatting state)
  • New integration test sd-2912-pgmar-roundtrip.test.js that round-trips the customer fixture and asserts zero new <w:bCs/> / <w:iCs/> / <w:highlight/> in document.xml.
  • Full pnpm test: 12 714 / 12 727 pass.
  • Live browser verification: re-exported the customer fixture from the dev app, all three counts went 0 → 0 (was 0 → 822 on main).
  • Opened input + fix-exported in Word side by side — no repair dialog, layout identical.

Verified counts on the customer fixture

Element in document.xml Input Before fix After fix
<w:bCs> 0 822 0
<w:iCs> 0 822 0
<w:highlight> 0 822 0
Total document.xml size 1 093 041 1 130 261 1 077 789

Out of scope (separate follow-ups documented on the ticket)

The SD-2912 fixture has additional drift sources I did not address in this PR:

  • Body-level <w:pgMar> coercion: 56 of 57 <w:pgMar> elements preserve their float values byte-exact; one (the document body's sectPr) round-trips through inchesToTwips in ensureSectionLayoutDefaults and loses sub-twip precision while gaining a w:gutter="0". Fixing this requires storing the original twips alongside pageStyles.pageMargins.
  • customXml/item2.xml SharePoint FormTemplates: lost on round-trip because the XML parser doesn't preserve <?mso-contentType?> processing instructions. Affects SharePoint integration metadata. Needs a binary-passthrough path for customXml/*.
  • word/styles.xml Hyperlink style injection: a default Hyperlink style is added to styles.xml even when the source didn't have it. Likely intentional for downstream consumers; worth a separate scoping discussion.
  • docProps/app.xml + Content_Types override: SuperDoc generates app.xml like Word does on save. Benign.

The biggest source of drift (the 37 KB of redundant elements) is fixed here. The remaining items are smaller and independent.

…-trip (SD-2912)

The customer fixture round-tripped to a document.xml ~37KB larger than the
source, with 822 occurrences each of `<w:bCs/>`, `<w:iCs/>` and
`<w:highlight w:val="none"/>` injected into runs whose source rPr contained
none of them. Word treats explicit-off and absence as identical so the
rendering was unchanged, but every consumer reading the XML saw thousands
of "intentional" toggles that weren't real.

Two changes:

- `decodeRPrFromMarks` no longer auto-propagates `boldCs`/`italicCs` from
  the latin bold/italic mark. In OOXML these are independent properties
  (ECMA-376 §17.3.2). The propagation was emitting a complex-script
  companion element on every run that touched a bold or italic mark, even
  when the source rPr had no `<w:bCs/>` or `<w:iCs/>`.
- The highlight case in the same function no longer emits an explicit
  `<w:highlight w:val="none"/>` for the transparent mark. That mark is
  synthesized from `<w:shd val="clear" fill="auto"/>` shading on import
  and the shading itself round-trips independently via its own element.

To keep round-trip lossless for documents that genuinely carry `<w:bCs/>`
or `<w:iCs/>` (which were previously preserved as a side effect of the
auto-propagation), `boldCs` and `italicCs` are removed from
`RUN_PROPERTIES_DERIVED_FROM_MARKS` in calculateInlineRunPropertiesPlugin.
That moves them to the "preserve from existing runProperties" branch in
`getInlineRunProperties`, so the values captured at import time survive
appendTransactions verbatim.

Verified on the SD-2912 fixture:
- bCs:       input=0 → before=822 → after=0
- iCs:       input=0 → before=822 → after=0
- highlight: input=0 → before=822 → after=0
- document.xml: 1.13 MB → 1.08 MB (smaller than the 1.09 MB input)
- All 12714 super-editor tests pass.
@tupizz tupizz requested a review from a team as a code owner May 11, 2026 18:56
@linear
Copy link
Copy Markdown

linear Bot commented May 11, 2026

SD-2912

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: efeafe0ed2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

} else {
runProperties.highlight = { 'w:val': mark.attrs.color };
}
if (mark.attrs.color && mark.attrs.color.toLowerCase() !== 'transparent') {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve explicit w:highlight="none" overrides

The new mark.attrs.color.toLowerCase() !== 'transparent' guard drops all transparent highlight marks, but this codebase uses transparent to represent an explicit OOXML off value (getHighLightValue maps <w:highlight w:val="none"/> to transparent in the same file). That means an imported run with explicit highlight negation can no longer be reconstructed by decodeRPrFromMarks, so when run properties are recomputed (especially during run splitting/rewrites) the highlight key can be lost and style-inherited highlight may reappear in exported DOCX.

Useful? React with 👍 / 👎.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants