feat(doc,drive): normalize curly quotes and CRLF in --selection-with-ellipsis#621
feat(doc,drive): normalize curly quotes and CRLF in --selection-with-ellipsis#621herbertliu wants to merge 2 commits intomainfrom
Conversation
…ellipsis before matching User-typed --selection-with-ellipsis values frequently come from pasted prose that auto-correction has rewritten to curly quotes (U+2018/2019/ 201C/201D) or CRLF line endings, while the Lark docx store keeps punctuation in straight ASCII / LF form. Server-side matching is strict byte comparison, so a single curly quote defeats an otherwise correct selection and the Agent has to retry with guesswork (Case 7 in the lark-cli pitfall list). Add shortcuts/common.NormalizeSelectionWithEllipsis that rewrites only the transformations that are virtually always safe (curly→straight quotes and CR→LF). CJK punctuation and full-width Latin are intentionally left alone — those can legitimately appear verbatim in document prose, so rewriting them would break otherwise-valid selections. Wire it into the three call sites that send --selection-with-ellipsis to MCP: - docs +update (DryRun silent, Execute emits a 'note:' to stderr when normalization happened so the user can see what was changed) - docs +media-insert (silent — the existing locate-doc step already surfaces selection feedback on miss) - drive +add-comment Execute (emits the same 'note:' as docs +update) The helper lives in shortcuts/common so both doc and drive packages can share it without duplication.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a normalization utility for Changes
Sequence Diagram(s)sequenceDiagram
participant User as "CLI User"
participant CLI as "Command Handler"
participant Norm as "NormalizeSelectionWithEllipsis"
participant MCP as "MCP / Backend"
rect rgba(200,230,255,0.5)
User->>CLI: provide --selection-with-ellipsis
CLI->>Norm: NormalizeSelectionWithEllipsis(input)
Norm-->>CLI: normalizedSelection, changedFlag
alt changedFlag == true
CLI->>CLI: print diagnostic to stderr
end
CLI->>MCP: call with normalizedSelection
MCP-->>CLI: response
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
shortcuts/drive/drive_add_comment.go (1)
244-249: Nit: variable namenormalizedis thechangedbool, not the normalized string.This swaps the semantic roles of
selectionandnormalizedcompared to readers' expectations (and compared to the helper's own docstring which calls the second return value "a flag indicating whether any rewrite happened"). Rename tochanged(ordidNormalize) to match the helper and the sibling call indocs_update.go(line 115), which already useschanged.♻️ Proposed rename
- rawSelection := runtime.Str("selection-with-ellipsis") - selection, normalized := common.NormalizeSelectionWithEllipsis(rawSelection) - if normalized { + selection, changed := common.NormalizeSelectionWithEllipsis(runtime.Str("selection-with-ellipsis")) + if changed { fmt.Fprintf(runtime.IO().ErrOut, "note: normalized --selection-with-ellipsis (curly quotes / CR line endings rewritten to canonical ASCII form for matching)\n") }(
rawSelectionis unused after this snippet, so it can be dropped too.)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/drive/drive_add_comment.go` around lines 244 - 249, The variable naming is confusing: the second return from common.NormalizeSelectionWithEllipsis is a boolean flag indicating whether a rewrite happened, not the normalized string; rename the local `normalized` variable to `changed` (or `didNormalize`) and adjust usage so the call becomes something like `selection, changed := common.NormalizeSelectionWithEllipsis(runtime.Str("selection-with-ellipsis"))`, then use `changed` in the if-check and drop the now-unused `rawSelection` variable; reference the call to common.NormalizeSelectionWithEllipsis and the local variables `selection` and `normalized` when making the rename to match the helper's docstring and the sibling usage in docs_update.go.shortcuts/doc/doc_media_insert.go (1)
112-113: Silent normalization here is inconsistent withdocs +update/drive +add-comment.The other two call sites emit a
note: normalized --selection-with-ellipsis (...)on stderr when rewriting occurs;docs +media-insertdiscards thechangedflag with_at both sites. The PR summary flags this as intentional, but from a user's perspective the discrepancy is surprising: pasting curly quotes into--selection-with-ellipsisproduces a helpful hint for two commands and silence for the third. Consider surfacing the same note for parity — it is a one-line stderr print and does not affect stdout JSON.♻️ Proposed diff (Execute site, line 254)
- selection, _ := common.NormalizeSelectionWithEllipsis(strings.TrimSpace(runtime.Str("selection-with-ellipsis"))) + selection, selNormalized := common.NormalizeSelectionWithEllipsis(strings.TrimSpace(runtime.Str("selection-with-ellipsis"))) + if selNormalized { + fmt.Fprintf(runtime.IO().ErrOut, + "note: normalized --selection-with-ellipsis (curly quotes / CR line endings rewritten to canonical ASCII form for matching)\n") + }Also applies to: 254-254
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/doc_media_insert.go` around lines 112 - 113, The code in docs +media-insert silently drops the normalization "changed" flag (call to common.NormalizeSelectionWithEllipsis assigns to selection, _), causing it to not emit the stderr note other commands do; change the assignment to capture the boolean (e.g., selection, changed := common.NormalizeSelectionWithEllipsis(...)) and, after assignment, if changed then print the same stderr note used by docs +update / drive +add-comment (a one-line note about normalized --selection-with-ellipsis) so behavior is consistent.shortcuts/doc/docs_update.go (1)
80-94: DryRun body diverges from Execute when normalization changes the input.Execute sends the normalized value and prints a stderr note, but DryRun silently rewrites with
_. Since dry-run is meant to show exactly what would be sent, that is consistent — however, a user passing curly quotes in--dry-runsees a rewritten payload with no indication that normalization occurred. Consider emitting the same stderr note in DryRun (stderr, so it doesn't corrupt stdout JSON), or at minimumSet("selection_normalized", true)on the dry-run envelope so the rewrite is visible to automation. Optional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/docs_update.go` around lines 80 - 94, The dry-run path currently rewrites selection-with-ellipsis via common.NormalizeSelectionWithEllipsis but emits no indication of that change; update the DryRun payload builder (the code using common.NewDryRunAPI(), POST(...), Body(...).Set("mcp_tool","update-doc").Set("args", args)) to signal normalization—after calling common.NormalizeSelectionWithEllipsis and assigning args["selection_with_ellipsis"]=normalized, if the normalized value differs from the original runtime.Str("selection-with-ellipsis") then either write the same stderr note used by Execute or (simpler) call Set("selection_normalized", true) on the dry-run envelope so consumers see the rewrite. Ensure you reference the runtime.Str("selection-with-ellipsis") original string to compare against the normalized value before setting the flag.shortcuts/common/selection_normalize.go (1)
27-38: Optional: collapse the six passes into a singlestrings.NewReplacer.Each
strings.ReplaceAllperforms a full scan/allocation; a singlestrings.NewReplacer(...).Replace(s)does one pass and is the idiomatic Go choice for multi-literal substitution. Given these inputs are typically short, this is purely a readability/micro-perf nit.♻️ Proposed refactor
- out := s - // Curly single quotes → ASCII apostrophe. - out = strings.ReplaceAll(out, "\u2018", "'") - out = strings.ReplaceAll(out, "\u2019", "'") - // Curly double quotes → ASCII double quote. - out = strings.ReplaceAll(out, "\u201C", "\"") - out = strings.ReplaceAll(out, "\u201D", "\"") - // CRLF / standalone CR → LF. Lark stores LF internally; sending CRLF in - // a selection would require the document to contain literal CR bytes, - // which it never does. - out = strings.ReplaceAll(out, "\r\n", "\n") - out = strings.ReplaceAll(out, "\r", "\n") + // Order matters: "\r\n" must precede "\r" so CRLF collapses to a single LF. + out := selectionNormalizer.Replace(s)And at package scope:
var selectionNormalizer = strings.NewReplacer( "\u2018", "'", "\u2019", "'", "\u201C", "\"", "\u201D", "\"", "\r\n", "\n", "\r", "\n", )
strings.NewReplacerapplies replacements in argument order, giving earlier patterns precedence when they match at the same position, so\r\n(appearing first) takes precedence over\r— equivalent to the current two-step CR/LF handling.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/common/selection_normalize.go` around lines 27 - 38, Replace the six separate strings.ReplaceAll calls in selection_normalize.go with a single strings.NewReplacer to avoid multiple full-string scans; define a package-scope variable (e.g., selectionNormalizer) initialized with the pairs "\u2018","'", "\u2019","'", "\u201C","\"", "\u201D","\"", "\r\n","\n", "\r","\n" (ensure "\r\n" appears before "\r" to preserve precedence) and change the function that currently sets out := s and calls ReplaceAll to use selectionNormalizer.Replace(s) instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/common/selection_normalize_test.go`:
- Around line 23-28: The test case name "ascii-only selection is untouched" in
selection_normalize_test.go is misleading because the input is CJK; rename that
table entry's name to something like "cjk-only selection is untouched" to
reflect the input, and also add a separate genuine ASCII-only no-op case (e.g.,
input "hello world", want "hello world", wantChanged false) to the same test
table so ASCII coverage is explicit; update only the table entry's name and/or
add the new case in the test data (no other code logic changes).
---
Nitpick comments:
In `@shortcuts/common/selection_normalize.go`:
- Around line 27-38: Replace the six separate strings.ReplaceAll calls in
selection_normalize.go with a single strings.NewReplacer to avoid multiple
full-string scans; define a package-scope variable (e.g., selectionNormalizer)
initialized with the pairs "\u2018","'", "\u2019","'", "\u201C","\"",
"\u201D","\"", "\r\n","\n", "\r","\n" (ensure "\r\n" appears before "\r" to
preserve precedence) and change the function that currently sets out := s and
calls ReplaceAll to use selectionNormalizer.Replace(s) instead.
In `@shortcuts/doc/doc_media_insert.go`:
- Around line 112-113: The code in docs +media-insert silently drops the
normalization "changed" flag (call to common.NormalizeSelectionWithEllipsis
assigns to selection, _), causing it to not emit the stderr note other commands
do; change the assignment to capture the boolean (e.g., selection, changed :=
common.NormalizeSelectionWithEllipsis(...)) and, after assignment, if changed
then print the same stderr note used by docs +update / drive +add-comment (a
one-line note about normalized --selection-with-ellipsis) so behavior is
consistent.
In `@shortcuts/doc/docs_update.go`:
- Around line 80-94: The dry-run path currently rewrites selection-with-ellipsis
via common.NormalizeSelectionWithEllipsis but emits no indication of that
change; update the DryRun payload builder (the code using common.NewDryRunAPI(),
POST(...), Body(...).Set("mcp_tool","update-doc").Set("args", args)) to signal
normalization—after calling common.NormalizeSelectionWithEllipsis and assigning
args["selection_with_ellipsis"]=normalized, if the normalized value differs from
the original runtime.Str("selection-with-ellipsis") then either write the same
stderr note used by Execute or (simpler) call Set("selection_normalized", true)
on the dry-run envelope so consumers see the rewrite. Ensure you reference the
runtime.Str("selection-with-ellipsis") original string to compare against the
normalized value before setting the flag.
In `@shortcuts/drive/drive_add_comment.go`:
- Around line 244-249: The variable naming is confusing: the second return from
common.NormalizeSelectionWithEllipsis is a boolean flag indicating whether a
rewrite happened, not the normalized string; rename the local `normalized`
variable to `changed` (or `didNormalize`) and adjust usage so the call becomes
something like `selection, changed :=
common.NormalizeSelectionWithEllipsis(runtime.Str("selection-with-ellipsis"))`,
then use `changed` in the if-check and drop the now-unused `rawSelection`
variable; reference the call to common.NormalizeSelectionWithEllipsis and the
local variables `selection` and `normalized` when making the rename to match the
helper's docstring and the sibling usage in docs_update.go.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a130595e-8de0-40f2-8e55-237d868ad6c0
📒 Files selected for processing (5)
shortcuts/common/selection_normalize.goshortcuts/common/selection_normalize_test.goshortcuts/doc/doc_media_insert.goshortcuts/doc/docs_update.goshortcuts/drive/drive_add_comment.go
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #621 +/- ##
=======================================
Coverage 60.05% 60.05%
=======================================
Files 406 407 +1
Lines 43089 43116 +27
=======================================
+ Hits 25877 25895 +18
- Misses 15194 15202 +8
- Partials 2018 2019 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🚀 PR Preview Install Guide🧰 CLI updatenpm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@e7b7446ae347f1d7ff43a534775c5ce1bfe156b5🧩 Skill updatenpx skills add larksuite/cli#feat/doc-selection-normalization -y -g |
CodeRabbit pointed out the 'ascii-only selection is untouched' case
actually had a CJK input ('欢迎大家多给反馈'), which was misleading.
Rename that entry to 'cjk-only selection is untouched' and add a
separate ASCII-only case with 'hello world' so the table now covers
both no-op scripts explicitly.
Summary
Addresses Case 7 in the lark-cli pitfall list: user-typed
--selection-with-ellipsisvalues frequently come from pasted prose where auto-correction has rewritten straight quotes to curly quotes (U+2018/2019/201C/201D) or inserted CRLF. The Lark docx store keeps punctuation canonical (straight ASCII, LF), and server-side matching is strict byte comparison — a single curly quote defeats an otherwise correct selection. Agents currently have to retry with guesswork.Normalize the deterministic-safe subset at the CLI layer before sending to MCP.
Changes
shortcuts/common/selection_normalize.go(new):NormalizeSelectionWithEllipsisrewrites curly quotes → straight, CRLF / CR → LF; returns(normalized, changed). CJK punctuation and full-width Latin are deliberately left alone since those appear verbatim in real document prose.shortcuts/doc/docs_update.go: apply in both DryRun and Execute paths; Execute emits anote:to stderr when normalization actually changed the input so the user sees what was rewritten.shortcuts/doc/doc_media_insert.go: apply silently at the two call sites (existinglocate-docstep surfaces feedback on miss).shortcuts/drive/drive_add_comment.go: apply in dry-run and Execute; Execute emits the samenote:.shortcuts/common/selection_normalize_test.go: 10-case table-driven coverage (empty, ASCII-only, curly singles/doubles, CRLF, standalone CR, CJK punctuation left alone, full-width Latin left alone).Scope notes
--selection-by-titleis not normalized — it's parsed as markdown heading syntax, and the ATX heading check from PR feat(doc): warn when replace_range targets a heading but markdown lacks matching heading #619 already handles the conservative whitespace cases there.Test Plan
go test ./shortcuts/common/...passes (new helper tests)go test ./shortcuts/...passesgo vet ./...cleangofmt -lcleanSummary by CodeRabbit
New Features
Improvements