fix(ci): place "(backported from commit X)" before trailers in backport message #4696
Merged
Yicong-Huang merged 4 commits into apache:main on May 3, 2026
…rt message

Direct backports currently append the "(backported from commit X)" note at the very end of the commit message, which pushes it after the Co-Authored-By trailer block. Git's trailer parsing treats the message as having no trailers in that case, so tools that look at trailers (tags, hooks, anything calling `git interpret-trailers --parse`) see zero authors.

Compose the message so the body comes first, then the "(backported from commit X)" note as its own paragraph, then the trailer block (Co-Authored-By, Signed-off-by, etc.) at the very end. A small Python step splits the original message at the trailer boundary using the standard `Key:\s` heuristic, then reassembles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@ Coverage Diff @@
## main #4696 +/- ##
============================================
- Coverage 42.12% 41.67% -0.46%
- Complexity 2001 2091 +90
============================================
Files 957 978 +21
Lines 34094 35415 +1321
Branches 3753 3914 +161
============================================
+ Hits 14363 14759 +396
- Misses 18954 19866 +912
- Partials 777 790 +13
…ssified

The first version scanned from the end and treated any contiguous run of `Key: value` lines as the trailer block. That mis-classifies a Conventional Commits subject like "docs: typo fix" (when the message has only a subject and no body) as a one-line trailer block, putting the backport note above the subject.

Tighten the detection: a trailer block exists only if there is a blank line in the message and every line after the LAST blank line matches the trailer format. With that rule a subject without a body falls through to the no-trailer-block branch and the note is appended after the subject.

Verified locally on five message shapes:
- subject only,
- subject + body without trailers,
- subject + body + single trailer,
- subject + body + multiple trailers,
- body containing a `Word:` line that is not a trailer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
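The tightened rule above can be sketched in a few lines of Python. This is a minimal illustration, not the script shipped in the workflow: the function names `split_trailers` and `compose_backport_message` are hypothetical, and only the regex and the "last blank line" rule come from the commit messages on this page.

```python
import re

# Trailer-line heuristic quoted in the PR description.
TRAILER_RE = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:\s")


def split_trailers(message: str) -> tuple[str, str]:
    """Return (body, trailers); trailers is "" when no trailer block is found.

    Hypothetical helper: a trailer block exists only if the message contains
    a blank line and every line after the LAST blank line looks like a trailer.
    """
    lines = message.rstrip().split("\n")
    # Index of the last blank line, or None for a subject-only message.
    last_blank = next(
        (i for i in range(len(lines) - 1, -1, -1) if not lines[i].strip()), None
    )
    if last_blank is None:
        return "\n".join(lines), ""
    tail = lines[last_blank + 1:]
    if tail and all(TRAILER_RE.match(line) for line in tail):
        return "\n".join(lines[:last_blank]).rstrip(), "\n".join(tail)
    return "\n".join(lines), ""


def compose_backport_message(message: str, sha: str) -> str:
    """Reassemble as body, blank, note, then (if present) blank and trailers."""
    body, trailers = split_trailers(message)
    note = f"(backported from commit {sha})"
    parts = [body, note] + ([trailers] if trailers else [])
    return "\n\n".join(parts) + "\n"
```

With this sketch, a subject-only message such as `docs: typo fix` has no blank line, so it takes the no-trailer branch and the note lands after the subject rather than above it.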
aglinxinyuan approved these changes May 2, 2026
Yicong-Huang added a commit that referenced this pull request May 3, 2026
…rt message (#4696)

### What changes were proposed in this PR?

Direct backports currently append the `(backported from commit X)` note at the very end of the commit message. Because GitHub's squash-merge messages already carry a trailer block (`Co-Authored-By:` etc.) at the end, the note ends up *after* the trailers:

```
fix: foo

Body. Closes #123.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

(backported from commit abc1234)   ← currently here
```

Git treats the trailer block as the contiguous run of `Key: value` lines at the very end of the message, separated from the body by a blank line. With a non-trailer line trailing the trailer block, `git interpret-trailers --parse` sees zero trailers — tools that read Co-Authored-By, Signed-off-by, etc. lose the metadata for backport commits.

This PR re-orders the composition so the body comes first, then the `(backported from commit X)` note as its own paragraph, then the trailer block at the very end:

```
fix: foo

Body. Closes #123.

(backported from commit abc1234)   ← moved here

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
```

A small Python step inside the workflow splits the original message at the trailer boundary using the standard `^[A-Za-z][A-Za-z0-9-]*:\s` heuristic and reassembles. Cases handled:

| input | output |
|---|---|
| body + trailers | body + blank + note + blank + trailers |
| body, no trailers | body + blank + note |
| multiple trailers | body + blank + note + blank + all trailers contiguous |

### Any related issues, documentation, discussions?

None — incidental fix to the backport job alongside the broader CI cleanup work.

### How was this PR tested?

Ran the assembled Python step locally against five synthetic message shapes:

| input | result |
|---|---|
| subject only (`docs: typo fix`) | subject + blank + note |
| subject + body, no trailers | subject + body + blank + note |
| subject + body + single trailer | subject + body + blank + note + blank + trailer |
| subject + body + multiple trailers | subject + body + blank + note + blank + all trailers contiguous |
| body containing a non-trailer `Word:` line | subject + body + blank + note (body line not mis-classified) |

The last case is the trailer-detection tightening: a trailer block exists only when there is a blank line in the message AND every line after the last blank line is in trailer format. That avoids treating a Conventional Commits subject (`feat: foo` for a one-line commit) as a one-line trailer block.

The next direct-backport CI run on this branch's merge will exercise the path end-to-end.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7, 1M context)

---------

(backported from commit a5b8957)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
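To illustrate why the position of the note matters, here is a deliberately simplified model of trailer detection in Python. It is an assumption-laden sketch, not Git's actual algorithm (which has additional rules): the function `parse_trailers_simplified` is hypothetical and only treats the last paragraph as a trailer block when every line in it matches the `Key: value` pattern quoted above.

```python
import re

TRAILER_RE = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:\s")


def parse_trailers_simplified(message: str) -> list[str]:
    """Simplified model: the last paragraph counts as trailers only if
    every one of its lines looks like `Key: value`."""
    paragraphs = re.split(r"\n\s*\n", message.strip())
    if len(paragraphs) < 2:
        return []  # a lone subject line is never a trailer block
    last = paragraphs[-1].split("\n")
    return last if all(TRAILER_RE.match(line) for line in last) else []


# Old ordering: the note is the final paragraph, so no trailers are found.
before = (
    "fix: foo\n\nBody. Closes #123.\n\n"
    "Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>\n\n"
    "(backported from commit abc1234)\n"
)
# New ordering: the trailer block is last again.
after = (
    "fix: foo\n\nBody. Closes #123.\n\n"
    "(backported from commit abc1234)\n\n"
    "Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>\n"
)

print(parse_trailers_simplified(before))
print(parse_trailers_simplified(after))
```

Under this model the first message yields an empty list and the second yields the single `Co-Authored-By` line, which mirrors the behavior the description attributes to `git interpret-trailers --parse`.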
Yicong-Huang added a commit that referenced this pull request May 3, 2026
…mments (#4846)

### What changes were proposed in this PR?

Three changes to `.github/workflows/direct-backport-push.yml`.

**1. Repair YAML.** The inline `python3 -c '<source>'` from #4696 put Python at column 0 inside a `run: |` block indented at column 10. YAML treats `import re, sys` as a top-level key, so every push to `main` failed in 0 seconds with 0 jobs (e.g. [run 25271247473](https://github.com/apache/texera/actions/runs/25271247473)). Python can't be re-indented (top-level statements reject leading whitespace), so the script moves to `.github/scripts/compose-backport-message.py`. Behavior unchanged.

**2. Surface backport status on the original commit + PR.** Cherry-picks produce a new SHA, so the release branch never appears in the auto-derived branch badge on the main commit. Three channels instead — commit status badge, commit comment, PR comment — on success; commit status + PR comment on failure with an inline conflict diagnosis.

Success PR comment:

> Backport to [`release/0.4`](…/tree/release/0.4) succeeded as [`a1b2c3d`](…/commit/a1b2c3d…). [Run](…)

Failure PR comment (when cherry-pick conflicts):

> Backport to `release/0.4` failed. See [job log](…/job/…).
>
> **Conflicts in:**
> - `f.txt`
>
> **Likely-missing prerequisites on main** (commits that touched these files between merge-base `6343a1bc` and `c027f3b2^` — consider backporting these first):
> - `958b8e8 main: prereq edit f`

Capped at 5 files / 10 commits; full detail stays in the job log. Rebase-race conflicts get the same shape but list the racing commits on `origin/<target>` instead.

**3. Retry + structured logging.** `git push` retries 5x with `[0, 5, 15, 30, 60]s` backoff and rebases on `origin/<target>` between attempts to absorb push races. Annotation API calls retry with `[0, 2, 5, 15]s` and degrade to warnings on final failure (a 5xx on a comment shouldn't undo a successful cherry-pick). Every phase is wrapped in `::group::` markers with a `[backport <target>] ...` prefix.

### Any related issues, documentation, discussions?

Fixes the regression introduced in #4696.

### How was this PR tested?

`yaml.safe_load` parses the workflow. `compose-backport-message.py` round-trips through `git interpret-trailers --parse` with `Co-authored-by` preserved. The conflict diagnosis output above came verbatim from a throwaway repo where main introduces a prerequisite edit + feature commit and the release branch touches the same lines.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7, 1M context)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
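The retry behavior in item 3 can be sketched as a small schedule-driven loop. This is a hedged illustration in Python, not the workflow's actual implementation: the helper names `retry` and `push_with_retry`, the `origin` remote name, and the exact ordering of sleep, rebase, and push are assumptions. Only the delay schedules and the "rebase between attempts" idea come from the description.

```python
import subprocess
import time
from typing import Callable, Sequence

PUSH_DELAYS = (0, 5, 15, 30, 60)   # seconds, schedule quoted for git push
ANNOTATE_DELAYS = (0, 2, 5, 15)    # seconds, schedule quoted for API calls


def retry(action: Callable[[], bool],
          delays: Sequence[float],
          between: Callable[[], None] = lambda: None,
          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try `action` once per entry in `delays`; True on the first success."""
    for attempt, delay in enumerate(delays):
        sleep(delay)
        if attempt > 0:
            between()  # e.g. rebase onto the freshly fetched target branch
        if action():
            return True
    return False


def push_with_retry(target: str) -> bool:
    """Hypothetical wrapper: push HEAD to `target`, rebasing between attempts."""
    def push() -> bool:
        result = subprocess.run(["git", "push", "origin", f"HEAD:{target}"])
        return result.returncode == 0

    def rebase() -> None:
        subprocess.run(["git", "fetch", "origin", target], check=True)
        subprocess.run(["git", "rebase", f"origin/{target}"], check=True)

    return retry(push, PUSH_DELAYS, between=rebase)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. For annotation calls, a caller could log a warning when `retry(..., ANNOTATE_DELAYS)` returns `False` instead of failing the job, in line with the "degrade to warnings" behavior described above.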
Yicong-Huang added a commit that referenced this pull request May 3, 2026
…mments (#4846)

### What changes were proposed in this PR?

Three changes to `.github/workflows/direct-backport-push.yml`.

**1. Repair YAML.** The inline `python3 -c '<source>'` from #4696 put Python at column 0 inside a `run: |` block indented at column 10. YAML treats `import re, sys` as a top-level key, so every push to `main` failed in 0 seconds with 0 jobs (e.g. [run 25271247473](https://github.com/apache/texera/actions/runs/25271247473)). Python can't be re-indented (top-level statements reject leading whitespace), so the script moves to `.github/scripts/compose-backport-message.py`. Behavior unchanged.

**2. Surface backport status on the original commit + PR.** Cherry-picks produce a new SHA, so the release branch never appears in the auto-derived branch badge on the main commit. Three channels instead — commit status badge, commit comment, PR comment — on success; commit status + PR comment on failure with an inline conflict diagnosis.

Success PR comment:

> Backport to [`release/0.4`](…/tree/release/0.4) succeeded as [`a1b2c3d`](…/commit/a1b2c3d…). [Run](…)

Failure PR comment (when cherry-pick conflicts):

> Backport to `release/0.4` failed. See [job log](…/job/…).
>
> **Conflicts in:**
> - `f.txt`
>
> **Likely-missing prerequisites on main** (commits that touched these files between merge-base `6343a1bc` and `c027f3b2^` — consider backporting these first):
> - `958b8e8 main: prereq edit f`

Capped at 5 files / 10 commits; full detail stays in the job log. Rebase-race conflicts get the same shape but list the racing commits on `origin/<target>` instead.

**3. Retry + structured logging.** `git push` retries 5x with `[0, 5, 15, 30, 60]s` backoff and rebases on `origin/<target>` between attempts to absorb push races. Annotation API calls retry with `[0, 2, 5, 15]s` and degrade to warnings on final failure (a 5xx on a comment shouldn't undo a successful cherry-pick). Every phase is wrapped in `::group::` markers with a `[backport <target>] ...` prefix.

### Any related issues, documentation, discussions?

Fixes the regression introduced in #4696.

### How was this PR tested?

`yaml.safe_load` parses the workflow. `compose-backport-message.py` round-trips through `git interpret-trailers --parse` with `Co-authored-by` preserved. The conflict diagnosis output above came verbatim from a throwaway repo where main introduces a prerequisite edit + feature commit and the release branch touches the same lines.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7, 1M context)

---------

(backported from commit af5d174)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bobbai00 added a commit to bobbai00/texera that referenced this pull request May 3, 2026
* ci: nightly strict license-binary check that files a tracking issue on drift

Resolves apache#4692.

PR builds run check_binary_deps.py with --ignore-transitive-version (apache#4693) so a benign upstream version bump on a transitive dep does not block merges. This workflow runs the same checks **without** that flag every night on `main` so transitive drift is still visible and actionable before each release. On non-zero exit it files (or updates) one tracking issue identified by the stable label `license-binary-drift`; on a clean run it closes the issue if one is open.

Workflow shape:
- frontend-npm | agent-npm | python | jar — one job per ecosystem, each rebuilds its dist exactly the way build.yml does and runs the strict check; failures don't fail the workflow (continue-on-error) so all four still run.
- jar uses the unified check across every dist's lib/ rather than a per-service matrix; per-service placement errors are still caught by build.yml on every PR, and the nightly's job is exact-version drift which the unified check surfaces just as well.
- report — aggregates per-ecosystem results from artifacts and creates / updates / closes the tracking issue via actions/github-script. Skips issue management when not on the default branch (so workflow_dispatch on feature branches still runs the checks but does not surface issues).

Trigger: schedule (07:00 UTC daily) + workflow_dispatch.
Permissions: issues:write for the report job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): repair direct-backport-push YAML and post backport result comments (apache#4846)

### What changes were proposed in this PR?

Three changes to `.github/workflows/direct-backport-push.yml`.

**1. Repair YAML.** The inline `python3 -c '<source>'` from apache#4696 put Python at column 0 inside a `run: |` block indented at column 10. YAML treats `import re, sys` as a top-level key, so every push to `main` failed in 0 seconds with 0 jobs (e.g. [run 25271247473](https://github.com/apache/texera/actions/runs/25271247473)). Python can't be re-indented (top-level statements reject leading whitespace), so the script moves to `.github/scripts/compose-backport-message.py`. Behavior unchanged.

**2. Surface backport status on the original commit + PR.** Cherry-picks produce a new SHA, so the release branch never appears in the auto-derived branch badge on the main commit. Three channels instead — commit status badge, commit comment, PR comment — on success; commit status + PR comment on failure with an inline conflict diagnosis.

Success PR comment:

> Backport to [`release/0.4`](…/tree/release/0.4) succeeded as [`a1b2c3d`](…/commit/a1b2c3d…). [Run](…)

Failure PR comment (when cherry-pick conflicts):

> Backport to `release/0.4` failed. See [job log](…/job/…).
>
> **Conflicts in:**
> - `f.txt`
>
> **Likely-missing prerequisites on main** (commits that touched these files between merge-base `6343a1bc` and `c027f3b2^` — consider backporting these first):
> - `958b8e8 main: prereq edit f`

Capped at 5 files / 10 commits; full detail stays in the job log. Rebase-race conflicts get the same shape but list the racing commits on `origin/<target>` instead.

**3. Retry + structured logging.** `git push` retries 5x with `[0, 5, 15, 30, 60]s` backoff and rebases on `origin/<target>` between attempts to absorb push races. Annotation API calls retry with `[0, 2, 5, 15]s` and degrade to warnings on final failure (a 5xx on a comment shouldn't undo a successful cherry-pick). Every phase is wrapped in `::group::` markers with a `[backport <target>] ...` prefix.

### Any related issues, documentation, discussions?

Fixes the regression introduced in apache#4696.

### How was this PR tested?

`yaml.safe_load` parses the workflow. `compose-backport-message.py` round-trips through `git interpret-trailers --parse` with `Co-authored-by` preserved. The conflict diagnosis output above came verbatim from a throwaway repo where main introduces a prerequisite edit + feature commit and the release branch touches the same lines.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7, 1M context)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: reuse build.yml for nightly via ignore_transitive_version input

Per review on apache#4734: instead of duplicating build.yml's dist-producing steps in the nightly workflow, parametrize build.yml with a new `ignore_transitive_version` input and have the nightly call it as a reusable workflow with that input flipped to false. PR builds keep the default (true). This guarantees PR and nightly runs go through identical code paths — the only difference between them is the value of one input.

Changes:
- build.yml: add `ignore_transitive_version: boolean = true` input. Replace each of the 6 hard-coded `--ignore-transitive-version` flags (frontend/amber/platform/python/agent-service license checks) with `${{ inputs.ignore_transitive_version && '--ignore-transitive-version' || '' }}`. The platform job's check previously didn't pass the flag at all (strict on PRs); this commit unifies it with the rest so all five ecosystems behave the same: relaxed on PRs, strict on nightly.
- license-binary-nightly.yml: drop the per-ecosystem job copies. The workflow now has just two jobs:
  - build: `uses: ./.github/workflows/build.yml` with `ignore_transitive_version: false`, `secrets: inherit`.
  - report: walks the current run's jobs via listJobsForWorkflowRun, identifies license-check step failures (regex matches step names containing "license-binary" or "binary licenses"), and creates / updates / closes the tracking issue accordingly. Non-license step failures (flaky tests, network blips) are ignored so they don't spuriously surface as drift.

The report step's six branches (drift+new, drift+existing, clean+existing, clean+nothing, non-license-failure-only, default-branch guard) were exercised end-to-end with stubbed github/context/core under Node before push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflow-core): add unit test coverage for VFSURIFactory (apache#4757)

### What changes were proposed in this PR?

Add `VFSURIFactorySpec` covering URI construction and decoding in `VFSURIFactory`:
- `createResultURI` includes wid/eid/globalportid and the result resource type
- Result URIs round-trip through `decodeURI`
- `createRuntimeStatisticsURI` omits the `opid/` segment
- `createConsoleMessagesURI` embeds the operator id and the `consoleMessages` resource type
- `decodeURI` rejects non-vfs schemes, URIs missing required segments, and unknown resource-type tails

### Any related issues, documentation, discussions?

Closes apache#4756

### How was this PR tested?

`sbt "WorkflowCore/testOnly org.apache.texera.amber.core.storage.VFSURIFactorySpec"` — 7/7 tests pass.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.7)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: rename ignore_transitive_version input to mode (PR | release)

Per review on apache#4734: replace the boolean input with a string "mode" so the call sites name *what* they are (PR-time relaxed vs. release-time strict) instead of *what flag they pass*.

build.yml:
  inputs.mode: string, default "PR"
    "PR" -> --ignore-transitive-version (relaxed)
    "release" -> no flag (strict exact-match)
  The five license-check invocations now read `${{ inputs.mode == 'PR' && '--ignore-transitive-version' || '' }}` so any value other than "PR" falls through to strict, which is the safer side. workflow_call inputs cannot enforce string enums; the valid values are documented inline.

license-binary-nightly.yml:
  Pass `mode: release` instead of `ignore_transitive_version: false`. Updated the inline comment + tracking-issue body wording to match.

required-checks.yml is unchanged: it doesn't pass this input, so PR builds keep the default ("PR") and behave exactly as before.

Re-ran the three representative report scenarios (drift+new, clean+existing, non-license failure only) under Node with stubbed github/context/core; all three still behave correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(nightly): move schedule to 11:00 UTC (04:00 PDT / 03:00 PST)

Per review on apache#4734: 07:00 UTC was midnight PDT, when many people are still working. Move to 11:00 UTC so it lands outside US-Pacific working hours. GitHub cron is fixed UTC; the local clock-time shifts by an hour at DST transitions.

Daily cadence is fine for now; if it turns out to be too frequent we can drop to every 48–72 h.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com>
Co-authored-by: Xinyuan Lin <xinyual3@uci.edu>
### What changes were proposed in this PR?

Direct backports currently append the `(backported from commit X)` note at the very end of the commit message. Because GitHub's squash-merge messages already carry a trailer block (`Co-Authored-By:` etc.) at the end, the note ends up *after* the trailers.

Git treats the trailer block as the contiguous run of `Key: value` lines at the very end of the message, separated from the body by a blank line. With a non-trailer line trailing the trailer block, `git interpret-trailers --parse` sees zero trailers — tools that read Co-Authored-By, Signed-off-by, etc. lose the metadata for backport commits.

This PR re-orders the composition so the body comes first, then the `(backported from commit X)` note as its own paragraph, then the trailer block at the very end.

A small Python step inside the workflow splits the original message at the trailer boundary using the standard `^[A-Za-z][A-Za-z0-9-]*:\s` heuristic and reassembles. Cases handled:

| input | output |
|---|---|
| body + trailers | body + blank + note + blank + trailers |
| body, no trailers | body + blank + note |
| multiple trailers | body + blank + note + blank + all trailers contiguous |

### Any related issues, documentation, discussions?

Follow up to #4614

### How was this PR tested?

Ran the assembled Python step locally against five synthetic message shapes:

| input | result |
|---|---|
| subject only (`docs: typo fix`) | subject + blank + note |
| subject + body, no trailers | subject + body + blank + note |
| subject + body + single trailer | subject + body + blank + note + blank + trailer |
| subject + body + multiple trailers | subject + body + blank + note + blank + all trailers contiguous |
| body containing a non-trailer `Word:` line | subject + body + blank + note (body line not mis-classified) |

The last case is the trailer-detection tightening: a trailer block exists only when there is a blank line in the message AND every line after the last blank line is in trailer format. That avoids treating a Conventional Commits subject (`feat: foo` for a one-line commit) as a one-line trailer block.

The next direct-backport CI run on this branch's merge will exercise the path end-to-end.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7, 1M context)