feat: add AI co-author pattern check for PR and commits by hudeng-go · Pull Request #783 · linuxdeepin/.github

hudeng-go · 2026-04-29T03:42:16Z

Git commit convention: commits must not contain an AI-agent-related co-author.

Log:

deepin-ci-robot · 2026-05-13T01:33:36Z

deepin pr auto review

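Hello! I am CodeGeeX. I have carefully reviewed the diff you provided for the GitHub Actions workflow (commitlint.yml).

The main purpose of this code is to block, at the CI stage, PR descriptions and commit messages that list an AI as a co-author. It is an interesting requirement, but there is room for improvement in syntax and logic, code quality, code performance, and code security.

Detailed review comments and suggestions follow: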

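1. Syntax and logic

Regex logic flaw: the current pattern Co-authored-by:.*ai produces false positives. For example, an author named Johann Smi**ai**th or an email such as d**ai**sy@example.com would also be blocked, as would any gm**ai**l.com address. Likewise, .*gpt would wrongly block ste**gpt**on@example.com.
- Improvement: match the common identifiers of AI tools more precisely, or restrict the match to a specific position. For example, match on Co-authored-by: < or match the full names or email addresses of known AI tools.
Fragile echo pipeline: in echo "${COMMIT_MSGS}" | grep ..., the variable is double-quoted, but echo can still misbehave when the git log output is empty or contains special characters (for example a leading -n or backslash sequences, depending on the echo implementation). It is safer to let grep read standard input directly or to use a <<< here-string.
Error handling for the git command: 2>/dev/null || true keeps a git error from crashing the whole step, but if git itself fails because of a network problem or a missing ref, COMMIT_MSGS is empty and the check is bypassed. A CI security check should block on failure (fail closed), not skip on failure.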
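To make the false-positive concern concrete, the sketch below runs the naive pattern quoted in the review against a few made-up trailers. The names and addresses are hypothetical, and grep -Ei is only assumed to behave like the matching done in the workflow.

```shell
#!/bin/sh
# Naive pattern quoted in the review: any "ai" after the trailer key.
NAIVE_PATTERN='Co-authored-by:.*ai'

# Returns 0 (true) when the given trailer line would be blocked.
is_blocked_naive() {
  printf '%s\n' "$1" | grep -Eiq "$NAIVE_PATTERN"
}

# Hypothetical human co-authors; none of them is an AI tool.
for trailer in \
  'Co-authored-by: Daisy Miller <daisy@example.com>' \
  'Co-authored-by: Bob Lee <bob@gmail.com>' \
  'Co-authored-by: Bob Lee <bob@example.org>'
do
  if is_blocked_naive "$trailer"; then
    echo "BLOCKED: $trailer"
  else
    echo "ok:      $trailer"
  fi
done
```

The first two trailers are blocked only because "daisy" and "gmail" happen to contain the letters "ai", which is the kind of accidental hit the review warns about.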

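2. Code quality

Duplicated code (DRY principle): BLOCKED_PATTERN is hardcoded in both steps. Adding new AI keywords later (such as claude or gemini) means editing two places, and one is easily missed.
- Improvement: extract the regex into a job-level environment variable (env) so that both steps reuse it.
Unclear error message: when the check fails, only Blocked pattern: ... is printed, not the line that triggered the rule, which makes the problem hard for the PR author to track down.
- Improvement: alongside grep -Eiq, run grep once more without -q, or otherwise print the matching lines.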
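One way to surface the offending line is sketched below, assuming a simplified, hypothetical pattern and a helper named report_matches that is not part of the PR. grep -n prefixes each hit with its line number.

```shell
#!/bin/sh
# Hypothetical, simplified pattern used only for this illustration.
BLOCKED_PATTERN='co-authored-by:.*copilot'

# Prints each offending line as "<line number>:<text>".
# Returns 0 if at least one line matched, non-zero otherwise.
report_matches() {
  printf '%s\n' "$1" | grep -Ein "$BLOCKED_PATTERN"
}

# Made-up commit message with an AI co-author trailer on line 5.
msg='feat: add parser

Some description.

Co-authored-by: Copilot <copilot@example.com>'

if matches=$(report_matches "$msg"); then
  echo "FAIL: blocked co-author trailer found:" >&2
  printf '  %s\n' "$matches" >&2
fi
```

In a real workflow step the if branch would end with exit 1; it is left out here so the snippet can be run without terminating the shell.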

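3. Code performance

Unnecessary subprocess: echo "${PR_BODY}" | grep ... spawns an external grep process. For string matching, Bash's built-in regex operator =~ performs better and needs no external process.
git log format: git log --format=%B prints each full commit message, including blank lines, so grep has to scan multi-line text. Performance is acceptable, but it can be improved by fetching only the relevant lines: git log --format='%(trailers)' returns only the trailers (such as Co-authored-by), which reduces the amount of data grep has to scan.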
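The idea of scanning only trailers can be tried without a repository. The sketch below swaps in git interpret-trailers --parse, a different command from the git log --format='%(trailers)' call named in the review, and assumes a git version that supports --parse (2.15 or later). The commit message and the helper name extract_trailers are made up.

```shell
#!/bin/sh
# Reads a commit message on stdin and prints only its trailer lines.
# Assumption: git is installed and supports `interpret-trailers --parse`.
extract_trailers() {
  git interpret-trailers --parse
}

# Made-up message: "AI" appears in the body, but no AI tool is a co-author.
msg='feat: add parser

Discussed the AI roadmap in this body text.

Co-authored-by: Jane Doe <jane@example.com>
Signed-off-by: Dev <dev@example.com>'

if command -v git >/dev/null 2>&1; then
  # Only the trailer block reaches grep, so the body text is never scanned.
  printf '%s\n' "$msg" | extract_trailers | grep -Ei '^co-authored-by:' \
    || echo "no co-author trailers"
fi
```

Besides reducing the data to scan, restricting the check to trailers keeps ordinary body text that merely mentions AI from being considered at all.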

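4. Code security

Script injection risk: PR_BODY="${{ github.event.pull_request.body }}" is a serious script injection vulnerability. The ${{ }} expression is replaced with its literal value before the script reaches the shell. If an attacker opens a PR whose body is "; rm -rf / #, the script becomes PR_BODY=""; rm -rf / #" and the malicious command is executed.
- Improvement: do not interpolate the expression into the script. Pass the value through the step's env mapping and read it in the shell as "$PR_BODY", or extract it with jq from the event payload at $GITHUB_EVENT_PATH.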
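The difference between the two approaches can be reproduced locally with a harmless payload. In this sketch, run_unsafe imitates the textual substitution by pasting the value into the script text, while run_safe passes it through the environment. Both function names and the payload are hypothetical, and sh -c only stands in for the runner's shell.

```shell
#!/bin/sh
# Harmless stand-in for a malicious PR body.
payload='"; echo INJECTED; #'

# Unsafe: the value is pasted into the script text, as ${{ }} expansion does.
run_unsafe() {
  sh -c "PR_BODY=\"$1\"; printf 'body=%s\n' \"\$PR_BODY\""
}

# Safe: the value travels as an environment variable and is only read as data.
run_safe() {
  PR_BODY="$1" sh -c 'printf "body=%s\n" "$PR_BODY"'
}

echo "unsafe: $(run_unsafe "$payload")"
echo "safe:   $(run_safe "$payload")"
```

In the unsafe variant the closing quote in the payload ends the assignment early, so echo INJECTED runs as a command. In the safe variant the same bytes are printed back unchanged.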

🚀 Improved code

Combining the suggestions above, I refactored this part of the YAML configuration:

@@ -12,5 +12,40 @@ jobs:
           ref: ${{ github.event.pull_request.head.sha }}
           fetch-depth: 0
 
+      - name: check PR description for AI co-author pattern
+        shell: bash
+        env:
+          # Kept in env for reuse; the regex targets common AI marker words (input is lowercased before matching)
+          BLOCKED_PATTERN: 'co-authored-by:(.*<[^>]*(ai|agent|copilot|llm|gpt)[^>]*@[^>]*>|\s*(ai|agent|copilot|llm|gpt)\b)'
+          PR_BODY: ${{ github.event.pull_request.body }} # Passed through env to avoid script injection
+        run: |
+          if [[ -n "${PR_BODY}" && "${PR_BODY,,}" =~ ${BLOCKED_PATTERN} ]]; then
+            echo "FAIL: PR description contains blocked co-author AI pattern." >&2
+            # Print only the matched text to help debugging, not the full PR body, which may hold sensitive data
+            echo "Matched line: ${BASH_REMATCH[0]}" >&2
+            exit 1
+          fi
+          echo "PASS: PR description does not contain blocked AI co-author pattern."
+
+      - name: check commit messages for AI co-author pattern
+        shell: bash
+        env:
+          BLOCKED_PATTERN: 'co-authored-by:(.*<[^>]*(ai|agent|copilot|llm|gpt)[^>]*@[^>]*>|\s*(ai|agent|copilot|llm|gpt)\b)'
+          BASE_SHA: ${{ github.event.pull_request.base.sha }}
+          HEAD_SHA: ${{ github.event.pull_request.head.sha }}
+        run: |
+          # Collect the commit bodies in the PR range; fail closed if git log errors so the check cannot be bypassed
+          if ! COMMIT_MSGS=$(git log --format='%b' "${BASE_SHA}..${HEAD_SHA}" 2>/dev/null); then
+            echo "ERROR: Failed to retrieve commit messages." >&2
+            exit 1
+          fi
+
+          if [[ -n "${COMMIT_MSGS}" && "${COMMIT_MSGS,,}" =~ ${BLOCKED_PATTERN} ]]; then
+            echo "FAIL: Commit messages contain blocked co-author AI pattern." >&2
+            echo "Matched line: ${BASH_REMATCH[0]}" >&2
+            exit 1
+          fi
+          echo "PASS: Commit messages do not contain blocked AI co-author pattern."
+
       - name: Call Conventional Commits Checker
         uses: linuxdeepin/action-conventionalcommits-checker@master

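💡 Key improvements explained:

Script injection fixed: ${{ github.event.pull_request.body }} now sits in env and the shell reads it as $PR_BODY. Even if the body contains malicious shell commands, it is treated as a plain string and never executed.
Regex tightened: the new BLOCKED_PATTERN shown above targets two cases:
- An email address whose local part contains one of the keywords: <some-ai-bot@github.com>
- A name that starts with one of the keywords: Co-authored-by: copilot
- This greatly lowers the chance of blocking ordinary human authors (such as Johann Smith).
Bash built-in regex: ${PR_BODY,,} lowercases the string and the =~ operator does the matching, replacing echo | grep. This performs better, and ${BASH_REMATCH[0]} exposes the matched text for debugging.
Less hardcoding: BLOCKED_PATTERN is now defined under env. Each step keeps its own copy so the steps stay independent, which means a change to the pattern still has to be made in both steps.
Fail closed: || true is removed, so if git log fails the step errors out, and a commit with an AI co-author can no longer skip the check because of a git error.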
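The fail-closed behavior and the two matching cases can be exercised outside CI with a portable approximation. The sketch below is not the workflow's code: it uses grep -Ei with POSIX character classes instead of Bash's =~, and the names KEYWORDS and check_messages are hypothetical.

```shell
#!/bin/sh
# Portable approximation of the blocked pattern (POSIX ERE, matched with grep -i).
KEYWORDS='(ai|agent|copilot|llm|gpt)'
BLOCKED_PATTERN="co-authored-by:(.*<[^>]*${KEYWORDS}[^>]*@[^>]*>|[[:space:]]*${KEYWORDS}([^[:alnum:]_]|\$))"

# check_messages CMD [ARGS...]
# Runs CMD to obtain commit messages and returns:
#   0 = no blocked trailer, 1 = blocked trailer found, 2 = CMD failed (fail closed).
check_messages() {
  if ! msgs=$("$@" 2>/dev/null); then
    echo "ERROR: failed to retrieve commit messages" >&2
    return 2
  fi
  # grep -n prints the offending lines with their line numbers.
  if printf '%s\n' "$msgs" | grep -Ein "$BLOCKED_PATTERN" >&2; then
    echo "FAIL: blocked co-author pattern found" >&2
    return 1
  fi
  echo "PASS"
}

# Possible use inside a checkout (BASE_SHA and HEAD_SHA are assumed to be set):
#   check_messages git log --format=%b "${BASE_SHA}..${HEAD_SHA}"
```

Because a keyword is matched as a substring of the email local part, an address such as daisy@example.com is still flagged by this approximation. Matching against an explicit list of known bot addresses would be stricter, at the cost of having to maintain that list.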

deepin-ci-robot · 2026-05-14T07:33:37Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: BLumia, hudeng-go

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

debian/deepin/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

hudeng-go mentioned this pull request Apr 29, 2026

GenAI/LLM/AI Agent Guideline deepin-community/rfcs#19

Open

feat: add AI co-author pattern check for PR and commits

a1199c3

Git commit convention: commits must not contain an AI-agent-related co-author. Log:

hudeng-go force-pushed the coauthor-check branch from 270f85f to a1199c3 Compare May 13, 2026 01:32

BLumia approved these changes May 14, 2026

BLumia merged commit a38e737 into linuxdeepin:master May 14, 2026
4 checks passed

BLumia merged 1 commit into linuxdeepin:master from hudeng-go:coauthor-check
