feat: add weekly release notes generator script #21

Open
Aadil-5122 wants to merge 1 commit into main from vorflux/add-release-notes-script

Conversation

@Aadil-5122
Contributor

Summary

Adds generate_release_notes.py -- a script that scans merged PRs across all HydraDB repositories for a configurable time window and generates categorized release notes in markdown.

Repos scanned

cortex-application, cortex-ingestion, cortex-dashboard, hydradb-on-prem-infra, hydradb-cli, hydradb-mcp, hydradb-claude-code, hydradb-bench, python-sdk, ts-sdk, mintlify-docs, docs, openclaw-hydradb

Features

  • Automatic PR categorization (Features, Bug Fixes, Performance, Security, Infrastructure, Documentation, Chores)
  • Contributor stats with bot vs. human breakdown
  • Optional AI executive summary via OpenAI (--dry-run to skip)
  • Output to reports/release-notes-YYYY-MM-DD.md
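As an illustration of the keyword-based categorization listed above, here is a minimal sketch. The keyword table, the first-match-wins ordering, and the names `CATEGORY_KEYWORDS` and `categorize` are assumptions made for this example; the keywords actually used in generate_release_notes.py are not shown in this PR.

```python
import re

# Hypothetical keyword table (not taken from the script). Order matters:
# the first category whose keywords appear in the title wins.
CATEGORY_KEYWORDS = [
    ("Security", {"security", "cve", "vulnerability"}),
    ("Performance", {"perf", "performance", "optimize", "faster"}),
    ("Bug Fixes", {"fix", "fixes", "bug", "hotfix"}),
    ("Features", {"feat", "feature", "add", "implement"}),
    ("Infrastructure", {"infra", "ci", "deploy", "docker", "terraform"}),
    ("Documentation", {"docs", "doc", "readme"}),
]
DEFAULT_CATEGORY = "Chores"


def categorize(title: str) -> str:
    """Return the first category whose keyword appears as a whole word in the title."""
    # Split into lowercase words so a short keyword such as "ci" cannot
    # match inside an unrelated word such as "decision".
    words = set(re.findall(r"[a-z]+", title.lower()))
    for category, keywords in CATEGORY_KEYWORDS:
        if words & keywords:
            return category
    return DEFAULT_CATEGORY
```

Matching on whole words rather than substrings is a design choice of this sketch; it trades some recall for fewer accidental matches.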

Usage

source .venv/bin/activate
python generate_release_notes.py --days 7            # with AI summary
python generate_release_notes.py --days 7 --dry-run  # without AI summary

Required env vars

  • GITHUB_TOKEN -- repo read access
  • OPENAI_API_KEY -- optional, for AI summarization
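The interaction between these variables and --dry-run can be sketched as follows. This is an assumption about how the checks might be structured, not the script's actual code; `missing_required` and `should_summarize` are hypothetical helper names.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_VARS = ("GITHUB_TOKEN",)


def missing_required(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def should_summarize(dry_run: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Request the AI summary only when not in dry-run mode and a key is present."""
    env = os.environ if env is None else env
    return (not dry_run) and bool(env.get("OPENAI_API_KEY", "").strip())
```

Passing the environment in as a mapping keeps both helpers easy to test without touching the real process environment.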

Testing

  • Ran python generate_release_notes.py --days 7 --dry-run successfully
  • Generated reports/release-notes-2026-04-17.md with 63 PRs across 6 active repos
  • Verified categorization, contributor stats, and markdown formatting

Scans merged PRs across all HydraDB repos (cortex-application,
cortex-ingestion, cortex-dashboard, hydradb-on-prem-infra, hydradb-cli,
mintlify-docs, and others) for a configurable time window.

Features:
- Automatic PR categorization (features, fixes, perf, security, etc.)
- Contributor stats
- Optional AI summarization via OpenAI (--dry-run to skip)
- Outputs markdown to reports/release-notes-YYYY-MM-DD.md

Usage: python generate_release_notes.py --days 7
@greptile-apps

greptile-apps bot commented Apr 17, 2026

Greptile Summary

This PR adds generate_release_notes.py, a standalone script that queries merged PRs across 13 HydraDB repositories via the gh CLI, categorizes them by keyword, optionally generates an AI executive summary via OpenAI, and writes a dated markdown report to reports/.

Two correctness defects need attention before this produces reliable output:

  • Bot detection is broken: pr["author"].get("is_bot", False) always returns False because gh pr list --json author exposes login/name, not is_bot. Every bot PR inflates the human contributor count and the "Automated" footer never renders.
  • Silent PR truncation: --limit 100 with no server-side date filter means any repo that merged >100 PRs in the lookback window silently omits the oldest ones. Adding --search "merged:>=DATE" to the gh command eliminates the gap.

Confidence Score: 4/5

Safe to merge with caveats — the script is additive and not in any production path, but two correctness bugs mean the output will have wrong contributor counts and may silently omit PRs on busy repos.

Two P1 findings: (1) bot detection always returns False, causing bot PRs to inflate human contributor stats and the bot summary line to never render; (2) the hardcoded --limit 100 without a server-side date filter silently drops PRs on high-velocity repos. The script is standalone and write-only so these bugs don't cascade, but they do produce incorrect release notes.

Files needing attention: generate_release_notes.py, specifically the fetch_merged_prs function (limit/search filter) and the contributor stats block (bot detection field name).

Important Files Changed

| Filename | Overview |
| --- | --- |
| generate_release_notes.py | New script that fetches merged PRs via gh CLI and generates categorized markdown release notes; has two correctness bugs: bot detection always returns False (wrong field name), and --limit 100 without server-side date filtering silently drops PRs on high-velocity repos. |

Sequence Diagram

sequenceDiagram
    participant User
    participant Script as generate_release_notes.py
    participant GH as gh CLI / GitHub API
    participant OAI as OpenAI API
    participant FS as reports/ directory

    User->>Script: python generate_release_notes.py --days 7
    loop For each of 13 repos
        Script->>GH: gh pr list --state merged --limit 100 --json ...
        GH-->>Script: list of PRs (up to 100)
        Script->>Script: filter by merged_at >= since (client-side)
    end
    Script->>Script: categorize PRs by title keywords
    alt not --dry-run and OPENAI_API_KEY set
        Script->>OAI: chat.completions.create(gpt-4o-mini)
        OAI-->>Script: executive summary text
    end
    Script->>Script: build markdown (categories, contributors, bot count)
    Script->>FS: write release-notes-YYYY-MM-DD.md
    FS-->>User: file path printed to stdout

Reviews (1): Last reviewed commit: "feat: add weekly release notes generator..."

Comment thread generate_release_notes.py
Comment on lines +67 to +73
cmd = [
    "gh", "pr", "list",
    "--repo", f"{owner}/{name}",
    "--state", "merged",
    "--json", "number,title,author,mergedAt,url,body,labels",
    "--limit", "100",
]

P1 --limit 100 silently drops PRs on active repos

gh pr list returns the 100 most recently merged PRs and date filtering happens client-side. For any repo that merged more than 100 PRs within the lookback window the script silently omits the oldest ones — there is no warning or error. For cortex-application or cortex-ingestion during a busy sprint this is a real risk.

Adding a server-side date filter via --search and raising the limit eliminates the gap:

Suggested change
- cmd = [
-     "gh", "pr", "list",
-     "--repo", f"{owner}/{name}",
-     "--state", "merged",
-     "--json", "number,title,author,mergedAt,url,body,labels",
-     "--limit", "100",
- ]
+ cmd = [
+     "gh", "pr", "list",
+     "--repo", f"{owner}/{name}",
+     "--state", "merged",
+     "--json", "number,title,author,mergedAt,url,body,labels",
+     "--search", f"merged:>={since.strftime('%Y-%m-%d')}",
+     "--limit", "500",
+ ]

This lets GitHub pre-filter by merge date and the higher ceiling keeps the client-side pass as a safety net only.
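A minimal sketch of how the suggested command and the client-side safety net could fit together, with a check that surfaces possible truncation instead of dropping PRs silently. The helper names (`build_pr_list_cmd`, `filter_merged_since`, `may_be_truncated`) are hypothetical, and the sketch assumes `mergedAt` arrives as an ISO 8601 UTC string such as `2026-04-12T08:30:00Z` and that `since` is timezone-aware.

```python
from datetime import datetime, timezone

JSON_FIELDS = "number,title,author,mergedAt,url,body,labels"


def build_pr_list_cmd(owner: str, name: str, since: datetime, limit: int = 500) -> list:
    """Build the gh invocation with a server-side merge-date filter."""
    return [
        "gh", "pr", "list",
        "--repo", f"{owner}/{name}",
        "--state", "merged",
        "--json", JSON_FIELDS,
        "--search", f"merged:>={since.strftime('%Y-%m-%d')}",
        "--limit", str(limit),
    ]


def filter_merged_since(prs: list, since: datetime) -> list:
    """Client-side safety net: keep PRs merged at or after `since`."""
    kept = []
    for pr in prs:
        # Assumed timestamp shape: "2026-04-12T08:30:00Z"; the trailing Z is
        # rewritten so fromisoformat also accepts it on older Python versions.
        merged_at = datetime.fromisoformat(pr["mergedAt"].replace("Z", "+00:00"))
        if merged_at >= since:
            kept.append(pr)
    return kept


def may_be_truncated(prs: list, limit: int) -> bool:
    """True when the result count reaches the limit, so older PRs may be missing."""
    return len(prs) >= limit
```

The search qualifier filters at day granularity, so the client-side pass still trims PRs merged earlier on the boundary day. When `may_be_truncated` returns True, the caller can print a warning or raise the limit.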

Comment thread generate_release_notes.py
"gh", "pr", "list",
"--repo", f"{owner}/{name}",
"--state", "merged",
"--json", "number,title,author,mergedAt,url,body,labels",

P2 Unused fields fetched from GitHub API

body and labels are included in the --json fields but are never referenced anywhere in the script. Removing them reduces payload size and keeps the API call minimal.

Suggested change
- "--json", "number,title,author,mergedAt,url,body,labels",
+ "--json", "number,title,author,mergedAt,url",

Comment thread generate_release_notes.py
Comment on lines +243 to +250
    if not pr["author"].get("is_bot", False):
        contributors[author] += 1
for author, count in sorted(contributors.items(), key=lambda x: -x[1]):
    lines.append(f"- @{author} ({count} PRs)")
lines.append("")

# Bot contributions
bot_count = sum(1 for _, pr in all_prs if pr["author"].get("is_bot", False))

P1 Bot detection always evaluates to False

gh pr list --json author does not include an is_bot field — the returned object only contains login (and sometimes name). As a result, pr["author"].get("is_bot", False) is always False, every bot PR gets counted in the human contributors list, and bot_count is permanently 0 so the "Automated (Vorflux bot)" line never appears.

The reliable fix is to check whether the login ends with [bot], which is GitHub's naming convention for all Apps and automation accounts (e.g. renovate[bot], dependabot[bot], vorflux[bot]). Replace pr["author"].get("is_bot", False) with pr["author"].get("login", "").endswith("[bot]") in both the contributors filter (line 243) and the bot_count sum (line 250).
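One way to apply the same check in both places is a shared helper, sketched below. `is_bot_author` and `contributor_stats` are hypothetical names, and `all_prs` is assumed to be an iterable of `(repo, pr)` pairs, as the `for _, pr in all_prs` loop in the snippet implies. The helper also honors an `is_bot` flag if the payload happens to carry one, which is a defensive assumption rather than something this review confirms.

```python
from collections import Counter


def is_bot_author(author: dict) -> bool:
    """Treat an author as a bot if flagged, or if the login ends with "[bot]"."""
    if author.get("is_bot"):
        return True
    return author.get("login", "").endswith("[bot]")


def contributor_stats(all_prs):
    """Return (human PR counts by login, total bot PR count)."""
    humans = Counter()
    bot_count = 0
    for _repo, pr in all_prs:
        author = pr.get("author") or {}
        if is_bot_author(author):
            bot_count += 1
        else:
            humans[author.get("login", "unknown")] += 1
    return humans, bot_count
```

Computing both numbers in one pass keeps the contributors list and the bot footer from drifting apart if the detection rule changes later.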
