zlyuan9/git-commit-finetune
# commit-msg-finetune

Fine-tune a small language model on your own git history to generate commit messages locally. Zero latency, zero API cost.

```sh
git add .
python suggest.py
# -> "Add deck composition features to observation for card-counting"
```

## How it works

  1. Extract (diff, commit_message) pairs from your GitHub repos
  2. Fine-tune Qwen3-0.6B with LoRA (~10 min on M-series Mac)
  3. Suggest commit messages from staged changes in <1 second

The model learns YOUR commit style — vocabulary, conventions, and what parts of a diff matter.
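Step 1 above walks each repo's history and pairs every commit's patch with its message. The repo's own `extract_diff_commit_pairs.py` handles this; the following is only a minimal sketch of the idea using plain `git` subprocess calls (function name and filtering are illustrative, not the script's actual implementation):

```python
import subprocess

def extract_pairs(repo_path, limit=200):
    """Yield {diff, message} pairs from a repo's history (simplified sketch)."""
    hashes = subprocess.run(
        ["git", "-C", repo_path, "log", f"-{limit}", "--pretty=%H"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    for h in hashes:
        # Commit subject line only
        msg = subprocess.run(
            ["git", "-C", repo_path, "show", "-s", "--pretty=%s", h],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        # Empty pretty format leaves just the patch
        diff = subprocess.run(
            ["git", "-C", repo_path, "show", "--pretty=format:", h],
            capture_output=True, text=True, check=True,
        ).stdout
        if msg and diff.strip():
            yield {"diff": diff, "message": msg}
```

Merge commits, empty diffs, and oversized patches would need extra filtering in practice, which is what the curation step (`build_dataset.py`) is for.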

## Requirements

  • Apple Silicon Mac (M1/M2/M3/M4) with 8+ GB RAM
  • Python 3.10+
  • GitHub CLI (gh) for fetching repos

## Quickstart

```sh
# 1. Setup
git clone https://github.com/YOUR_USER/commit-msg-finetune.git
cd commit-msg-finetune
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Build dataset from your GitHub repos
./fetch_and_extract.sh          # clones repos, extracts diffs
python build_dataset.py         # curates into train/valid/test

# 3. Train (~10 min)
make train

# 4. Use it
git add .
python suggest.py
```

Or step-by-step with make:

```sh
make fetch      # clone repos + extract pairs
make dataset    # curate + split
make train      # LoRA fine-tune
make suggest    # generate message for staged changes
make evaluate   # ROUGE-L scores on test set
```
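`make evaluate` scores generated messages against the real commit messages with ROUGE-L, which measures the longest common subsequence of tokens. A minimal sketch of the metric (the repo's `evaluate.py` may tokenize or aggregate differently):

```python
def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 over whitespace tokens (illustrative sketch)."""
    ref, cand = reference.split(), candidate.split()
    # Longest common subsequence via dynamic programming
    dp = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref):
        for j, c in enumerate(cand):
            dp[i + 1][j + 1] = dp[i][j] + 1 if r == c else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_l_f1("add unit tests", "add tests")` is 0.8: the common subsequence "add tests" covers all of the candidate but only two of the reference's three tokens.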

## Project structure

```text
commit-msg-finetune/
├── fetch_and_extract.sh            # Clone GitHub repos + extract pairs
├── extract_diff_commit_pairs.py    # Git history -> (diff, message) JSONL
├── build_dataset.py                # Curate + split into train/valid/test
├── suggest.py                      # Inference: staged diff -> commit message
├── evaluate.py                     # Test set evaluation (ROUGE-L)
├── Makefile                        # Convenience commands
├── DEVELOPER_GUIDE.md              # Deep implementation guide
├── data/
│   ├── train.jsonl                 # Training examples (ChatML format)
│   ├── valid.jsonl                 # Validation set
│   └── test.jsonl                  # Test set
└── adapters/                       # LoRA weights (after training)
```
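Each line of `train.jsonl` is one ChatML-style conversation: the diff as the user turn, the real commit message as the assistant turn. A sketch of what a record might look like (the exact system prompt here is an assumption, not the repo's actual template):

```python
import json

def to_chatml_record(diff: str, message: str) -> str:
    """Serialize one (diff, message) pair as a ChatML-style JSONL line.
    The system prompt below is illustrative only."""
    record = {
        "messages": [
            {"role": "system", "content": "Write a concise git commit message for this diff."},
            {"role": "user", "content": diff},
            {"role": "assistant", "content": message},
        ]
    }
    return json.dumps(record)
```

Keeping train, valid, and test in the same format means the chat template applied at training time matches the one used by `suggest.py` at inference time.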

## Configuration

Training defaults can be overridden via make variables:

```sh
make train MODEL=Qwen/Qwen3-4B ITERS=500 BATCH=2
```

| Variable | Default | Description |
|---|---|---|
| `MODEL` | `Qwen/Qwen3-0.6B` | Base model (HuggingFace ID) |
| `ITERS` | 300 | Training iterations |
| `BATCH` | 4 | Batch size |
| `LR` | 1e-4 | Learning rate |
| `MAX_SEQ` | 2048 | Max sequence length |
| `NUM_LAYERS` | 16 | LoRA layers |

## Scaling up

If 0.6B quality isn't enough, try a larger model — same pipeline:

| Model | Params | RAM needed | Quality |
|---|---|---|---|
| Qwen3-0.6B | 0.6B | ~2 GB | Good for common patterns |
| Qwen3-1.7B | 1.7B | ~4 GB | Better phrasing |
| Qwen3-4B | 4B | ~8 GB | Handles complex diffs |
| Qwen3-8B | 8B | ~14 GB | Near-human quality |

```sh
make train MODEL=Qwen/Qwen3-4B BATCH=2
python suggest.py --model Qwen/Qwen3-4B
```

## How diffs are handled

Large diffs are compressed to fit the model's context window (same rules during training and inference):

| Diff size | Strategy |
|---|---|
| < 300 lines | Full patch |
| 300–1000 lines | `git diff --stat` + first 300 lines |
| > 1000 lines | `git diff --stat` only |
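The three-tier rule above can be sketched as a small function that takes the full patch and its `--stat` summary (the real pipeline's function name and thresholds-as-parameters are assumptions for illustration):

```python
def compress_diff(full_diff: str, stat: str, max_full: int = 300, max_head: int = 1000) -> str:
    """Shrink a diff to fit the context window using the size rules above (sketch)."""
    lines = full_diff.splitlines()
    if len(lines) < max_full:
        return full_diff                               # small: keep the full patch
    if len(lines) <= max_head:
        return stat + "\n" + "\n".join(lines[:max_full])  # medium: stat + first 300 lines
    return stat                                        # huge: stat summary only
```

Because the same rule runs during dataset extraction and in `suggest.py`, the model never sees a diff shape at inference time that it wasn't trained on.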

## License

MIT
