Git Commit Message Generator

AI-powered CLI tool that automatically generates professional git commit messages from code diffs using a fine-tuned Qwen-0.5B language model.

Live Demo: HuggingFace Model

Features

Accurate - Trained on 100K+ real commit messages
Fast - Generates messages in ~2 seconds
Consistent - Deterministic output (same diff = same message)
Clean - Professional, concise commit messages
Easy - Simple CLI: commit-gen generate

Demo

project_demo.mov

Click to watch the tool in action - generating commit messages from git diffs

Quick Start

Installation

# Clone the repository
git clone https://github.com/rajtiwariee/GitCommitGenerator
cd GitCommitGenerator

# Install dependencies
pip install -r requirements.txt

# Install CLI tool
pip install -e .

Usage

# Generate message for staged changes
git add file.py
commit-gen generate

# Generate and auto-commit
commit-gen generate --commit

# Interactive mode (edit before committing)
commit-gen generate --commit --interactive

# View configuration
commit-gen config --show

Model Training

Dataset Preparation

# Download CommitPackFT dataset
python scripts/download_dataset.py

# Clean and filter data
python scripts/prepare_dataset.py

# Format for training
python scripts/format_for_training.py

Fine-tuning

# Train with LoRA (recommended: ~7.5 hours on T4 GPU)
python train.py \
  --num_epochs 3 \
  --batch_size 4 \
  --gradient_accumulation_steps 8 \
  --use_lora \
  --gradient_checkpointing \
  --max_length 384 \
  --use_fp16

Evaluation

# Run automated metrics
python scripts/test_model.py --num_samples 100

# Interactive testing
python scripts/qualitative_test.py

# Plot training loss
python scripts/plot_loss.py

Model Details

Base Model: Qwen-0.5B
Training Method: LoRA (Low-Rank Adaptation)
Dataset: CommitPackFT (55K filtered samples)
Hardware: T4 GPU (16GB VRAM)
Training Time: ~7.5 hours
Parameters: 493M (base) + 35MB (LoRA adapters)

Training Configuration:

Epochs: 3
Batch Size: 4 (effective: 32 with gradient accumulation)
Learning Rate: 5e-5
LoRA r=16, alpha=32
Mixed Precision (FP32 + FP16)

Training Loss

The model converged smoothly from an initial loss of 1.68 to a final loss of ~0.87 over 3 epochs.

Project Structure

GitCommitGenerator/
├── commit_gen/              # CLI tool package
│   ├── cli.py              # Main CLI interface
│   ├── generator.py        # Model inference
│   ├── git_utils.py        # Git integration
│   └── config.py           # Configuration
├── scripts/                # Utilities
│   ├── download_dataset.py
│   ├── prepare_dataset.py
│   ├── test_model.py
│   ├── qualitative_test.py
│   └── upload_to_huggingface.py
├── train.py                # Training script
├── setup.py                # Package setup
└── requirements.txt        # Dependencies

Documentation

CLI_README.md - Detailed CLI usage guide
DEMO_SCRIPT.md - Demo/presentation script

Publishing to HuggingFace

# Set your token
export HF_TOKEN="your_token_here"

# Upload model
python scripts/upload_to_huggingface.py \
  --model_path ./qwen-0.5b-finetuned/final \
  --repo_name your-username/model-name \
  --metrics_file evaluation_results.json

License

MIT

Author

Raj Tiwari (@rajtiwariee)

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
assets		assets
commit_gen		commit_gen
scripts		scripts
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
evaluation_results.json		evaluation_results.json
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.py		setup.py
test_base_model.py		test_base_model.py
test_nb.ipynb		test_nb.ipynb
train.py		train.py
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Git Commit Message Generator

Features

Demo

Quick Start

Installation

Usage

Model Training

Dataset Preparation

Fine-tuning

Evaluation

Model Details

Training Loss

Project Structure

Documentation

Publishing to HuggingFace

License

Author

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Git Commit Message Generator

Features

Demo

Quick Start

Installation

Usage

Model Training

Dataset Preparation

Fine-tuning

Evaluation

Model Details

Training Loss

Project Structure

Documentation

Publishing to HuggingFace

License

Author

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages