Binary sentiment classifier (Positive/Negative) built by fine-tuning DistilBERT on the Stanford Sentiment Treebank (SST-2) dataset. Served via a FastAPI REST API with single and batch prediction endpoints.
| Metric | Value |
|---|---|
| Model | distilbert-base-uncased |
| Dataset | SST-2 (GLUE benchmark) |
| Validation Accuracy | 90.71% |
| Weighted F1 | 90.70% |
| F1 — Negative class | 90.32% |
| F1 — Positive class | 91.07% |
| Training time | ~15 min (RTX 3050 Laptop 4GB) |
| Inference (single) | <10ms on GPU |
| Published DistilBERT SST-2 accuracy | 91.3% (ours is within 0.6%) |
| | Predicted Negative | Predicted Positive |
|---|---|---|
| Actual Negative | 378 (TN) | 50 (FP) |
| Actual Positive | 31 (FN) | 413 (TP) |
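
The headline metrics in the table above follow directly from these counts; a quick check:

```python
# Deriving the reported metrics from the confusion matrix counts
tn, fp, fn, tp = 378, 50, 31, 413

accuracy = (tp + tn) / (tp + tn + fp + fn)          # 0.9071 -> 90.71%
f1_pos = 2 * tp / (2 * tp + fp + fn)                # 0.9107 -> 91.07%
f1_neg = 2 * tn / (2 * tn + fn + fp)                # 0.9032 -> 90.32%
f1_weighted = (428 * f1_neg + 444 * f1_pos) / 872   # 0.9070 -> 90.70% (class support: 428 neg, 444 pos)
```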
```
sentiment-classifier/
├── data_exploration.py     # dataset analysis and length stats
├── train.py                # fine-tuning with HuggingFace Trainer API
├── evaluate_model.py       # classification report + confusion matrix
├── app.py                  # FastAPI REST API
├── requirements.txt
├── README.md
├── sentiment-model/
│   └── best/               # saved model weights + tokenizer
└── assets/
    └── confusion_matrix.png
```
```bash
git clone https://github.com/yourusername/sentiment-classifier
cd sentiment-classifier
pip install -r requirements.txt
```

```bash
python train.py
```

Trains for 3 epochs on SST-2 (67,349 samples). Checkpoints are saved after each epoch, and the best model is selected by validation accuracy.
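
For reference, a minimal sketch of loading the dataset the way the training script does (the field names are the standard ones for GLUE SST-2 in `datasets`; the exact code in `train.py` may differ):

```python
from datasets import load_dataset

# SST-2 from the GLUE benchmark: 67,349 training and 872 validation sentences
dataset = load_dataset("glue", "sst2")
print(dataset["train"].num_rows, dataset["validation"].num_rows)

# The kind of length statistics data_exploration.py reports
lengths = [len(s.split()) for s in dataset["train"]["sentence"]]
print(f"average sentence length: {sum(lengths) / len(lengths):.1f} words")
```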
```bash
python evaluate_model.py
```

Generates a classification report and saves the confusion matrix to `assets/confusion_matrix.png`.
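
A rough sketch of what this step looks like, assuming the fine-tuned checkpoint lives in `sentiment-model/best/` as in the layout above; the actual `evaluate_model.py` may batch and organise things differently:

```python
import matplotlib.pyplot as plt
import torch
from datasets import load_dataset
from sklearn.metrics import ConfusionMatrixDisplay, classification_report
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = "sentiment-model/best"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir).eval()

# Score the 872-sentence SST-2 validation split
val = load_dataset("glue", "sst2", split="validation")
enc = tokenizer(val["sentence"], truncation=True, max_length=128, padding=True, return_tensors="pt")
with torch.no_grad():
    preds = model(**enc).logits.argmax(dim=-1).tolist()

print(classification_report(val["label"], preds, target_names=["Negative", "Positive"]))
ConfusionMatrixDisplay.from_predictions(val["label"], preds, display_labels=["Negative", "Positive"])
plt.savefig("assets/confusion_matrix.png")
```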
```bash
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```

Interactive docs are available at http://127.0.0.1:8000/docs.
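
For orientation, a stripped-down sketch of what the single-prediction route in `app.py` does; the route path `/predict` and the internals here are assumptions, and only the request/response shapes shown below are taken from the project:

```python
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_DIR = "sentiment-model/best"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR).to(device).eval()

app = FastAPI(title="Sentiment Classifier")

class TextIn(BaseModel):
    text: str

@app.post("/predict")  # route name assumed; check /docs for the real one
def predict(req: TextIn):
    enc = tokenizer(req.text, truncation=True, max_length=128, return_tensors="pt").to(device)
    with torch.no_grad():
        probs = model(**enc).logits.softmax(dim=-1)[0]
    label_id = int(probs.argmax())
    return {
        "text": req.text,
        "label": "Positive" if label_id == 1 else "Negative",  # SST-2 convention: 1 = positive
        "confidence": round(float(probs[label_id]), 4),
    }
```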
Single prediction:

```json
// Request
{"text": "This movie was absolutely fantastic!"}

// Response
{
  "text": "This movie was absolutely fantastic!",
  "label": "Positive",
  "confidence": 0.9999
}
```

Batch prediction:

```json
// Request
{"texts": ["Brilliant film.", "Waste of two hours."]}

// Response
[
  {"text": "Brilliant film.", "label": "Positive", "confidence": 0.9987},
  {"text": "Waste of two hours.", "label": "Negative", "confidence": 0.9971}
]
```

Health check:

```json
{"status": "ok", "device": "cuda", "model": "distilbert-base-uncased fine-tuned SST-2"}
```

**Dynamic padding** — `DataCollatorWithPadding` pads each batch to its longest sequence rather than padding everything to 512 tokens. With SST-2's average sentence length of 9.4 words (~10 tokens), this cuts memory usage per batch by more than 4×.
**`max_length=128`** — the 99th-percentile sentence length in SST-2 is 35 words (~40 tokens after WordPiece). Using 128 instead of the default 512 gives comfortable headroom while keeping batches lean.
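
A sketch of how these two preprocessing choices fit together (the function below is illustrative, not necessarily the exact code in `train.py`):

```python
from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncate at 128 tokens (plenty of headroom over the ~40-token 99th percentile),
    # and deliberately skip padding here.
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

dataset = load_dataset("glue", "sst2").map(tokenize, batched=True)

# Padding is applied per batch, to that batch's longest sequence (typically 10-40
# tokens for SST-2) instead of a fixed 512; this is where the memory savings come from.
collator = DataCollatorWithPadding(tokenizer=tokenizer)
```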
**Warmup ratio** — the learning rate ramps linearly from 0 to 2e-5 over the first 10% of training steps. This protects the pretrained weights from large gradient updates early in fine-tuning, which could otherwise cause catastrophic forgetting.
**`load_best_model_at_end=True`** — reloads the checkpoint with the best validation accuracy rather than keeping the final epoch's weights. Epoch 3 eval loss is typically slightly higher than epoch 2 due to minor overfitting, so this ensures the deployed model is always the best checkpoint.
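
Putting the training-side choices together, a sketch of the `Trainer` setup (continuing the preprocessing sketch above; values not discussed in this README are assumptions, and the real `train.py` may differ):

```python
import evaluate
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return accuracy.compute(predictions=logits.argmax(axis=-1), references=labels)

args = TrainingArguments(
    output_dir="sentiment-model",
    num_train_epochs=3,
    learning_rate=2e-5,
    warmup_ratio=0.1,                  # linear ramp from 0 to 2e-5 over the first 10% of steps
    eval_strategy="epoch",             # called evaluation_strategy in older transformers releases
    save_strategy="epoch",             # one checkpoint per epoch
    load_best_model_at_end=True,       # reload the best checkpoint, not the last one
    metric_for_best_model="accuracy",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],        # tokenized dataset from the sketch above
    eval_dataset=dataset["validation"],
    data_collator=collator,
    compute_metrics=compute_metrics,
)
trainer.train()
```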
- `transformers` 4.40+ — model, tokenizer, `Trainer` API
- `datasets` — SST-2 loading and preprocessing
- `evaluate` — accuracy and F1 metrics
- `PyTorch` — training backend
- `FastAPI` + `uvicorn` — REST API
- `scikit-learn` — confusion matrix and classification report
- `matplotlib` — confusion matrix plot
