A simple neural network classifier that detects angry/urgent messages vs neutral ones using PyTorch and TF-IDF features.
- Binary text classification (angry vs neutral)
- TF-IDF vectorization with uni/bi-grams
- Small feedforward neural network
- Training with backpropagation
- Easy to adapt to your own data
# Clone the repository
git clone https://github.com/yourusername/angry-message-detector.git
cd angry-message-detector
# Install dependencies
pip install -r requirements.txt
python angry_message_detector.py
This will train the model on the included example data and show predictions on test messages.
- Prepare a CSV file with columns text and label:
  - text: your message content
  - label: 1 for angry/urgent, 0 for neutral
- Place your CSV in the data/ directory (e.g., data/my_messages.csv)
- Run the training script:
python train_with_csv.py --data data/my_messages.csv --epochs 50
angry-message-detector/
├── README.md
├── requirements.txt
├── angry_message_detector.py # Main script with example data
├── train_with_csv.py # Script for training with CSV data
├── data/
│   └── example_messages.csv # Example training data
└── models/
    └── .gitkeep
text,label
"This is unacceptable, I've asked three times already!",1
"Hello, could you please check my order status?",0
"I am extremely disappointed with your service.",1
"Thanks for your help, appreciated!",0
- Text Vectorization: Converts messages to TF-IDF numerical features
- Neural Network: 2-layer feedforward network (input → 64 hidden units → output)
- Training: Uses binary cross-entropy loss and backpropagation
- Prediction: Outputs probability that a message is angry (threshold: 0.5)
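The vectorization step above can be sketched with scikit-learn's TfidfVectorizer. This is a minimal illustration, not necessarily the code used in angry_message_detector.py: the ngram_range and max_features values are assumptions based on the "uni/bi-grams" feature and the MAX_FEATURES default mentioned in this README, and the messages are the four rows of the example CSV.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Four messages from the example CSV; labels: 1 = angry/urgent, 0 = neutral.
texts = [
    "This is unacceptable, I've asked three times already!",
    "Hello, could you please check my order status?",
    "I am extremely disappointed with your service.",
    "Thanks for your help, appreciated!",
]
labels = [1, 0, 1, 0]

# Assumed settings: uni- and bi-grams, vocabulary capped at 300 terms.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_features=300)
features = vectorizer.fit_transform(texts)  # sparse matrix, one row per message

# Dense float32 array, the form a PyTorch model would typically consume.
dense = features.toarray().astype("float32")
print(dense.shape)  # (number of messages, vocabulary size)
```

Each row is L2-normalized by default, so longer messages do not automatically get larger feature values than short ones.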
Input (TF-IDF features) → Linear(hidden=64) → ReLU → Dropout(0.1) → Linear(1) → Sigmoid
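The architecture line above maps fairly directly onto PyTorch modules. The following is a sketch under stated assumptions rather than the script's actual code: the class name AngryMessageClassifier is hypothetical, the Adam optimizer is an assumption (the README only gives a learning rate), and random tensors stand in for a mini-batch of TF-IDF vectors.

```python
import torch
from torch import nn


class AngryMessageClassifier(nn.Module):  # hypothetical name
    """Input -> Linear(64) -> ReLU -> Dropout(0.1) -> Linear(1) -> Sigmoid."""

    def __init__(self, input_dim, hidden=64, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Drop the trailing dimension so the output has shape (batch,).
        return self.net(x).squeeze(-1)


torch.manual_seed(0)
model = AngryMessageClassifier(input_dim=300)  # 300 matches the MAX_FEATURES default
loss_fn = nn.BCELoss()  # binary cross-entropy on sigmoid outputs
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimizer choice is an assumption

# Random stand-in for one mini-batch of 4 TF-IDF vectors and their labels.
x = torch.rand(4, 300)
y = torch.tensor([1.0, 0.0, 1.0, 0.0])

# One training step: forward pass, loss, backpropagation, parameter update.
model.train()
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()

# Inference: disable dropout, then apply the 0.5 threshold to the probabilities.
model.eval()
with torch.no_grad():
    probs = model(x)
    predictions = (probs >= 0.5).long()
```

Whether a probability of exactly 0.5 counts as angry is a choice made here for illustration; the README only states the threshold value.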
Edit hyperparameters in the script:
# In angry_message_detector.py or train_with_csv.py
EPOCHS = 25 # Number of training passes
HIDDEN = 64 # Hidden layer size
LEARNING_RATE = 1e-3 # Optimizer learning rate
BATCH_SIZE = 4 # Mini-batch size
MAX_FEATURES = 300 # TF-IDF vocabulary size
On the small example dataset (10 messages), the model quickly learns to distinguish angry from neutral messages. For real-world use, you'll need:
- At least 100-1000 labeled examples
- Balanced classes (similar number of angry/neutral)
- Larger max_features (e.g., 5000-20000)
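Before training on your own data, it can help to check the first two requirements. The sketch below counts labels in a CSV with text and label columns using only the standard library; the helper name label_counts is hypothetical, and an in-memory sample stands in for a real file.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file):
    """Count how many rows carry each label in a text,label CSV."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["label"]) for row in reader)


# In-memory stand-in for a file such as data/my_messages.csv.
SAMPLE_CSV = """text,label
"This is unacceptable, I've asked three times already!",1
"Hello, could you please check my order status?",0
"I am extremely disappointed with your service.",1
"Thanks for your help, appreciated!",0
"""

counts = label_counts(io.StringIO(SAMPLE_CSV))
total = sum(counts.values())
# Ratio of the rarer class to the more common one; 1.0 means perfectly balanced.
ratio = min(counts.values()) / max(counts.values())
print(f"{total} examples, angry={counts[1]}, neutral={counts[0]}, balance={ratio:.2f}")
```

For a real file, pass an open handle instead, for example open("data/my_messages.csv", newline="", encoding="utf-8").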
See train_with_csv.py for examples of:
- Saving trained models to models/
- Loading models for inference
- Saving the TF-IDF vectorizer
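As a rough idea of what saving and loading can look like, here is a minimal sketch that is independent of train_with_csv.py and may differ from it. It assumes the model weights are stored with torch.save as a state dict and the vectorizer with pickle; the file names model.pt and vectorizer.pkl and the build_model helper are hypothetical, and a temporary directory stands in for models/.

```python
import pickle
import tempfile
from pathlib import Path

import torch
from sklearn.feature_extraction.text import TfidfVectorizer
from torch import nn


def build_model(input_dim):
    """Same layer layout as the architecture described in this README."""
    return nn.Sequential(
        nn.Linear(input_dim, 64), nn.ReLU(), nn.Dropout(0.1),
        nn.Linear(64, 1), nn.Sigmoid(),
    )


texts = [
    "I am extremely disappointed with your service.",
    "Thanks for your help, appreciated!",
]
vectorizer = TfidfVectorizer(ngram_range=(1, 2)).fit(texts)
input_dim = len(vectorizer.vocabulary_)
model = build_model(input_dim)  # untrained here; training is omitted for brevity

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pt"        # hypothetical file name
    vec_path = Path(tmp) / "vectorizer.pkl"    # hypothetical file name

    # Save the weights and the fitted vectorizer side by side.
    torch.save(model.state_dict(), model_path)
    with open(vec_path, "wb") as f:
        pickle.dump(vectorizer, f)

    # Load both back for inference.
    restored_model = build_model(input_dim)
    restored_model.load_state_dict(torch.load(model_path))
    restored_model.eval()  # disable dropout for prediction
    with open(vec_path, "rb") as f:
        restored_vectorizer = pickle.load(f)

features = torch.tensor(
    restored_vectorizer.transform(["I am disappointed"]).toarray(),
    dtype=torch.float32,
)
with torch.no_grad():
    prob = restored_model(features).item()
print(f"P(angry) = {prob:.3f}")
```

The vectorizer has to be saved together with the model because the model's input dimension and feature order are tied to the vocabulary learned during fitting.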
Licensed under the MIT License.
Pull requests are welcome! For major changes, please open an issue first.
Built with PyTorch and scikit-learn.