YJLeonMan/Py-EmotionVA

Q-EmotionVA: Facial Emotion Recognition with Valence-Arousal Estimation 🎭



🌟 Project Overview

Q-EmotionVA is a PyTorch framework for facial emotion recognition that jointly classifies 8 discrete emotions and regresses continuous valence and arousal, with webcam demos for real-time use. On the AffectNet dataset, the MixedFeatureNet backbone reaches 55.87% classification accuracy, with a CCC of 0.68 for valence and 0.65 for arousal.

✨ Key Capabilities

| Feature | Description |
| --- | --- |
| 😊 Emotion Classification | Recognize 8 basic emotions (Neutral, Happiness, Sadness, Surprise, Fear, Disgust, Anger, Contempt) |
| 📊 Valence-Arousal Regression | Estimate continuous emotional dimensions |
| ⚡ Real-time Inference | Process video streams at high frame rates |
| 🚀 Multi-backbone Support | Choose from MixedFeatureNet, MobileNetV2, or ShuffleNetV2 |
| 📱 Edge Deployment | Export models to ONNX format for edge devices |

📁 Project Structure

Q-EmotionVA/
├── 🧠 models/                    # Neural network architectures
│   ├── MixedFeatureNet.py        # Custom feature extraction backbone
│   ├── DDAM.py                   # Attention-enhanced emotion model
│   ├── DDAM-mbnet.py             # MobileNetV2 variant
│   └── DDAM-shufflenet.py        # ShuffleNetV2 variant
├── 🛠️ tools/                     # Utility scripts
│   ├── affectnet_train.py        # Training pipeline
│   ├── affectnet_test.py         # Evaluation with confusion matrix
│   ├── video-test-mediapipe.py   # Real-time webcam demo
│   ├── video-test-onnx.py        # ONNX-based real-time demo
│   ├── pth2onnx.py               # Model conversion tool
│   └── data-handler.py           # Dataset preprocessing
├── 💾 checkpoints/               # Trained model weights
└── 📦 pretrained/                # Pre-trained backbones

🛠️ Installation

Prerequisites

  • Python 3.8+
  • PyTorch 1.9+
  • CUDA 11.0+ (for GPU acceleration)

Install Dependencies

pip install torch torchvision numpy pandas opencv-python mediapipe onnxruntime-gpu matplotlib scikit-learn tqdm Pillow

Download Pre-trained Weights

Place these files in the pretrained/ directory:


📊 Dataset Preparation

AffectNet Dataset

  1. Download from AffectNet Official Website 📥
  2. Organize as follows:
AffectNetDataset/
├── Manually_Annotated/
│   ├── Manually_Annotated_Images/   # Raw images
│   ├── training.csv                  # Training annotations
│   └── validation.csv                # Validation annotations
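
Before running the preprocessing step, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` and the `EXPECTED` list are hypothetical and are derived only from the directory tree shown above.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the directory tree in this README.
EXPECTED = [
    "Manually_Annotated/Manually_Annotated_Images",
    "Manually_Annotated/training.csv",
    "Manually_Annotated/validation.csv",
]


def missing_entries(root: str) -> List[str]:
    """Return the expected entries that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]


if __name__ == "__main__":
    # Hypothetical dataset location; replace with your own path.
    missing = missing_entries("AffectNetDataset")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```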

Preprocess Dataset

python tools/data-handler.py

Output:

  • Cropped faces: tiny_facedetect_filter_annotated_images/ 🖼️
  • Annotation JSON: tiny_facedetect_train_filter.json 📄

πŸ‹οΈ Training

Basic Training Command

python tools/affectnet_train.py \
    --aff_path /path/to/affectnet \
    --batch_size 10 \
    --lr 0.0001 \
    --epochs 40 \
    --num_head 2 \
    --num_class 8

Training Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| --aff_path | Dataset root path | /data/affectnet/ |
| --batch_size | Batch size | 10 |
| --lr | Learning rate | 0.0001 |
| --epochs | Training epochs | 40 |
| --num_head | Attention heads | 2 |
| --num_class | Emotion classes | 8 |
| --workers | Data loading threads | 0 |
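
For reference, the flags and defaults listed above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the table only; the actual parser in tools/affectnet_train.py may differ, and the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented training flags with their documented defaults."""
    parser = argparse.ArgumentParser(description="AffectNet training (sketch)")
    parser.add_argument("--aff_path", type=str, default="/data/affectnet/",
                        help="Dataset root path")
    parser.add_argument("--batch_size", type=int, default=10, help="Batch size")
    parser.add_argument("--lr", type=float, default=0.0001, help="Learning rate")
    parser.add_argument("--epochs", type=int, default=40, help="Training epochs")
    parser.add_argument("--num_head", type=int, default=2, help="Attention heads")
    parser.add_argument("--num_class", type=int, default=8, help="Emotion classes")
    parser.add_argument("--workers", type=int, default=0,
                        help="Data loading threads")
    return parser


if __name__ == "__main__":
    # Override two flags; everything else keeps its documented default.
    args = build_parser().parse_args(["--batch_size", "32", "--workers", "4"])
    print(args.batch_size, args.lr, args.workers)
```

With `--workers 0`, PyTorch's `DataLoader` loads data in the main process; raising it is usually worthwhile once the pipeline runs correctly.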

Training Output

Models are saved in checkpoints/ with naming:

affecnet8_epoch{epoch}_acc{accuracy}.pth
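
Because the epoch and accuracy are encoded in the file name, the best checkpoint can be picked without loading any weights. The sketch below assumes the accuracy is written as a decimal fraction, as in `affecnet8_epoch15_acc0.5587.pth`; the helper names `parse_checkpoint_name` and `best_checkpoint` are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Matches the documented pattern affecnet8_epoch{epoch}_acc{accuracy}.pth
CKPT_PATTERN = re.compile(r"affecnet8_epoch(\d+)_acc(\d+(?:\.\d+)?)\.pth$")


def parse_checkpoint_name(name: str) -> Optional[Tuple[int, float]]:
    """Return (epoch, accuracy) parsed from a checkpoint file name, or None."""
    match = CKPT_PATTERN.search(name)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def best_checkpoint(directory: str) -> Optional[Path]:
    """Return the checkpoint with the highest accuracy (ties: latest epoch)."""
    scored = []
    for path in Path(directory).glob("*.pth"):
        parsed = parse_checkpoint_name(path.name)
        if parsed is not None:
            epoch, acc = parsed
            scored.append((acc, epoch, path))
    if not scored:
        return None
    return max(scored, key=lambda item: (item[0], item[1]))[2]


if __name__ == "__main__":
    print(parse_checkpoint_name("affecnet8_epoch15_acc0.5587.pth"))
    print(best_checkpoint("checkpoints"))
```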

🧪 Testing

Evaluate Model Performance

python tools/affectnet_test.py \
    --aff_path /path/to/affectnet \
    --model_path checkpoints/affecnet8_epoch15_acc0.5587.pth \
    --num_head 2 \
    --num_class 8

Test Output

  • ✅ Validation accuracy
  • 📊 Confusion matrix visualization (checkpoints/*.png)
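
To make the two outputs concrete, here is a dependency-free sketch of how a confusion matrix and the accuracy derived from it relate. It is illustrative only: the test script may well rely on `sklearn.metrics.confusion_matrix` instead, and the class-index order below is an assumption taken from the order in which this README lists the 8 emotions.

```python
from typing import List, Sequence

# Assumed index order (0..7), following the emotion list in this README.
EMOTIONS = ["Neutral", "Happiness", "Sadness", "Surprise",
            "Fear", "Disgust", "Anger", "Contempt"]


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_class: int = 8) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_class for _ in range(num_class)]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[true_label][pred_label] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


if __name__ == "__main__":
    y_true = [0, 1, 1, 7]
    y_pred = [0, 1, 2, 7]  # one Happiness sample misread as Sadness
    cm = confusion_matrix(y_true, y_pred)
    print(f"accuracy = {accuracy(cm):.2f}")
    print(f"{EMOTIONS[1]} -> {EMOTIONS[2]}: {cm[1][2]}")
```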

🎬 Real-time Demo

PyTorch Webcam Demo

python tools/video-test-mediapipe.py

ONNX Webcam Demo (Faster)

python tools/video-test-onnx.py

Demo Features

| Feature | Description |
| --- | --- |
| 🎯 Real-time face detection | Powered by MediaPipe |
| 📈 Emotion probability bars | Visualize confidence scores |
| 📉 Valence-Arousal indicators | Real-time emotional state tracking |
| ⌨️ Exit | Press q to quit |

🔄 Model Conversion

Convert to ONNX Format

python tools/pth2onnx.py

Output: checkpoints/mp_MFN_epochXX.onnx 🚀

Use Case: Edge deployment, TensorRT optimization, cross-platform inference


🧠 Model Architecture

Network Overview

Input (112x112x3)
    ↓
Backbone (MixedFeatureNet)
    ↓
Feature Maps (7x7x512)
    ↓
Coordinate Attention Heads
    ↓
Feature Fusion
    ↓
Classification Head → Emotion Probabilities (8 classes)
    ↓
Regression Head → Valence, Arousal

Attention Mechanism

The attention module captures spatial information through:

  1. Horizontal Pooling 🔹 - Capture height-wise patterns
  2. Vertical Pooling 🔸 - Capture width-wise patterns
  3. Channel Interaction 🔄 - Fuse spatial information
  4. Adaptive Weighting ⚖️ - Apply learned attention
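
The four steps above can be sketched as a small PyTorch module in the style of coordinate attention. This is an illustrative approximation, not the code in models/DDAM.py: the class name `CoordAttentionSketch`, the `reduction` ratio, and the use of mean pooling with 1x1 convolutions are all assumptions.

```python
import torch
import torch.nn as nn


class CoordAttentionSketch(nn.Module):
    """Direction-aware attention over a (N, C, H, W) feature map."""

    def __init__(self, channels: int = 512, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)  # assumed bottleneck width
        self.shared = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        self.to_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.to_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        # 1. Horizontal pooling: average over width, one descriptor per row.
        pooled_h = x.mean(dim=3, keepdim=True)                      # (N, C, H, 1)
        # 2. Vertical pooling: average over height, one descriptor per column.
        pooled_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (N, C, W, 1)
        # 3. Channel interaction: a shared 1x1 conv mixes both directions.
        mixed = self.shared(torch.cat([pooled_h, pooled_w], dim=2))
        feat_h, feat_w = torch.split(mixed, [h, w], dim=2)
        # 4. Adaptive weighting: sigmoid gates rescale the input features.
        gate_h = torch.sigmoid(self.to_h(feat_h))                       # (N, C, H, 1)
        gate_w = torch.sigmoid(self.to_w(feat_w.permute(0, 1, 3, 2)))   # (N, C, 1, W)
        return x * gate_h * gate_w


if __name__ == "__main__":
    module = CoordAttentionSketch(channels=512).eval()
    features = torch.randn(2, 512, 7, 7)  # backbone output size from the overview
    print(module(features).shape)
```

The output keeps the input shape, so several such heads can be applied to the same 7x7x512 feature map and fused afterwards.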

Loss Function

Combined objective for multi-task learning:

| Loss Component | Purpose | Weight |
| --- | --- | --- |
| Cross-entropy | Emotion classification | 1.0 |
| Attention Diversity | Encourage diverse feature learning | 0.1 |
| CCC Loss | Valence regression | 2.5 |
| CCC Loss | Arousal regression | 2.5 |
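
The concordance correlation coefficient (CCC) between predictions x and targets y is 2·cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))²); it equals 1 for perfect agreement. The sketch below shows how the weights in the table could combine into one objective. It is written in plain Python for readability, whereas training would use tensors. Treating the CCC loss as `1 - CCC` is an assumption (a common convention), and the function names are hypothetical.

```python
from statistics import fmean
from typing import Sequence


def ccc(pred: Sequence[float], target: Sequence[float]) -> float:
    """Concordance correlation coefficient using population statistics."""
    mean_p, mean_t = fmean(pred), fmean(target)
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    var_t = fmean([(t - mean_t) ** 2 for t in target])
    cov = fmean([(p - mean_p) * (t - mean_t) for p, t in zip(pred, target)])
    denom = var_p + var_t + (mean_p - mean_t) ** 2
    if denom == 0.0:
        return 1.0  # both sequences are the same constant
    return 2.0 * cov / denom


def total_loss(cross_entropy: float, attention_diversity: float,
               ccc_valence: float, ccc_arousal: float) -> float:
    """Weighted sum using the weights from the table (1.0, 0.1, 2.5, 2.5)."""
    return (1.0 * cross_entropy
            + 0.1 * attention_diversity
            + 2.5 * (1.0 - ccc_valence)   # assumed CCC loss form: 1 - CCC
            + 2.5 * (1.0 - ccc_arousal))


if __name__ == "__main__":
    valence_pred = [0.1, 0.5, -0.3, 0.8]
    valence_true = [0.2, 0.4, -0.2, 0.9]
    print(f"valence CCC = {ccc(valence_pred, valence_true):.3f}")
```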

📈 Performance

AffectNet Results

| Backbone | Accuracy | Valence CCC | Arousal CCC |
| --- | --- | --- | --- |
| MixedFeatureNet | 55.87% | 0.68 | 0.65 |
| MobileNetV2 | 54.23% | 0.66 | 0.63 |
| ShuffleNetV2 | 53.89% | 0.65 | 0.62 |

📝 Citation

If you use this work in your research, please cite:

@article{Q-EmotionVA,
    title={Q-EmotionVA: Facial Emotion Recognition with Valence-Arousal Estimation},
    author={Your Name},
    journal={arXiv preprint arXiv:XXXX.XXXXX},
    year={2024}
}

📄 License

This project is licensed under the MIT License - see LICENSE for details.


🙏 Acknowledgments


Built with ❤️ for emotion AI research
