🤟 Sign Language Translator

A real-time web application that enables seamless communication between hearing people and people who use sign language through AI-powered speech-to-sign and sign-to-voice translation.




🌟 Features

Voice to Sign Language

  • Real-time Speech Recognition: Uses Web Speech API for accurate speech-to-text
  • Instant Translation: Converts spoken words to sign language animations/videos
  • Interactive Display: Shows sign videos with descriptions and pronunciation guides
  • Multi-language Support: Extensible to additional spoken languages

Sign Language to Voice

  • Hand Tracking: Real-time hand landmark detection using MediaPipe
  • Gesture Recognition: AI-powered gesture classification using LSTM neural networks
  • Voice Output: Converts recognized gestures to speech using Web Speech API
  • Visual Feedback: Shows hand landmarks and confidence scores during recognition

Technical Features

  • ✅ Responsive Web Design (works on desktop and tablets)
  • ✅ Real-time Camera & Microphone Access
  • ✅ GPU-accelerated Hand Tracking
  • ✅ Scalable REST API Backend
  • ✅ Easy Model Training Pipeline
  • ✅ Comprehensive Error Handling
  • ✅ Logging and Monitoring


🏗️ System Architecture

┌───────────────────────────────────────────────────────────────┐
│                         User Browser                          │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  React Frontend                                         │  │
│  │  ├── Speech Recognition Component                       │  │
│  │  ├── Camera/Hand Tracking Component                     │  │
│  │  ├── Sign Display Component                             │  │
│  │  └── Voice Output Component                             │  │
│  └─────────────────────────────────────────────────────────┘  │
│                               ↓                               │
│                        REST API (CORS)                        │
└───────────────────────────────────────────────────────────────┘
                                ↓
┌───────────────────────────────────────────────────────────────┐
│                    Python FastAPI Backend                     │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  API Routes                                             │  │
│  │  ├── /api/voice-to-sign (POST)                          │  │
│  │  ├── /api/classify-gesture (POST)                       │  │
│  │  └── /api/signs, /api/gestures (GET)                    │  │
│  └─────────────────────────────────────────────────────────┘  │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  AI/ML Pipeline                                         │  │
│  │  ├── Gesture Classifier (LSTM Model)                    │  │
│  │  ├── Hand Tracker (MediaPipe)                           │  │
│  │  └── Sign Mapping Database                              │  │
│  └─────────────────────────────────────────────────────────┘  │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  Services                                               │  │
│  │  ├── Data Preprocessing                                 │  │
│  │  ├── Model Management                                   │  │
│  │  └── Static File Serving                                │  │
│  └─────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────┘

💻 Tech Stack

Frontend

  • React 18.2 - UI Framework
  • TailwindCSS 3.3 - Styling
  • Axios - HTTP Client
  • MediaPipe Hands - Hand Detection (Browser)
  • Web Speech API - Speech Recognition & Text-to-Speech
  • react-webcam - Camera Access

Backend

  • FastAPI 0.104 - Web Framework
  • Python 3.9+ - Language
  • TensorFlow 2.13 - Deep Learning
  • MediaPipe 0.10 - Hand Tracking
  • OpenCV 4.8 - Image Processing
  • NumPy & SciPy - Numerical Computing

DevOps

  • Docker - Containerization
  • Docker Compose - Multi-container Orchestration

🚀 Quick Start

Prerequisites

  • Node.js 16+ and npm
  • Python 3.9+
  • Git

1️⃣ Clone and Setup

# Clone repository
git clone https://github.com/yourusername/sign-language-translator.git
cd sign-language-translator

# Setup Backend
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

# Setup Frontend
cd ../frontend
npm install

2️⃣ Configure Environment

# Backend
cat > backend/.env << EOF
BACKEND_HOST=0.0.0.0
BACKEND_PORT=8000
BACKEND_RELOAD=true
FRONTEND_URL=http://localhost:3000
EOF

# Frontend
cat > frontend/.env << EOF
REACT_APP_API_URL=http://localhost:8000/api
REACT_APP_ENV=development
EOF

3️⃣ Run the Application

# Terminal 1: Start Backend
cd backend
python run.py
# Backend will be available at http://localhost:8000

# Terminal 2: Start Frontend
cd frontend
npm start
# Frontend will be available at http://localhost:3000

4️⃣ Access the Application

Open your browser and go to: http://localhost:3000


📦 Installation

Detailed Backend Setup

cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
# Linux/Mac:
source venv/bin/activate
# Windows:
venv\Scripts\activate

# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# Verify installation
python -c "import tensorflow; print('TensorFlow OK')"
python -c "import mediapipe; print('MediaPipe OK')"

Detailed Frontend Setup

cd frontend

# Install dependencies
npm install

# Install additional packages if needed
npm install axios react-webcam

# Verify installation
npm list react react-dom axios

🎮 Usage

Voice to Sign Language

  1. Click "Start Speaking" button
  2. Speak clearly into your microphone
  3. See the translation - The spoken text appears
  4. Watch the sign - Sign video/animation displays
  5. Use video controls to replay or slow down

Sign Language to Voice

  1. Allow camera access when prompted
  2. Click "Start Recording" to begin
  3. Perform sign gestures in front of camera
  4. Hold gesture for 1-2 seconds
  5. Click "Stop & Classify" to process
  6. Listen to the voice output - Recognition speaks the result

🤖 AI Model Training

Generate Synthetic Data

cd backend

# Create synthetic training data
python ai_model/train.py --synthetic --epochs 50

Train with Real Data

# Prepare data structure
mkdir -p data/processed/hello
mkdir -p data/processed/goodbye
mkdir -p data/processed/thank_you
# ... add more classes as needed

# Each .npy file should have shape (30, 126): 30 frames × 126 features
# (2 hands × 21 landmarks × 3 coordinates per frame)

# Train model
python ai_model/train.py --data-dir ./data/processed --epochs 100 --batch-size 32
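To sanity-check the expected data layout, a single sample can be generated and saved with NumPy (a minimal sketch; the file and class names are illustrative, and a temporary directory stands in for data/processed/):

```python
import tempfile
from pathlib import Path

import numpy as np

FRAMES = 30     # frames per gesture sequence
FEATURES = 126  # 2 hands × 21 landmarks × 3 coordinates (x, y, z)

# One synthetic sample of random landmark data; real samples come from MediaPipe.
sample = np.random.rand(FRAMES, FEATURES).astype(np.float32)

# In the real pipeline this would live under data/processed/<class_name>/;
# a temporary directory is used here so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "hello" / "sample_000.npy"
    out.parent.mkdir(parents=True)
    np.save(out, sample)
    loaded = np.load(out)

assert loaded.shape == (FRAMES, FEATURES)
```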

Model Output

Model saved to: backend/ai_model/gesture_classifier.h5
Training complete! 
Final accuracy: 0.95

🐳 Docker Deployment

Build and Run with Docker

# Build images
docker-compose build

# Start services
docker-compose up

# Stop services
docker-compose down

# View logs
docker-compose logs -f backend
docker-compose logs -f frontend

Docker Services

  • backend - FastAPI API server, exposed on port 8000
  • frontend - React app, exposed on port 3000

📚 API Documentation

Endpoints

1. Health Check

GET /
GET /health

Response:
{
  "status": "healthy",
  "message": "Sign Language Translator API is running",
  "version": "1.0.0"
}

2. Voice to Sign

POST /api/voice-to-sign
Content-Type: application/json

{
  "text": "Hello, how are you?"
}

Response:
{
  "sign": "hello",
  "media_url": "/static/signs/hello.mp4",
  "message": null
}
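A minimal Python client for this endpoint might look like the following (a sketch using only the standard library, assuming the backend runs at the Quick Start default of localhost:8000; the network call only happens when the script is run directly):

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/voice-to-sign"  # assumed default from Quick Start

def build_request(text: str) -> urllib.request.Request:
    """Build the JSON POST request for the voice-to-sign endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    # Requires a running backend; prints e.g. "hello /static/signs/hello.mp4"
    with urllib.request.urlopen(build_request("Hello, how are you?")) as resp:
        result = json.load(resp)
    print(result["sign"], result["media_url"])
```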

3. Gesture Classification

POST /api/classify-gesture
Content-Type: application/json

{
  "landmarks": [
    [[0.5, 0.5, 0.0], [0.6, 0.4, 0.1], ...],  // Frame 1
    [[0.5, 0.5, 0.0], [0.6, 0.4, 0.1], ...],  // Frame 2
    ...
  ]
}

Response:
{
  "gesture": "hello",
  "confidence": 0.95,
  "all_predictions": {
    "hello": 0.95,
    "goodbye": 0.03,
    "thank_you": 0.02
  }
}
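The landmarks payload is one landmark list per captured frame. Flattening raw MediaPipe output into the model's (30, 126) input shape can be sketched as follows (assuming 2 hands × 21 landmarks × 3 coordinates per frame, consistent with the training data format):

```python
import numpy as np

NUM_FRAMES, NUM_HANDS, NUM_LANDMARKS, NUM_COORDS = 30, 2, 21, 3

def flatten_frames(frames: np.ndarray) -> np.ndarray:
    """Flatten (frames, hands, landmarks, xyz) into the (frames, 126) model input."""
    assert frames.shape == (NUM_FRAMES, NUM_HANDS, NUM_LANDMARKS, NUM_COORDS)
    return frames.reshape(NUM_FRAMES, -1)

# Dummy capture standing in for 30 frames of MediaPipe hand landmarks
capture = np.random.rand(NUM_FRAMES, NUM_HANDS, NUM_LANDMARKS, NUM_COORDS)
model_input = flatten_frames(capture)
print(model_input.shape)  # (30, 126)
```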

4. Get Available Signs

GET /api/signs

Response:
{
  "signs": ["hello", "goodbye", "thank_you", ...],
  "total": 50,
  "data": {...}
}

5. Get Available Gestures

GET /api/gestures

Response:
{
  "gestures": ["hello", "goodbye", "thank_you", ...],
  "total": 10,
  "model_loaded": true
}

Interactive API Docs

With the backend running, FastAPI serves interactive API documentation at:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

🔧 Configuration

Backend Configuration (backend/config.py)

# Server
BACKEND_HOST = "0.0.0.0"
BACKEND_PORT = 8000
BACKEND_RELOAD = True

# CORS
ALLOWED_ORIGINS = [
    "http://localhost:3000",
    "http://127.0.0.1:3000"
]

# Model
MODEL_INPUT_SHAPE = (30, 126)
MODEL_CONFIDENCE_THRESHOLD = 0.7
MAX_NUM_HANDS = 2
MIN_DETECTION_CONFIDENCE = 0.5
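MODEL_INPUT_SHAPE follows directly from the hand-tracking settings, and the confidence threshold gates what the classifier reports. A sketch of how these values fit together (the accept_prediction helper is illustrative, not part of the codebase):

```python
from typing import Dict, Optional

MAX_NUM_HANDS = 2
LANDMARKS_PER_HAND = 21  # MediaPipe hand landmark count
COORDS = 3               # x, y, z
SEQUENCE_LENGTH = 30

# 2 hands × 21 landmarks × 3 coords = 126 features per frame
MODEL_INPUT_SHAPE = (SEQUENCE_LENGTH, MAX_NUM_HANDS * LANDMARKS_PER_HAND * COORDS)
MODEL_CONFIDENCE_THRESHOLD = 0.7

def accept_prediction(predictions: Dict[str, float]) -> Optional[str]:
    """Return the top gesture only if it clears the confidence threshold."""
    gesture, confidence = max(predictions.items(), key=lambda kv: kv[1])
    return gesture if confidence >= MODEL_CONFIDENCE_THRESHOLD else None

print(MODEL_INPUT_SHAPE)                                    # (30, 126)
print(accept_prediction({"hello": 0.95, "goodbye": 0.03}))  # hello
print(accept_prediction({"hello": 0.55, "goodbye": 0.45}))  # None
```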

Frontend Configuration (frontend/.env)

REACT_APP_API_URL=http://localhost:8000/api
REACT_APP_ENV=development
REACT_APP_API_TIMEOUT=30000
REACT_APP_LOG_LEVEL=debug

📖 Project Structure

sign-language-translator/
│
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── main.py                 # FastAPI app
│   │   ├── models/
│   │   │   ├── schemas.py          # Pydantic models
│   │   │   ├── sign_mapping.py     # Sign database
│   │   │   └── __init__.py
│   │   ├── routes/
│   │   │   ├── voice_to_sign.py    # Speech→Sign endpoints
│   │   │   ├── gesture_classify.py # Gesture endpoints
│   │   │   └── __init__.py
│   │   ├── services/
│   │   │   ├── preprocessing.py    # Data processing
│   │   │   └── __init__.py
│   │   ├── utils/
│   │   │   ├── hand_tracking.py    # MediaPipe wrapper
│   │   │   └── __init__.py
│   │   └── static/
│   │       └── signs/              # Sign videos/images
│   │
│   ├── ai_model/
│   │   ├── gesture_model.py        # LSTM model
│   │   ├── train.py                # Training script
│   │   └── __init__.py
│   │
│   ├── data/
│   │   ├── raw/                    # Raw training data
│   │   └── processed/              # Processed data
│   │
│   ├── logs/                        # Application logs
│   ├── config.py                   # Configuration
│   ├── run.py                      # Entry point
│   ├── requirements.txt            # Dependencies
│   ├── .env                        # Environment variables
│   └── .gitignore
│
├── frontend/
│   ├── public/
│   │   ├── index.html              # Entry HTML
│   │   └── favicon.ico
│   │
│   ├── src/
│   │   ├── components/
│   │   │   ├── Camera.jsx          # Hand tracking
│   │   │   ├── SpeechInput.jsx     # Speech recognition
│   │   │   ├── SignDisplay.jsx     # Sign display
│   │   │   └── VoiceOutput.jsx     # Text-to-speech
│   │   │
│   │   ├── pages/
│   │   │   └── Home.jsx            # Main page
│   │   │
│   │   ├── services/
│   │   │   └── api.js              # API client
│   │   │
│   │   ├── utils/
│   │   │   └── handTracking.js     # Hand tracking utility
│   │   │
│   │   ├── App.js                  # Root component
│   │   ├── index.js                # React mount
│   │   └── index.css               # Global styles
│   │
│   ├── .env                        # Environment variables
│   ├── .gitignore
│   ├── package.json                # Dependencies
│   ├── tailwind.config.js          # Tailwind config
│   └── postcss.config.js           # PostCSS config
│
├── docker-compose.yml              # Docker setup
├── Dockerfile.backend              # Backend container
├── Dockerfile.frontend             # Frontend container
├── README.md                       # This file
└── .gitignore

🔍 Troubleshooting

Camera/Microphone Not Working

  • Check browser permissions
  • Ensure HTTPS in production (browsers only grant camera/microphone access in a secure context)
  • Check device permissions in OS settings

Model Not Loading

# Train a new model
cd backend
python ai_model/train.py --synthetic

# Or use mock predictions (built-in fallback)

CORS Errors

Update ALLOWED_ORIGINS in backend/config.py:

ALLOWED_ORIGINS = [
    "http://localhost:3000",
    "http://yourdomain.com"
]
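If CORS errors persist, confirm the origins list is actually wired into the app; with FastAPI this is typically done via CORSMiddleware (a sketch; the exact wiring in app/main.py may differ):

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

ALLOWED_ORIGINS = [
    "http://localhost:3000",
    "http://yourdomain.com",
]

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=ALLOWED_ORIGINS,  # avoid "*" in production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```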

Port Already in Use

# Kill process on port 8000
lsof -ti:8000 | xargs kill -9

# Or use different port
BACKEND_PORT=8001 python run.py
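You can also check whether a port is free before starting the server (a standard-library sketch):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.connect_ex((host, port)) == 0

if port_in_use(8000):
    print("Port 8000 is taken; try BACKEND_PORT=8001")
else:
    print("Port 8000 is free")
```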

🚀 Deployment Guide

Cloud Deployment (Heroku Example)

# Log in with the Heroku CLI
heroku login

# Create app
heroku create your-app-name

# Set environment
heroku config:set BACKEND_HOST=0.0.0.0

# Deploy
git push heroku main

Production Checklist

  • Set BACKEND_RELOAD=false
  • Restrict CORS origins to trusted production domains
  • Enable HTTPS
  • Set up logging/monitoring
  • Configure database (optional)
  • Train model on real data
  • Test all endpoints
  • Set up CI/CD pipeline


🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.


🙏 Acknowledgements

  • MediaPipe for hand detection
  • FastAPI for the backend framework
  • React team for the frontend library
  • TensorFlow for deep learning

📞 Support

For issues or questions, please open an issue on the GitHub repository.


Made with ❤️ for accessibility and inclusion
