A deep learning system for real-time fall detection from video using the R(2+1)D-18 spatiotemporal convolutional neural network. Achieves 98.71% F1 score on a custom dataset of ~7,000 video clips.
| Metric | Score |
|---|---|
| F1 Score | 98.71% |
| Accuracy | 98.71% |
| Precision (Fall) | 99% |
| Recall (Fall) | 98% |
| Inference Time | <1 sec/video |
The system uses R(2+1)D-18, a factored 3D CNN that decomposes spatiotemporal convolutions into separate spatial (2D) and temporal (1D) components. This architecture:
- Captures body posture (spatial) and motion dynamics (temporal) simultaneously
- Uses transfer learning from Kinetics-400 (pretrained on 400 action classes)
- Processes 16 frames per clip at 112×112 resolution
- Outputs a binary classification: Fall or No Fall
Video Input → Frame Extraction (16 frames) → Resize (112×112) → R(2+1)D-18 → Softmax → Fall / No Fall
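The resize/normalize stage of this pipeline can be sketched as follows. This is a minimal illustration assuming torchvision's standard Kinetics-400 normalization statistics; `clip_to_tensor` is a hypothetical helper, not code from this repository:

```python
import numpy as np

# torchvision's Kinetics-400 normalization stats (per RGB channel)
MEAN = np.array([0.43216, 0.394666, 0.37645], dtype=np.float32)
STD = np.array([0.22803, 0.22145, 0.216989], dtype=np.float32)

def clip_to_tensor(frames: np.ndarray) -> np.ndarray:
    """Convert 16 RGB frames, shape (16, 112, 112, 3) uint8,
    into the (1, 3, 16, 112, 112) float32 layout R(2+1)D-18 expects."""
    clip = frames.astype(np.float32) / 255.0   # scale pixels to [0, 1]
    clip = (clip - MEAN) / STD                 # per-channel normalization
    clip = clip.transpose(3, 0, 1, 2)          # (T, H, W, C) -> (C, T, H, W)
    return clip[np.newaxis]                    # add batch dimension
```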
- Fall videos: Frames sampled from the latter half (falls typically occur at the end)
- No-Fall videos: Frames sampled uniformly across the full duration
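The two sampling rules above can be sketched as below; `sample_indices` is a hypothetical helper for illustration, not the repository's actual function:

```python
import numpy as np

def sample_indices(num_frames: int, clip_len: int = 16, is_fall: bool = False) -> np.ndarray:
    """Choose clip_len frame indices from a video of num_frames frames.

    Fall clips: sample from the latter half, where the fall usually occurs.
    No-fall clips: sample uniformly across the full duration.
    """
    start = num_frames // 2 if is_fall else 0
    # Evenly spaced indices over the chosen window (repeats indices if the window is short)
    return np.linspace(start, num_frames - 1, clip_len).astype(int)

print(sample_indices(100, is_fall=True))   # all indices fall in [50, 99]
print(sample_indices(100, is_fall=False))  # indices span [0, 99]
```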
```text
Folio_Finder_AI/
├── train_fall_final.py              # Training pipeline
├── predict_fall.py                  # Inference / prediction script
├── r2plus1d_fall_v3.pth             # Best model weights
├── r2plus1d_fall_checkpoint.pth     # Training checkpoint
├── videos_info.csv                  # Full dataset catalog
├── train.csv                        # Training split
├── test.csv                         # Test split
├── confusion_matrix_v3.png          # Confusion matrix visualization
├── training_metrics_v3.png          # Training curves
├── requirements.txt                 # Python dependencies
└── falldataset/
    ├── Fall/
    │   └── Raw_Video/               # Fall event clips
    └── Video/
        └── Raw_Video/               # No-fall activity clips
```
- Python 3.8+
- NVIDIA GPU with CUDA support
- ~10 GB disk space for dataset
```bash
# Clone the repository
git clone https://github.com/[your-username]/Folio_Finder_AI.git
cd Folio_Finder_AI

# Create virtual environment
python -m venv venv
source venv/bin/activate   # Linux/Mac
# venv\Scripts\activate    # Windows

# Install dependencies
pip install -r requirements.txt
```

Contents of `requirements.txt`:

```text
torch>=2.0.0
torchvision>=0.15.0
opencv-python>=4.8.0
pandas>=2.0.0
scikit-learn>=1.3.0
matplotlib>=3.7.0
tqdm>=4.65.0
numpy>=1.24.0
```
```bash
python train_fall_final.py
```

| Parameter | Value |
|---|---|
| Optimizer | Adam |
| Learning Rate | 0.0001 |
| Batch Size | 16 |
| Clip Length | 16 frames |
| Input Resolution | 112 × 112 |
| Max Epochs | 12 |
| Early Stopping | Patience 4 (F1-based) |
| Mixed Precision | Enabled (AMP) |
| Class Weights | No_Fall: 0.899, Fall: 3.304 |
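The class weights in the table come from the inverse-frequency computation mentioned in the training pipeline. One common form is sketched below; the class counts are illustrative, and the exact normalization in `train_fall_final.py` may differ:

```python
import numpy as np

def inverse_frequency_weights(labels: np.ndarray, num_classes: int = 2) -> np.ndarray:
    """w_c = N / (num_classes * count_c): rarer classes get proportionally larger weights."""
    counts = np.bincount(labels, minlength=num_classes)
    return len(labels) / (num_classes * counts)

# Illustrative counts (not the dataset's exact numbers): 1,384 Fall vs. 4,200 No_Fall
labels = np.concatenate([np.zeros(1384, dtype=int), np.ones(4200, dtype=int)])
w = inverse_frequency_weights(labels)
print(w)  # the minority Fall class (label 0) receives the larger weight
```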
- Loads `train.csv`/`test.csv` splits (or generates them from `videos_info.csv`)
- Computes inverse-frequency class weights to handle class imbalance
- Initializes R(2+1)D-18 with Kinetics-400 pretrained weights
- Trains with weighted cross-entropy loss + mixed precision
- Evaluates on test set after each epoch
- Saves best model (by F1) and latest checkpoint
- Generates confusion matrix and training curves
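One step of the weighted-loss, mixed-precision loop described above might look like this. It is a simplified sketch with a stand-in model, not the repository's actual loop:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"   # AMP only pays off on GPU

model = torch.nn.Linear(16, 2).to(device)   # stand-in for R(2+1)D-18
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
class_weights = torch.tensor([3.304, 0.899], device=device)   # [Fall, No_Fall]
criterion = torch.nn.CrossEntropyLoss(weight=class_weights)

def train_step(inputs: torch.Tensor, labels: torch.Tensor) -> float:
    optimizer.zero_grad()
    with torch.autocast(device_type=device, enabled=use_amp):
        loss = criterion(model(inputs), labels)
    scaler.scale(loss).backward()   # scale gradients to avoid fp16 underflow
    scaler.step(optimizer)          # unscales, skips the step on inf/NaN
    scaler.update()
    return loss.item()

loss = train_step(torch.randn(8, 16, device=device),
                  torch.randint(0, 2, (8,), device=device))
```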
| File | Description |
|---|---|
| `r2plus1d_fall_v3.pth` | Best model weights |
| `r2plus1d_fall_checkpoint.pth` | Latest checkpoint (resumable) |
| `confusion_matrix_v3.png` | Test set confusion matrix |
| `training_metrics_v3.png` | Loss / Accuracy / F1 curves |
```bash
python predict_fall.py "path/to/video.mp4"
```

Example output:

```text
Loading model...
Processing video: test_fall.mp4
Reading frames: 16 frames extracted
Prediction: Fall (confidence: 98.72%)
```
```python
import torch
from torchvision.models.video import r2plus1d_18
import cv2
import numpy as np

# Load model
model = r2plus1d_18(weights=None)
model.fc = torch.nn.Linear(512, 2)
model.load_state_dict(torch.load("r2plus1d_fall_v3.pth", map_location="cpu"))
model.eval()

# Process video into a (1, 3, 16, 112, 112) RGB tensor
# ... frame extraction logic ...

with torch.no_grad():
    output = model(video_tensor)              # logits, shape (1, 2)
    probs = torch.softmax(output, dim=1)[0]   # [P(Fall), P(No_Fall)]
label = "Fall" if probs[0] > probs[1] else "No Fall"   # label 0 = Fall
confidence = probs.max().item() * 100
print(f"{label} ({confidence:.2f}%)")
```

| Property | Value |
|---|---|
| Total clips | ~6,982 |
| Train set | ~5,584 (80%) |
| Test set | 1,398 (20%) |
| Classes | 2 (Fall, No_Fall) |
| Avg duration | 1–8 seconds |
| Frame rates | 15–120 FPS |
| Resolutions | 480p to 4K (normalized) |
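An 80/20 split like the one above is typically regenerated with a stratified split so both sets keep the class ratio. A sketch with a toy catalog (the repository's actual split logic may differ):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy catalog standing in for videos_info.csv (label 0 = Fall, 1 = No_Fall)
df = pd.DataFrame({
    "filename": [f"clip_{i}.mp4" for i in range(100)],
    "label": [0] * 30 + [1] * 70,
})
train_df, test_df = train_test_split(
    df, test_size=0.2, stratify=df["label"], random_state=42
)
print(len(train_df), len(test_df))   # 80 20
print(test_df["label"].value_counts().to_dict())   # class ratio preserved: 6 Fall, 14 No_Fall
```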
- Public Kaggle datasets (Fall Detection Dataset, Fall Video Dataset)
- Original recordings (smartphone, 1080p, 30fps — Sept 2024)
- Research benchmarks (SisFall-derived, multi-camera setups)
Each video is cataloged in `videos_info.csv`:

```text
filename,path,num_frames,fps,width,height,duration_sec,label
example_fall.mp4,falldataset/Fall/Raw_Video/example_fall.mp4,57,30.0,1920,1080,1.9,0
example_nofall.mp4,falldataset/Video/Raw_Video/example_nofall.mp4,91,30.0,1100,1080,3.0,1
```

> Note: Label `0` = Fall, Label `1` = No_Fall
| Method | Type | Reported Score | Hardware |
|---|---|---|---|
| R(2+1)D-18 (Ours) | Video | 98.71% | RTX 3070 |
| YOLOv8 + Transformer | Video | mAP 99.55% | High-end GPU |
| 4S-3DCNN | Video | 99.03% | Multi-GPU |
| CNN-LSTM | Video + Sensor | 96.4% | GPU |
| DSCS | Sensor only | 99.32% | CPU |
| Random Forest | Sensor only | 97.47% | CPU |
| LSTM | Sensor only | 80.0% | CPU |
- Deep Learning: PyTorch, torchvision
- Video Processing: OpenCV
- Data Management: pandas, NumPy
- Evaluation: scikit-learn
- Visualization: matplotlib
- Training Optimization: CUDA AMP (mixed precision), DataLoader with pin_memory
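The `pin_memory` optimization mentioned above is a `DataLoader` flag. Below is a sketch with a dummy dataset; the real pipeline loads decoded video clips rather than random tensors:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in: 32 "clips" with the model's input shape (3, 16, 112, 112)
dataset = TensorDataset(torch.randn(32, 3, 16, 112, 112),
                        torch.randint(0, 2, (32,)))
loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,
    num_workers=2,     # decode/augment clips in parallel worker processes
    pin_memory=True,   # page-locked host memory speeds up async CPU->GPU copies
)
clips, labels = next(iter(loader))
print(clips.shape)  # torch.Size([16, 3, 16, 112, 112])
```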
<details>
<summary>Click to expand full training history</summary>

```text
Epoch  1/12 | Train Loss: 0.3154 | Train Acc: 84.28% | Test Acc: 92.27% | F1: 92.29% ★ New Best
Epoch  2/12 | Train Loss: 0.1993 | Train Acc: 90.69% | Test Acc: 87.84% | F1: 87.83%
Epoch  3/12 | Train Loss: 0.1522 | Train Acc: 93.66% | Test Acc: 93.56% | F1: 93.58% ★ New Best
Epoch  4/12 | Train Loss: 0.1195 | Train Acc: 94.77% | Test Acc: 97.28% | F1: 97.28% ★ New Best
Epoch  5/12 | Train Loss: 0.0848 | Train Acc: 96.26% | Test Acc: 97.21% | F1: 97.21%
Epoch  6/12 | Train Loss: 0.0686 | Train Acc: 97.47% | Test Acc: 97.71% | F1: 97.71% ★ New Best
Epoch  7/12 | Train Loss: 0.0627 | Train Acc: 97.53% | Test Acc: 96.85% | F1: 96.84%
Epoch  8/12 | Train Loss: 0.0660 | Train Acc: 97.71% | Test Acc: 97.50% | F1: 97.50%
Epoch  9/12 | Train Loss: 0.0424 | Train Acc: 98.55% | Test Acc: 98.71% | F1: 98.71% ★ New Best
Epoch 10/12 | Train Loss: 0.0466 | Train Acc: 98.28% | Test Acc: 98.21% | F1: 98.21%
Epoch 11/12 | Train Loss: 0.0370 | Train Acc: 98.39% | Test Acc: 96.35% | F1: 96.34%
Epoch 12/12 | Train Loss: 0.0375 | Train Acc: 98.71% | Test Acc: 97.07% | F1: 97.07%
```

</details>
- Fork the repository
- Create a feature branch (`git checkout -b feature/improvement`)
- Commit changes (`git commit -am 'Add new feature'`)
- Push to branch (`git push origin feature/improvement`)
- Open a Pull Request
This project is licensed under the MIT License — see the LICENSE file for details.
- Ali Abroudoust
- Morteza Mohasebati
- R(2+1)D paper by Tran et al. (CVPR 2018)
- Kinetics-400 by DeepMind
- PyTorch team for pretrained video models
- Kaggle community for public fall detection datasets
Built with ❤️ and PyTorch