This project implements a complete collision detection system using sensor fusion of 3D Lidar point clouds and 2D camera images. The system tracks vehicles in front of the ego vehicle and computes Time-to-Collision (TTC) using both Lidar and camera-based measurements from the KITTI dataset.
The pipeline combines YOLO-based object detection, keypoint-based feature tracking, and 3D-2D sensor fusion to provide robust TTC estimates for autonomous vehicle applications.
- 3D Object Detection & Tracking: Uses YOLOv3 for 2D bounding box detection and associates Lidar points with detected vehicles
- Multi-Frame Bounding Box Matching: Tracks vehicles across consecutive frames using keypoint correspondences
- Lidar-Based TTC: Computes collision time using 3D distance measurements with outlier rejection
- Camera-Based TTC: Estimates TTC from relative scale changes in keypoint matches
- Multiple Detector/Descriptor Support: Configurable feature detection (HARRIS, FAST, BRISK, ORB, AKAZE, SIFT) and description (BRIEF, ORB, FREAK, AKAZE, SIFT, BRISK)
- OpenCV 4.x (with contrib modules for xfeatures2d)
- C++11 or higher
- CMake 3.5 or higher
- KITTI dataset images and Lidar data
- YOLOv3 weights and configuration files
```bash
cd 3D-Object-Tracking
mkdir build && cd build
cmake ..
make
./CDS
```

```
3D-Object-Tracking/
├── src/
│   ├── CDS.cpp               # Main program loop
│   ├── camFusion.cpp         # TTC computation and bounding box matching
│   ├── lidarData.cpp         # Lidar point cloud processing
│   ├── objectDetection2D.cpp # YOLO-based vehicle detection
│   └── matching2D.cpp        # Keypoint detection and matching
├── dat/yolo/                 # YOLOv3 model files
├── images/KITTI/             # KITTI dataset images
└── ttc_results/              # Comprehensive TTC evaluation data
```
Implemented matchBoundingBoxes() using a multi-map structure that counts keypoint matches between all bounding box pairs. For each match, the algorithm identifies which bounding boxes (from previous and current frames) contain the matched keypoints. A 2D map bbMatchCounts[prevBoxID][currBoxID] accumulates occurrence counts. Each previous frame bounding box is then matched to the current frame bounding box with the highest correspondence count, ensuring robust tracking even with partial occlusions.
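The counting scheme above can be sketched as follows. This is a simplified illustration, not the project's actual function: it assumes each keypoint match has already been resolved to the pair of bounding-box IDs (previous frame, current frame) containing its endpoints, whereas the real `matchBoundingBoxes()` operates on OpenCV `DMatch` objects and frame data structures.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical simplified signature: boxPairs holds one (prevBoxID, currBoxID)
// entry per keypoint match. Returns the best current-frame box for each
// previous-frame box, chosen by highest correspondence count.
std::map<int, int> matchBoundingBoxesSketch(const std::vector<std::pair<int, int>> &boxPairs)
{
    // bbMatchCounts[prevBoxID][currBoxID] accumulates occurrence counts
    std::map<int, std::map<int, int>> bbMatchCounts;
    for (const auto &p : boxPairs)
        bbMatchCounts[p.first][p.second]++;

    // For each previous box, keep the current box with the highest count
    std::map<int, int> bbBestMatches;
    for (const auto &prev : bbMatchCounts)
    {
        int bestCurrID = -1, bestCount = 0;
        for (const auto &curr : prev.second)
        {
            if (curr.second > bestCount)
            {
                bestCount = curr.second;
                bestCurrID = curr.first;
            }
        }
        bbBestMatches[prev.first] = bestCurrID;
    }
    return bbBestMatches;
}
```

Because the winner is chosen per previous-frame box by majority vote, a few stray matches landing in the wrong box (e.g. from partial occlusion) do not break the association.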
Implemented computeTTCLidar() using median-based distance estimation for robust outlier handling. The algorithm extracts all valid X-coordinates (forward distances) from Lidar points in both frames, sorts them, and computes the median values rather than minimum values. This approach is statistically robust against outlier points caused by reflections, sensor noise, or vehicle geometry edges. TTC is computed using the constant velocity model: TTC = medianDistCurr * dT / (medianDistPrev - medianDistCurr).
Implemented clusterKptMatchesWithROI() with a two-pass approach. First pass computes the mean Euclidean distance between all matched keypoint pairs that lie within the bounding box ROI. Second pass filters matches, retaining only those with distances less than 1.5× the mean distance. This threshold-based outlier removal eliminates erroneous matches from background clutter or misdetected features while preserving genuine vehicle-based correspondences in the boundingBox.kptMatches vector.
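The two-pass filter reduces to the following sketch when stripped down to the per-match Euclidean distances (a hypothetical simplification; the real `clusterKptMatchesWithROI()` works on `cv::DMatch`/`cv::KeyPoint` objects and first tests ROI containment):

```cpp
#include <vector>

// Hypothetical simplified signature: dists holds the Euclidean distance
// between the two keypoints of each match already inside the ROI.
std::vector<double> filterMatchesByDistanceSketch(const std::vector<double> &dists)
{
    // First pass: mean distance over all matches within the bounding box
    double sum = 0.0;
    for (double d : dists)
        sum += d;
    double mean = dists.empty() ? 0.0 : sum / dists.size();

    // Second pass: retain only matches below 1.5x the mean distance,
    // discarding erroneous correspondences from background clutter
    std::vector<double> kept;
    for (double d : dists)
        if (d < 1.5 * mean)
            kept.push_back(d);
    return kept;
}
```

With distances {1, 1, 1, 10} the mean is 3.25 and the threshold 4.875, so the outlier match at distance 10 is dropped while the three consistent matches survive.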
Implemented computeTTCCamera() using distance ratio analysis between all keypoint pair combinations. For each pair of matched keypoints, the algorithm computes the Euclidean distance in the image plane for both current and previous frames. Distance ratios (distCurr/distPrev) are collected for all valid pairs, then sorted to extract the median ratio. Using the median (rather than mean) provides robustness against outlier matches. TTC is computed from the median distance ratio using: TTC = -dT / (1 - medianDistRatio).
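The distance-ratio computation can be sketched as below. The `Pt` struct, the simplified signature, and the minimum-distance guard of 100 px (a common noise filter for short keypoint baselines) are assumptions for illustration; the real `computeTTCCamera()` operates on `cv::KeyPoint` vectors and `cv::DMatch` lists.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Pt { double x, y; }; // hypothetical minimal keypoint position

// prevPts[i] and currPts[i] are the two observations of the same matched
// keypoint; frameRate is in Hz.
double computeTTCCameraSketch(const std::vector<Pt> &prevPts,
                              const std::vector<Pt> &currPts,
                              double frameRate)
{
    const double minDist = 100.0; // assumed guard against noise-sensitive short baselines
    std::vector<double> ratios;
    for (size_t i = 0; i < currPts.size(); ++i)
    {
        for (size_t j = i + 1; j < currPts.size(); ++j)
        {
            // Euclidean distance between the two keypoints in each frame
            double dCurr = std::hypot(currPts[i].x - currPts[j].x, currPts[i].y - currPts[j].y);
            double dPrev = std::hypot(prevPts[i].x - prevPts[j].x, prevPts[i].y - prevPts[j].y);
            if (dPrev > 1e-9 && dCurr >= minDist)
                ratios.push_back(dCurr / dPrev); // relative scale change
        }
    }
    if (ratios.empty())
        return NAN;

    // Median ratio is robust against outlier matches
    std::sort(ratios.begin(), ratios.end());
    size_t n = ratios.size();
    double medRatio = (n % 2) ? ratios[n / 2] : 0.5 * (ratios[n / 2 - 1] + ratios[n / 2]);

    double dT = 1.0 / frameRate;
    return -dT / (1.0 - medRatio); // TTC = -dT / (1 - medianDistRatio)
}
```

Intuitively, a keypoint pair 100 px apart that grows to 110 px between 10 Hz frames implies a 10% scale increase per 0.1 s, i.e. the vehicle doubles in apparent size (and halves its distance) in about one second.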
Detailed analysis in: ttc_results/Analysis_Lidar.md
Summary of Key Findings:
Out of 18 analyzed frames, 5 (27.8%) exhibited critical TTC estimation errors.
Detailed analysis in: ttc_results/Analysis_Camera.md
Summary of Detector/Descriptor Performance:
All 21 detector-descriptor combinations were evaluated using Mean Absolute Error (MAE), Median Absolute Error, and robustness metrics, and are ranked by overall accuracy and reliability.
This project uses the KITTI Vision Benchmark Suite:
- Left color camera images
- Velodyne point clouds (velodyne_points)
- Calibration data for sensor alignment
Performance analysis and detailed evaluation reports are available in:
- `feature-tracking/` - Implementation and evaluation of detectors and descriptors
- `3D-Object-Tracking/ttc_results/` - Complete Lidar and camera TTC evaluation with all detector/descriptor combinations
- This project was completed as part of the Udacity Sensor Fusion Nanodegree