Traffic Sign Detection with YOLO

This model is a proof of concept designed to detect and classify United States traffic signs in real time to support drivers, Advanced Driver Assistance Systems (ADAS), and autonomous vehicles. The project compares three YOLO object detection architectures: YOLOv8 (baseline), YOLOv11 (architectural improvements), and YOLOv26 (which introduces NMS-free prediction). All models were fine-tuned from pretrained YOLO weights using a custom dataset of 13,921 images across 21 classes. To improve performance in real-world driving conditions, training data included augmentations such as Blur, Motion Blur, and Camera Gain. The models were trained on an NVIDIA RTX 3080 GPU for 100 epochs with a batch size of 32 using an 82/12/6 train/validation/test split. The goal of this project is to evaluate detection performance across YOLO architectures while demonstrating a traffic sign detection system for real-world driving scenarios.


Training Data

Data Collection Methodology

The dataset was created by combining multiple publicly available traffic sign datasets from Roboflow Universe. The LISA dataset served as the baseline for class naming, and the class set was reduced from 47 to 21 based on the most commonly observed traffic signs across the collected datasets.

Annotation Process

The images in the combined datasets were already pre-annotated with tight bounding boxes around the traffic signs. The main modification was standardizing the class names: using the LISA dataset as the baseline, the original 47 classes were reduced to 21 classes representing the most commonly observed U.S. traffic signs, and existing annotations were reviewed and relabeled to match the new label set.

Dataset

Source Datasets:

New Custom Dataset:

  • 8,189 non-augmented images + 5,732 augmented images (13,921 total)
  • 21 Classes

Class Distribution

Class               Train  Valid  Test  Total
stop                 2442    402   198   3042
pedestrianCrossing   1428    254   117   1799
speedLimit35         1110    158    62   1330
speedLimit25         1084    137    71   1292
speedLimit45          496     61    28    585
speedLimit30          470     51    25    546
yield                 308     57    25    390
speedLimit40          302     38    20    360
speedLimit20          312     30    14    356
speedLimit65          276     34     9    319
speedLimit50          212     16    14    242
speedLimit15          210     16     4    230
speedLimit55          160      4     4    168
speedLimit70          114      6     2    122
doNotEnter             94      5     2    101
speedLimit80           82      4     0     86
speedLimit60           78      4     3     85
noLeftTurn             68      9     4     81
speedLimit75           66      0     0     66
noRightTurn            46      3     0     49
speedLimit85           46      3     0     49

Dataset Split

The dataset was divided into training, validation, and test sets using an 82% / 12% / 6% split:

  • Train: 11,415 images
  • Validation: 1,670 images
  • Test: 836 images
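The split sizes above follow from applying the 82%/12% ratios to the 13,921 total images and assigning the remainder to the test set. A minimal sketch of that arithmetic (the helper name and rounding scheme are illustrative assumptions, not taken from the project):

```python
def split_counts(n_images: int, train_frac: float = 0.82, val_frac: float = 0.12):
    """Allocate train/val counts by truncation; the remainder becomes the test set."""
    train = int(n_images * train_frac)
    val = int(n_images * val_frac)
    test = n_images - train - val
    return train, val, test

print(split_counts(13921))  # (11415, 1670, 836) -- matches the counts above
```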

Augmentation

  • Blur (2x)
  • Motion Blur (100px)
  • Camera Gain (0.05)

Augmentations were applied to simulate real-world driving conditions such as motion and camera exposure changes that may occur in ADAS or autonomous driving systems.
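These augmentations are applied during dataset generation, but conceptually they are simple pixel-level operations. A toy sketch on 1-D rows of grayscale values (the function names and the exact gain formula are illustrative assumptions, not the tool's implementation):

```python
def apply_gain(row, gain=0.05):
    """Scale intensities by (1 + gain), clipping to the 8-bit range."""
    return [min(255, round(p * (1 + gain))) for p in row]

def motion_blur(row, k=3):
    """Average each pixel with its k-1 neighbors to the right (valid positions only)."""
    return [round(sum(row[i:i + k]) / k) for i in range(len(row) - k + 1)]

print(apply_gain([100, 200, 250]))         # [105, 210, 255] (250 clips at 255)
print(motion_blur([0, 100, 200, 0], k=2))  # [50, 150, 100]
```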


Training Configuration

  • Hardware: NVIDIA RTX 3080 10GB
  • Architecture: YOLOv8s, YOLOv11s, YOLOv26s
  • Batch Size: 32
  • Epochs: 100
  • Imgsz: 640
  • Patience: 50
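With the Ultralytics API, the configuration above corresponds roughly to the following call (the dataset YAML path is a hypothetical placeholder, and pretrained weight filenames vary by release):

```python
from ultralytics import YOLO

# Load pretrained small-variant weights (e.g. yolov8s.pt for YOLOv8s);
# repeat with the corresponding weights for the other architectures.
model = YOLO("yolov8s.pt")

# Fine-tune with the settings listed above.
model.train(
    data="traffic_signs.yaml",  # hypothetical dataset config path
    epochs=100,
    batch=32,
    imgsz=640,
    patience=50,  # early stopping if validation does not improve
)
```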

Evaluation Results

Model Performance Comparison

The table below shows the average class performance metrics for each YOLO model. All three models perform similarly across evaluation metrics, with only small differences between architectures.

Model      Precision (P)  Recall (R)  mAP50  mAP50-95
YOLOv8s    0.921          0.938       0.945  0.772
YOLOv11s   0.894          0.973       0.948  0.777
YOLOv26s   0.919          0.922       0.951  0.781
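As a rough cross-check on the table, the single-point F1 implied by each model's average precision and recall can be computed directly (this is just the harmonic mean of the two tabled values, not the peak of the F1-confidence curve, so it only approximates the curves discussed later):

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Precision/recall pairs from the comparison table above.
for name, (p, r) in {
    "YOLOv8s": (0.921, 0.938),
    "YOLOv11s": (0.894, 0.973),
    "YOLOv26s": (0.919, 0.922),
}.items():
    print(f"{name}: {f1(p, r):.3f}")  # all land near 0.92-0.93
```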

Model Comparison

The three YOLO models show very similar overall performance, with only small differences across evaluation metrics. All models achieve high precision, recall, and mAP, indicating strong detection capability for the selected traffic sign classes. YOLOv11 performs slightly better overall, while YOLOv26 achieves nearly identical performance despite using an NMS-free prediction approach. YOLOv8 performs marginally lower but remains within a small margin of error.

F1 Curve

F1_Curve

The F1-confidence curves show strong balance between precision and recall across all three models. The peak F1 score for all classes reaches approximately 0.92, indicating high overall detection performance, with all three models performing very similarly. Most classes maintain high F1 scores across confidence thresholds. However, the speedLimit55 (green) and speedLimit20 (brown) classes show noticeably lower curves. This likely reflects fewer training examples for these classes and the visual similarity between speed limit signs, where only the number differs, making them more difficult for the model to distinguish.

Confusion Matrix

matrix

The confusion matrices show strong classification performance across all three models, with most predictions falling along the diagonal. This indicates that the majority of traffic sign classes are correctly classified, with very few misclassifications. The three models display very similar patterns, suggesting consistent class recognition across architectures. Minor confusion occurs between some speed limit classes, where predictions occasionally fall into adjacent speed categories.
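Reading the diagonal works the same way numerically: with rows as true classes and columns as predictions, per-class recall is the diagonal entry divided by the row sum. A toy two-class sketch with invented counts, purely for illustration:

```python
def per_class_recall(cm):
    """cm[i][j] = count of true class i predicted as class j."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]

# Hypothetical counts for two easily confused classes,
# e.g. speedLimit50 vs speedLimit55.
cm = [[95, 5],
      [10, 90]]
print(per_class_recall(cm))  # [0.95, 0.9]
```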

Sample Images

images

The sample predictions demonstrate that all three models can accurately detect and localize traffic signs across a variety of road environments. Bounding boxes are placed correctly around most signs, and detection performance is generally consistent across the three models, although YOLOv26 occasionally struggles to detect the speedLimit55 sign compared to the other architectures. Overall, the predictions show strong and reliable detection performance across different road scenes and lighting conditions.

Real Time Video

Upscaling Comparison (1080p, 1440p, 2160p)

The upscaling video uses a 0.25 confidence threshold on YOLOv26 and shows that increasing the image resolution can help the model detect smaller or distant traffic signs by making important visual features more visible. However, some false positives or misclassifications may still occur when the original image lacks clear detail.
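The intuition is that upscaling gives a distant sign more pixels for the detector to work with, although interpolation cannot add detail the original frame lacks. A toy nearest-neighbor upscale illustrates the area gain (real pipelines would use interpolation such as bilinear; the function name is an assumption):

```python
def upscale_nn(img, factor):
    """Nearest-neighbor upscale of a 2-D grid of pixel values."""
    out = []
    for row in img:
        wide = [px for px in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

# A 1x2 patch becomes 2x4: the "sign" pixel (9) now covers 4x the area.
print(upscale_nn([[9, 0]], 2))  # [[9, 9, 0, 0], [9, 9, 0, 0]]
```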

v11 (TTA) vs v26 (non-TTA)

This video compares detection performance between YOLOv11 using Test Time Augmentation (TTA) and YOLOv26 without TTA. The TTA comparison uses a 0.30 confidence threshold at 2160p. While TTA evaluates multiple augmented versions of each frame to improve detection robustness, it also introduced additional false positives in this example, occasionally predicting traffic signs on background objects.
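Conceptually, TTA runs the model on several augmented views of each frame and merges the outputs (in Ultralytics it is enabled with `augment=True` at predict time). A simplified classification-score sketch, where the only augmentation is a horizontal flip and merging is a plain average (all names here are illustrative, not the library's internals):

```python
def tta_scores(predict, image):
    """Average class scores over the original image and its horizontal flip."""
    original = predict(image)
    flipped = predict(image[::-1])  # horizontal flip of a 1-D "image"
    return [(a + b) / 2 for a, b in zip(original, flipped)]

# Dummy model whose scores depend on the first and last pixel.
predict = lambda img: [img[0] / 255, img[-1] / 255]
print(tta_scores(predict, [255, 0]))  # [0.5, 0.5]
```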


Limitations

Known Failure Cases

The model may struggle when traffic signs are small, distant, or partially occluded within the frame. In some scenarios, the model may also produce false positives by incorrectly detecting traffic signs on background objects such as trees or other environmental features.

Poor Performing Classes

Certain speed limit classes show slightly lower performance compared to other traffic signs. In particular, speedLimit55 and speedLimit20 demonstrate lower F1 scores and occasional misclassifications. This may be due to fewer training examples as well as the visual similarity between speed limit signs, where only the number differs.

Data Biases

The dataset is composed of multiple publicly available traffic sign datasets and primarily contains images of U.S. traffic signs. As a result, the model may be biased toward U.S. signage and may not generalize well to traffic signs from other countries or regions.

Environmental Limitations

Model performance may degrade under challenging environmental conditions such as poor lighting. Background objects and natural features in the environment can also occasionally trigger false positives.

Sample Size Limitations

Although the dataset contains over 10,000 images, some traffic sign classes contain significantly fewer training samples than others. This class imbalance can affect model performance and contribute to lower detection accuracy for certain classes.
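The imbalance is large: per the class distribution table, the most frequent class (stop, 3,042 images) outnumbers the rarest (noRightTurn and speedLimit85, 49 each) by roughly 62x. One common mitigation, not reported as used in this project, is inverse-frequency class weighting:

```python
# Totals taken from the class distribution table (subset shown).
totals = {"stop": 3042, "pedestrianCrossing": 1799,
          "speedLimit55": 168, "speedLimit85": 49}

print(round(totals["stop"] / totals["speedLimit85"]))  # ~62x imbalance

# Inverse-frequency weights, normalized so the most common class has weight 1.0;
# rare classes are upweighted proportionally during loss computation.
weights = {cls: totals["stop"] / n for cls, n in totals.items()}
```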

Inappropriate Use Cases

This model should not be used for safety-critical real-world driving systems without extensive validation and testing. The model is intended as a proof of concept for traffic sign detection and may not meet the reliability standards required for deployment in autonomous vehicles or production ADAS systems.
