Bridges, as essential infrastructure in modern cities, play a critical role in road transportation and are vital to urban development and economic prosperity [1]. However, as bridges age and are exposed to external environmental factors, various defects may develop in their structures, such as cracks, seepage, leakage, and honeycombing. These defects not only compromise the structural integrity of bridges but also pose a serious threat to urban traffic safety, potentially leading to bridge collapse and traffic accidents [2].
In recent years, the rapid development of deep learning has driven significant progress in object detection, whose core task is to identify and locate target objects in images or videos. Compared to traditional computer vision algorithms, deep learning methods based on Convolutional Neural Networks (CNNs) automatically learn feature representations, greatly improving the accuracy and robustness of object detection [3]. Object detection technology has been widely applied across disciplines, and its application in bridge defect detection holds great potential.
Although CNNs demonstrate strong performance in object detection, successful model training typically relies on large amounts of high-quality, fully annotated data [4]. In particular, preparing bounding-box annotations for object detection tasks is a time-consuming and labor-intensive process. To address this challenge, researchers often use existing large public datasets, such as ImageNet [5], MS COCO [6], and PASCAL VOC [7], for model development and training. These datasets cover a wide range of object categories and scenes, providing rich resources for general object detection. However, their applicability to bridge defect detection is limited. Bridge surfaces often have complex structures and details, with defects manifesting as cracks, spalling, corrosion, exposed rebar, and other fine features. These defects have delicate visual characteristics, irregular distributions, and blurred boundaries, and their appearance varies with lighting conditions. This complexity and diversity are difficult to capture fully with existing public datasets.
Therefore, it is particularly necessary to create a high-quality dataset specifically for bridge defect detection, covering multiple defect types and sampled under different environmental conditions. Precise annotation should not only include bounding boxes but also record detailed features to improve the model’s detection accuracy and adaptability in this field [8].
In the existing literature, publicly available datasets for bridge defect detection are relatively scarce, especially those specifically designed for object detection tasks. Most existing datasets are intended for image classification rather than precise object localization. According to reference [9], as of 2022, a total of 86 image datasets had been used in bridge inspection-related research, but only 26 of them were publicly accessible. In light of this, this paper selects and analyzes several representative and widely recognized high-quality datasets.
In 2017, study [10] released the ‘GAPS’ (German Asphalt Pavement Distress) dataset, a small dataset for road surface defect classification, primarily aimed at detecting common distresses on asphalt pavement. The dataset contains 1,969 sub-images, cropped to 64 × 64 from originals with a resolution of 1920 × 1080, and covers defect types such as cracks, potholes, patches, and joints. In the same year, study [11] constructed the ‘CSSC’ (Concrete Structure Spalling and Crack) dataset, a medium-sized dataset for classifying concrete structure defects, focusing on identifying cracks and spalling on concrete surfaces. The dataset consists of 278 spalling images and 954 crack images, which were further cropped to generate 37,523 sub-images in four classes: crack, non-crack, spalling, and non-spalling. The sub-images were extracted from the high-resolution originals using a sliding-window technique, providing abundant high-quality training samples for deep learning models.
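The sliding-window cropping used by datasets such as GAPS and CSSC can be sketched as follows. The 64 × 64 patch size matches the GAPS sub-images described above; the non-overlapping stride is an illustrative assumption, since the original papers may use overlapping windows:

```python
# Sketch of sliding-window sub-image extraction: turning a few
# high-resolution photographs into many small training patches.
# Patch size and stride are illustrative assumptions.

def sliding_window_patches(width, height, patch=64, stride=64):
    """Yield (x, y) top-left corners of patch-by-patch crops that fit
    entirely inside a width-by-height image."""
    for y in range(0, height - patch + 1, stride):
        for x in range(0, width - patch + 1, stride):
            yield (x, y)

# A 1920 x 1080 frame tiled into non-overlapping 64 x 64 patches:
corners = list(sliding_window_patches(1920, 1080))
# 1920 / 64 = 30 columns; 16 full 64-pixel rows fit in 1080 -> 480 patches
print(len(corners))  # 480
```

This also makes the trade-off noted later in this section concrete: each patch sees only a 64 × 64 neighborhood, so the broader structural context of the defect is lost.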
In addition, study [12] introduced the ‘CSD’ (Cambridge Bridge Inspection Dataset), a small-scale bridge defect classification dataset focused on identifying concrete defects. The dataset contains 1,028 images with a resolution of 299 × 299, including 691 images of healthy concrete and 337 images with defects; the defect types include cracks, graffiti, moss, and surface roughness. Subsequently, in 2018, study [13] released the ‘SDNET2018’ dataset, built from 230 images of cracked and uncracked concrete surfaces captured with a 16-megapixel Nikon digital camera. The images cover 54 bridge decks, 72 walls, and 104 sidewalks. After segmentation, a total of 56,092 samples with a resolution of 256 × 256 were generated, including 8,484 crack images and 47,608 non-crack images, primarily used for training classification models.
In 2019, study [14] released the ‘MCDS’ (Multi-Classifier for Reinforced Concrete Bridge Defects) dataset, a medium-scale image classification dataset focused on reinforced concrete bridge defects. The dataset contains 3,607 images. Although the image resolutions are not explicitly specified, the content covers a variety of common structural defects, including concrete cracks, efflorescence, spalling, exposed reinforcement, corrosion, and surface scaling. It also includes defect-free images as control samples to improve the model’s ability to recognize normal structures. Additionally, study [15] constructed the ‘BiNet’ dataset for multi-label classification tasks in bridge defect detection. This dataset contains 3,588 images with varying resolutions, such as 455 × 186, 97 × 108, 123 × 131, and 153 × 68. The images cover four common types of concrete structural damage: cracks (1,330 images), spalling (240 images), corrosion (961 images), and exposed reinforcement (942 images). The data were mainly collected from field inspections of highway bridges in Slovenia between 2016 and 2018.
However, the aforementioned datasets do not provide detailed defect annotations and are limited to classification tasks. In contrast, some recent datasets have begun to include annotation information. For example, in 2018, study [16] developed the ‘Road Crack Detection’ dataset, a medium-scale object detection dataset focused on road surface defects, which aims to identify various types of cracks on asphalt pavement using object detection methods. The dataset contains 9,053 images and supports training and testing with deep learning models such as YOLOv2. It includes annotations for eight types of cracks, including longitudinal, transverse, and alligator cracks, offering valuable image resources and application support for road maintenance and autonomous driving perception systems.
In 2019, study [17] released the ‘CODEBRIM’ (COncrete DEfect BRidge IMage) dataset, which includes 1,590 images collected by cameras and drones across 30 bridges. The dataset covers six categories: cracks, weathering, spalling, exposed reinforcement, corrosion, and no defect. Among them, 1,052 images are annotated, with a total of 1,323 defect bounding boxes. The images vary in resolution, including 1920 × 400, 6000 × 4000, 1732 × 2596, and 1972 × 2960, providing high diversity and practical value. Subsequently, study [18] introduced the ‘dacl1k’ dataset, which consists of 1,474 images with resolutions ranging from 245 × 336 to 5152 × 6000. It covers six label categories: cracks, efflorescence, spalling, exposed reinforcement, rust, and no damage, and includes 2,367 bounding-box annotations.
In 2021, study [19] introduced the ‘RDD2020’ (Road Damage Detection 2020) dataset, a large-scale road damage object detection dataset designed to support deep learning methods in automatic road inspection and evaluation. The dataset contains 26,336 images with resolutions of 600 × 600 and 720 × 720, featuring real-world road scenes from countries such as Japan, India, and the Czech Republic. More than 31,000 defect instances are annotated, covering categories including longitudinal cracks (D00), transverse cracks (D10), alligator cracks (D20), and potholes (D40).
In the same year, study [20] created the ‘COCO-Bridge-2021’ dataset, a small-scale object detection dataset focused on identifying critical bridge components. It is designed to locate bridge details that are prone to fatigue damage or require prioritized inspection. The dataset contains 774 images with a resolution of 300 × 300 and includes 2,483 annotated object instances covering four typical components: bearings, girder ends, gusset plate nodes, and stiffeners. The images were collected from real bridge inspection scenarios and are suitable for mainstream detection algorithms such as SSD and YOLO. Building on this, study [21] released an extended version named ‘COCO-Bridge-2021+’ in the same year, providing richer and more diverse training samples. It consists of 1,470 images, also at 300 × 300, and contains 7,283 annotated structural instances, including 1,969 bearings, 335 girder connections, 1,083 gusset plate nodes, and 3,896 stiffeners. All annotations were completed by structural experts based on actual inspection requirements, making the dataset suitable for visual analysis tasks such as bridge component recognition and structural condition assessment.
Although the aforementioned datasets provide a foundation for research in bridge defect detection, their application to object detection tasks still faces several limitations. First, the overall scale of these datasets is relatively small, with limited numbers of images, making it difficult to meet the demand of deep learning models for large-scale, high-quality samples. For example, the CSD dataset contains only 1,028 images, and while SDNET2018 expands to 56,092 images through cropping, it is derived from just 230 original images. Most images are taken from local perspectives, offering limited information density. Similarly, the GAPS and CSSC datasets primarily generate small sub-images through sliding-window cropping, which restricts image detail and fails to provide sufficient contextual information, thereby limiting the generalization ability of models.
Second, most datasets lack fine-grained defect annotations and are suitable only for image-level classification tasks, falling short of the precision and localization requirements needed for object detection models. For instance, datasets such as GAPS, CSSC, MCDS, and BiNet include various typical bridge defects like cracks, spalling, corrosion, and exposed reinforcement, but they only provide image-level labels without specifying the exact defect regions. This coarse annotation approach limits the model’s localization performance in defect recognition and hinders subsequent damage assessment and quantitative analysis.
Even datasets that provide bounding box annotations, such as CODEBRIM and dacl1k, still suffer from issues like insufficient annotation granularity, limited defect type coverage, and inconsistent image resolution. For example, CODEBRIM’s annotations mainly focus on common defect types such as cracks, spalling, and exposed reinforcement, and the images are mostly collected from a single bridge environment, lacking diverse scene conditions, which hinders the model’s adaptability to complex environments. Although the dacl1k dataset includes six types of defect annotations, the wide variation in image resolutions and inconsistent shooting standards lead to uneven training sample quality, which affects the stability and robustness of detection models. In addition, while COCO-Bridge-2021 and its enhanced version COCO-Bridge-2021+ offer object detection annotations for detailed bridge components (such as bearings, gusset plates, and stiffeners), their focus is on structural components rather than specific damage types. Therefore, they are not directly applicable to surface defect detection tasks such as cracks, spalling, seepage, and exposed reinforcement. Furthermore, the image count in this series of datasets remains relatively limited, and the capture conditions are homogeneous, lacking diversity in lighting, viewpoints, and backgrounds—factors that are critical for improving a model’s generalization ability in real-world complex scenarios.
Therefore, this paper creates the GYU-DET dataset to address the shortcomings of existing datasets. GYU-DET includes a large number of high-resolution images featuring six types of bridge defects: cracks, spalling, seepage, honeycombed surfaces, exposed rebar, and holes. The data were collected under various lighting and weather conditions to ensure diversity and complexity. Furthermore, the dataset provides precise bounding-box annotations and detailed records of defect characteristics, greatly improving model applicability and detection accuracy in object detection tasks. Through GYU-DET, research in bridge defect detection gains more comprehensive and accurate data support, enhancing model practicality and robustness.
The data in this paper capture the complexity and diversity of bridge surface defects through collection under various lighting and environmental conditions. The data collection lasted 18 months (from April 2015 to December 2016), with over 18,000 raw images collected. After strict screening, annotation, and processing, a total of 11,123 clear, feature-distinct images were retained. These images cover multiple key parts of the bridge, including piers, main beams, railings, and supports, ensuring the representativeness and completeness of the data. The dataset also spans various lighting conditions (adequate, low, and dim light) and challenging capture conditions (rainy days, distant views, debris interference, etc.), providing rich training data for improving algorithm robustness and generalization.

The data annotation strictly follows the Chinese Road and Bridge Maintenance Standards [22], classifying bridge surface defects into six categories: cracks, spalling, honeycombed surfaces, exposed rebar, seepage, and holes. To address the challenges encountered during annotation (such as ambiguous boundaries between defect types, coexistence of multiple defects, and difficulty in annotating irregular shapes), this paper established detailed annotation guidelines to ensure objectivity and consistency. The annotations use the YOLO format to facilitate training and validation for deep learning object detection tasks.
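As a concrete illustration of the YOLO annotation format mentioned above, the sketch below parses one label line into pixel coordinates. In YOLO format, each line of a per-image `.txt` file holds a class index followed by a center-normalized box. The class-index ordering shown here is a hypothetical example for illustration, not the dataset's documented ordering:

```python
# Each YOLO label line is: "<class_id> <x_center> <y_center> <width> <height>",
# with all four coordinates normalized to [0, 1] by the image dimensions.

CLASS_NAMES = ["crack", "spalling", "seepage", "honeycombed_surface",
               "exposed_rebar", "hole"]  # assumed index order (illustrative)

def parse_yolo_line(line, img_w, img_h):
    """Convert one YOLO label line to (class_name, (x_min, y_min, x_max, y_max)) in pixels."""
    cls, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    box = (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)
    return CLASS_NAMES[int(cls)], box

name, box = parse_yolo_line("0 0.5 0.5 0.25 0.1", 1920, 1080)
print(name, box)  # crack (720.0, 486.0, 1200.0, 594.0)
```

Because the format stores only normalized numbers, the same label file remains valid if the image is rescaled, which is one reason the format integrates easily into detection training pipelines.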
Testing the GYU-DET dataset with the YOLOv11n model shows that detection performance is better for visually obvious defect categories such as spalling, exposed rebar, and holes, while performance on defects with blurred boundaries and irregular shapes, such as cracks and seepage, also remains relatively good. This indicates that the proposed GYU-DET dataset not only excels in annotation accuracy and consistency but can also effectively support automatic annotation by object detection models in bridge defect detection tasks, reducing the manual annotation workload while ensuring high-quality results.
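Detection performance of the kind reported above is conventionally scored by comparing predicted and ground-truth boxes via Intersection over Union (IoU), the quantity underlying standard metrics such as mAP. A minimal sketch for axis-aligned boxes:

```python
# IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max).
# A prediction is typically counted as correct when IoU exceeds a
# threshold such as 0.5 (the basis of mAP@0.5).

def iou(a, b):
    """Return the Intersection-over-Union of boxes a and b."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 2x2 boxes overlapping in a 1x2 strip: IoU = 2 / (4 + 4 - 2) = 1/3
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))
```

Categories with blurred, irregular extents (cracks, seepage) tend to score lower under a fixed IoU threshold precisely because their ground-truth boxes are harder to delineate consistently, which is why annotation guidelines like those described above matter for fair evaluation.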
In practical terms, GYU-DET can be used as a training dataset for developing lightweight detection models integrated into mobile inspection tools, such as handheld devices or UAV-based inspection platforms, which are increasingly used by municipal road maintenance teams. It also serves as a benchmark for researchers testing new algorithms under varied lighting and environmental conditions. Furthermore, the dataset supports educational purposes, including use in civil engineering and computer vision courses to train students on real-world infrastructure inspection scenarios. Because the dataset follows a standardized annotation format (YOLO), it is easily integrated into existing deep learning pipelines, allowing engineers and researchers to focus on model innovation and defect pattern analysis. These practical applications demonstrate how GYU-DET contributes to current efforts in semi-automated bridge inspection and AI-driven infrastructure maintenance.