Cable Conduit Defect Recognition Algorithm Based on Improved YOLOv8

Kong, Fanfang; Zhang, Yi; Zhan, Lulin; He, Yuling; Zheng, Hai; Dai, Derui

doi:10.3390/electronics13132427


by Fanfang Kong ¹, Yi Zhang ¹, Lulin Zhan ², Yuling He ³,*, Hai Zheng ³ and Derui Dai ³

¹ Wenzhou Power Supply Company, State Grid Zhejiang Electric Power Co., Ltd., Wenzhou 325000, China
² Wenzhou Power Construction Co., Ltd., Wenzhou 325000, China
³ Hebei Engineering Research Center for Advanced Manufacturing & Intelligent Operation and Maintenance of Electric Power Machinery, North China Electric Power University, Baoding 071003, China
* Author to whom correspondence should be addressed.

Electronics 2024, 13(13), 2427; https://doi.org/10.3390/electronics13132427

Submission received: 1 June 2024 / Revised: 19 June 2024 / Accepted: 20 June 2024 / Published: 21 June 2024

(This article belongs to the Special Issue Image and Video Processing Based on Deep Learning)


Abstract:

The underground cable conduit system, a vital component of urban power transmission and distribution infrastructure, faces challenges in maintenance and residue detection. Traditional detection methods, such as Closed-Circuit Television (CCTV) inspection, rely heavily on the expertise and prior experience of professional inspectors, which makes the results time-consuming to obtain and subjective. To address these issues and automate defect detection in underground cable conduits, this paper proposes a defect recognition algorithm based on an enhanced YOLOv8 model. Firstly, we replace the Spatial Pyramid Pooling-Fast (SPPF) module in the original model with the Atrous Spatial Pyramid Pooling (ASPP) module to capture multi-scale defect features effectively. Secondly, to enhance feature representation and reduce noise interference, we integrate the Convolutional Block Attention Module (CBAM) into the detection head. Finally, we enhance the YOLOv8 backbone network by replacing the C2f module with the base module of ShuffleNet V2, reducing the number of model parameters and improving model efficiency. Experimental results demonstrate the efficacy of the proposed algorithm in recognizing pipe misalignment and residual foreign objects. The precision and mean average precision (mAP) reach 96.2% and 97.6%, respectively, representing improvements over the original YOLOv8 model. This study improves the capability of capturing and characterizing defect features, thereby enhancing the maintenance efficiency and accuracy of underground cable conduit systems.

Keywords:

underground cable conduit; defect identification; target detection; YOLOv8

1. Introduction

High-voltage cable conduit installation has seen widespread adoption in cable routes due to its benefits of easy construction, cost-effectiveness, and minimal impact on subsequent maintenance [1,2].

However, underground cable conduits often encounter issues such as misalignment and the accumulation of gravel and other foreign materials during construction. The process of dragging cables during installation poses a risk to the integrity of the cable insulation layer, potentially leading to underground accidents during grid operation. To address these challenges, pipeline robots have been developed and deployed for defect detection within the conduits. Currently, the prevailing method for defect detection involves personnel manually inspecting pipeline interiors by capturing images with these robots. Consequently, the automation level of defect identification remains inadequate. There is an urgent need to develop an efficient and reliable pipeline defect identification algorithm. Given that cable conduits are typically narrow, the hardware used for inspection is limited in capacity. Therefore, it is crucial to optimize the algorithm model to reduce its complexity while maintaining high accuracy, ensuring it can be deployed effectively within the hardware constraints.

In earlier research, some scholars employed traditional machine learning techniques, such as those based on morphological, geometric, and surface texture features, to detect and diagnose defects [3,4,5]. With the advancement of science and technology, machine learning has profoundly influenced the fields of experimental solid mechanics and industrial surface defect detection, driving the progress of related technologies [6,7]. However, in recent years, rapid advancements in computer vision and artificial intelligence have led to the emergence of deep learning-based image recognition methods, which have proven to be potent tools for surface defect detection and improving detection processes [8].

For instance, Kumar et al. [9] utilized convolutional neural networks (CNNs) for defect recognition in underground drainage pipes, achieving an average testing precision of 86.2%. Qi Li et al. [10] employed a CNN with variously sized convolutional kernels and pooling layers to classify and recognize a two-dimensional matrix converted from a time series, achieving a precision of 98.67%.

Among the many target detection algorithms, the YOLO series of single-stage detection models has shown promising results in defect detection [11,12,13]. Lv et al. [14] reduced the model size of YOLOv7 by replacing conventional convolutional blocks with lightweight modules and added attention mechanisms and SPD convolutional modules, demonstrating high performance in strip steel surface defect detection tasks. Xu et al. [15] enhanced YOLOv5 by integrating attention mechanisms and improved loss and activation functions to improve small target detection, achieving a recognition precision of 92.2% for welding defects inside pipelines, which is 9% higher than the original model. Additionally, Yin et al. [16] proposed the VIASP defect identification algorithm based on YOLOv3, which extracts key information from video to achieve automatic defect marking and output an evaluation report.

Compared to other target detection algorithms, YOLOv8 showcases exceptional detection performance and robust generalization capabilities. These attributes render it highly suitable for tackling intricate defect detection scenarios encountered in underground cable conduits. Several scholars have enhanced the performance of YOLOv8 across various tasks by modifying modules and refining the structure [17,18,19].

In this study, we propose a defect recognition algorithm tailored for real-world underground cable conduit scenarios based on an improved YOLOv8 model. This paper further enhances the backbone network and detection head components from the original model. The constructed cable conduit dataset is utilized for both training and testing purposes. Experimental results demonstrate the model’s effectiveness in detecting misalignment and foreign object defects in cable conduits. The main improvements are summarized as follows:

(1): Defects in underground cable conduits differ widely in scale. To address this, we employ the Atrous Spatial Pyramid Pooling (ASPP) module to replace the original Spatial Pyramid Pooling-Fast (SPPF) module. This strengthens the model’s ability to extract features across different scales, thereby improving its capability of detecting multi-scale targets.
(2): Given the low-light conditions during video acquisition in underground cable conduits and the high noise levels in collected data due to the narrow and unstable environment, we incorporate the Convolutional Block Attention Module (CBAM) mechanism. This mechanism mitigates noise interference, enabling the model to focus more on key pipeline defect areas, thereby enhancing feature extraction and learning capabilities.
(3): To mitigate the increase in model parameters resulting from the aforementioned enhancements and facilitate easier deployment, we replace the C2f module in the backbone network with the basic module of ShuffleNet V2. This reduction in model parameters does not significantly impact detection precision, making the model easier to deploy.

2. Related Work

2.1. YOLOv8 Algorithm

As part of the YOLO series [20], the YOLOv8 target detection network enhances accuracy, efficiency, and robustness compared to its predecessors. The YOLOv8 network architecture comprises four components: the Input layer (Input), the Backbone network (Backbone), the feature fusion layer (Neck), and the Detection layer (Head) [21].

The Input layer preprocesses the image, resizing it to the fixed input dimensions expected by the model. The Backbone network extracts semantic and spatial features from the input image and forwards them to the subsequent layers for target detection. The feature fusion layer incorporates the C2f module, the upsample layer, and the Concat module, and fuses feature maps of different scales into a richer feature representation that improves model performance. In the Detection layer, the Decoupled Head structure (Decou-Head) separates the classification and bounding-box regression tasks, employing distinct loss functions tailored to each task. Additionally, Anchor-Free techniques are utilized in the sample matching process, eliminating the need for anchor boxes when determining positive and negative samples and thus enhancing model detection speed.

2.2. Improved YOLOv8 Network Model Construction

To effectively extract features from defects of varying scales within the complex underground cable conduit system, this paper employs the Atrous Spatial Pyramid Pooling (ASPP) module. This approach expands the receptive field to capture multi-scale features more comprehensively, leveraging global information to enhance model accuracy with only a marginal increase in computational overhead.

Additionally, the CBAM attention mechanism is integrated into the detection head to enhance feature extraction from both channel and spatial dimensions, mitigating external noise interference and improving model generalization.

To enable real-time and accurate identification of cable conduit obstacles for timely cleanup by relevant authorities, it is imperative to reduce model complexity and computational overhead during runtime. The YOLOv8 model’s introduction of the C2f module, along with the incorporation of ASPP and CBAM attention mechanisms, inevitably increases computational demands. Thus, this paper proposes enhancing the YOLOv8 backbone network by adopting base modules from the ShuffleNet V2 [22] architecture to reduce model parameters and expedite recognition.

Subsequent subsections in this section will delve into the working principles and technical intricacies of each module. The structure of the improved model network is illustrated in Figure 1.

2.2.1. Atrous Spatial Pyramid Pooling

The Atrous Spatial Pyramid Pooling (ASPP) module, originally designed for image semantic segmentation tasks, is employed in this paper to enhance target detection by replacing the Spatial Pyramid Pooling-Fast (SPPF) module in YOLOv8. Unlike traditional pooling operations, ASPP increases the receptive field without downsampling, thereby improving the model’s ability to detect and recognize targets.

ASPP conducts multi-scale convolutional operations on input feature maps using convolution kernels with varying dilation rates, merging information from different scales. This approach enhances the network’s capacity to perceive targets and comprehend semantics, thereby improving the model’s ability to detect targets across various scales.

The decision to replace the SPPF module with ASPP in YOLOv8 is primarily motivated by ASPP’s advantage in capturing multi-scale information, which aligns well with the complex scenarios encountered in underground cable conduit defects. This adaptability ensures improved accuracy and robustness in detecting targets of different scales.

The ASPP module, illustrated in Figure 2, initiates by applying multi-scale atrous convolution to the input feature map. By defining different dilation rates “R”, it facilitates free multi-scale feature extraction, enabling the model to concurrently consider both small- and large-scale characteristics. Subsequently, a global pooling operation is executed on the input feature map to capture its global information. Following the acquisition of features at each scale, a concatenation (Concat) operation is performed on them along the channel dimension to generate a more comprehensive feature representation. To diminish feature dimensionality and reduce computational load, a pointwise convolution is employed to conduct dimensionality reduction on the merged features, ultimately yielding the final feature map.
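The receptive-field effect of the dilation rate can be checked with simple arithmetic. The sketch below is illustrative only: the dilation rates (1, 6, 12, 18) and the 20 × 20 feature-map size are assumptions borrowed from common ASPP configurations, not values reported in this paper.

```python
def effective_kernel(kernel: int, rate: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (r - 1)."""
    return kernel + (kernel - 1) * (rate - 1)


def conv_output_size(size: int, kernel: int, rate: int, padding: int, stride: int = 1) -> int:
    """Standard convolution output-size formula, with the dilated kernel span."""
    return (size + 2 * padding - effective_kernel(kernel, rate)) // stride + 1


# Assumed dilation rates and feature-map size, for illustration only.
RATES = (1, 6, 12, 18)
FEATURE_SIZE = 20

for r in RATES:
    span = effective_kernel(3, r)
    # Padding equal to the dilation rate keeps a 3 x 3 branch at the input
    # resolution, so all branches can be concatenated along the channel axis.
    out = conv_output_size(FEATURE_SIZE, 3, r, padding=r)
    print(f"rate={r:2d}  effective kernel={span}x{span}  output={out}x{out}")
```

Each branch keeps the same nine weights per filter while covering a wider area as the rate grows, which is why the receptive field expands without any downsampling.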

2.2.2. Convolutional Block Attention Module

CBAM (Convolutional Block Attention Module) is a convolutional neural network attention mechanism that integrates both channel attention and spatial attention. This module dynamically learns channel and spatial information within input feature maps to enhance network performance. Compared to other attention mechanisms, CBAM often yields superior results. The realization flow of CBAM is illustrated in Figure 3.

The CBAM process consists of two sequential steps, as shown in Figure 3. First, in the channel attention module, the input feature map undergoes global average pooling and global max pooling over the spatial dimensions. The two pooled descriptors pass through a shared two-layer perceptron, and their sum is activated to give a weight for each channel. Second, in the spatial attention module, the channel-refined feature map undergoes average pooling and max pooling along the channel dimension. The two resulting maps are concatenated and passed through a 7 × 7 convolution to ascertain the importance of each spatial position within the feature map. The channel and spatial attention weights are applied in turn by element-wise multiplication to obtain the final feature representation.

The channel attention module can be expressed as

\begin{aligned} M_{c}(F) &= \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))) \\ &= \sigma(W_{1}(W_{0}(F_{avg}^{c})) + W_{1}(W_{0}(F_{max}^{c}))) \end{aligned}

(1)

where σ represents the sigmoid activation function, F is the input feature map, W₀ and W₁ represent two convolution operations, and F_{avg}^{c} and F_{max}^{c} represent the average-pooled and max-pooled features, respectively.

The spatial attention module can be written as

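\begin{aligned} M_{s}(F) &= \sigma(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])) \\ &= \sigma(f^{7 \times 7}([F_{avg}^{s}; F_{max}^{s}])) \end{aligned}

(2)

where f^{7×7} represents the convolution operation with a convolution kernel size of 7 × 7, and F_{avg}^{s} and F_{max}^{s} represent the feature maps obtained by average pooling and max pooling along the channel dimension, respectively.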

Ultimately, the outputs of the channel attention module and the spatial attention module are multiplied to yield a weighted feature map, as depicted in Equation (3). This operation enables the network to prioritize essential channels and regions more effectively, enhancing its focus on critical aspects of the data.

\begin{aligned} F^{\prime} &= M_{c}(F) \otimes F \\ F^{\prime\prime} &= M_{s}(F^{\prime}) \otimes F^{\prime} \end{aligned}

(3)

where F′ and F″ represent the output feature maps after channel attention and spatial attention, respectively, and ⊗ represents element-wise multiplication.
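Equations (1)–(3) can be traced on a toy input with plain Python lists. This is a minimal sketch under stated assumptions, not the implementation used in this paper: the weights `W0` and `W1`, the all-zero 7 × 7 kernel, and the 2 × 3 × 3 input are hypothetical values chosen so the result is easy to check by hand, and the ReLU between `W0` and `W1` follows the original CBAM design rather than anything stated here.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def channel_attention(F, W0, W1):
    """Eq. (1): sigma(MLP(AvgPool(F)) + MLP(MaxPool(F))), one weight per channel."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in F]
    mx = [max(map(max, ch)) for ch in F]

    def mlp(v):
        # ReLU between the two layers is an assumption (original CBAM design).
        hidden = [max(0.0, h) for h in matvec(W0, v)]
        return matvec(W1, hidden)

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(F, kernel):
    """Eq. (2): sigma(conv7x7([AvgPool(F); MaxPool(F)])), one weight per pixel."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(F[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    pooled = [avg, mx]
    k = len(kernel[0])   # kernel[p][u][v]; p indexes the two pooled maps
    pad = k // 2         # zero padding keeps the H x W resolution
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for p in range(2):
                for u in range(k):
                    for v in range(k):
                        y, x = i + u - pad, j + v - pad
                        if 0 <= y < H and 0 <= x < W:
                            acc += kernel[p][u][v] * pooled[p][y][x]
            out[i][j] = sigmoid(acc)
    return out


def cbam(F, W0, W1, kernel):
    """Eq. (3): F' = Mc(F) * F, then F'' = Ms(F') * F'."""
    mc = channel_attention(F, W0, W1)
    F1 = [[[mc[c] * x for x in row] for row in ch] for c, ch in enumerate(F)]
    ms = spatial_attention(F1, kernel)
    return [[[ms[i][j] * x for j, x in enumerate(row)] for i, row in enumerate(ch)]
            for ch in F1]


# Toy input: 2 channels of 3 x 3 with hand-picked weights (all hypothetical).
F = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
     [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]]
W0 = [[0.1, 0.1]]        # 2 channels -> 1 hidden unit
W1 = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channels
kernel = [[[0.0] * 7 for _ in range(7)] for _ in range(2)]
out = cbam(F, W0, W1, kernel)
```

With an all-zero kernel the spatial weights are 0.5 everywhere, so any difference between the two output channels comes from the channel attention alone. A trained kernel would instead emphasize the positions where defects appear.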

This paper introduces the CBAM attention mechanism into the detection head to enhance the model’s capability to discern and localize defects within underground cable conduits. By incorporating this mechanism, the model can concentrate more effectively on the defect area, thereby enhancing sensitivity in detecting small defects. Additionally, it helps to suppress background noise and decrease the false detection rate, leading to reliable and stable defect recognition in the intricate underground cable conduit environment.

2.2.3. ShuffleNet

To make the model lightweight, this paper replaces the C2f module in the YOLOv8 backbone network with the base module from ShuffleNet V2. ShuffleNet V2 is a convolutional neural network architecture designed for efficient computation and a compact parameter count. Its fundamental concept revolves around reducing computational complexity and model size by leveraging depthwise separable convolution and channel shuffle techniques.

Depthwise separable convolution decomposes the convolution operation into two sequential steps: depthwise convolution and pointwise convolution. This decomposition significantly reduces the number of parameters and the computational cost. Channel shuffle, in turn, enhances the model’s expressive power by grouping input channels and recombining them after convolution within each group. This process facilitates cross-channel information exchange and feature reorganization, contributing to improved model performance.
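The parameter saving from splitting a convolution into a per-channel spatial step and a 1 × 1 pointwise step can be checked by counting weights. The sketch below ignores bias terms, and the layer sizes (3 × 3 kernel, 64 input channels, 128 output channels) are assumed for illustration rather than taken from the model in this paper.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise step followed by a 1 x 1 pointwise step."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Assumed layer sizes, for illustration only.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}x")
```

For these assumed sizes the count drops from 73,728 to 8,768 weights, roughly an eightfold reduction.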

These design principles enable the model to maintain high accuracy while exhibiting a smaller model size and faster inference speed. This characteristic makes ShuffleNet V2 suitable for resource-constrained environments and mobile deployments.

The basic module of ShuffleNet V2, illustrated in Figure 4a, comprises two branches. In the left branch, a 3 × 3 depthwise convolution is applied to the input feature map, followed by a 1 × 1 pointwise convolution. The right branch applies a depthwise convolution together with two 1 × 1 pointwise convolutions. The outputs of the two branches are then concatenated (Concat) along the channel dimension, followed by group convolution with channel shuffling.

After the group convolution of the input feature map, the channels of the resulting feature matrix are divided into groups and reordered. The feature map obtained through this channel shuffling effectively integrates information across different channels, as illustrated in Figure 4b.
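The reordering performed by channel shuffling can be shown on channel indices alone. The sketch below uses a common formulation: view the channel list as a (groups, channels per group) grid, transpose it, and flatten it again. The six channels and two groups are assumed for illustration.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so each output group mixes channels from every input group."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Read the (groups, per_group) grid column by column instead of row by row.
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


# Channels 0-2 belong to group 0 and channels 3-5 to group 1 (assumed sizes).
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
print(shuffled)  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, adjacent channels come from different groups, so the next grouped operation sees features from both branches.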

3. Experimentation and Analysis

3.1. Data Collection

The cable conduit dataset utilized in this study originates from a test site constructed specifically for this purpose. The cable conduits are fabricated from fiberglass material and come in three inner diameters: 175 mm, 225 mm, and 250 mm. This variety aligns with real-world engineering application scenarios. Various misalignment conditions, foreign body positions, and light intensities were simulated during the placement process, ensuring that the ratio of misalignment images to foreign body images remained balanced. In total, 510 images were collected for the dataset.

3.2. Dataset Construction

Considering the limited quantity of original data collected and aiming to bolster the robustness of the improved algorithm, data augmentation techniques are applied to the original dataset.

The data augmentation process includes several techniques applied with specific probabilities: vertical and horizontal flipping occurs with a 50% probability; brightness adjustment randomly varies between 80% and 120% with a 70% probability; random grid rearrangement divides the image into 3 × 3 grids and rearranges them with a 30% probability; color jittering adjusts contrast and saturation between 80% and 120% with a 20% probability; and piecewise affine transformation distorts the image with a 10% probability, mimicking real-world image distortions encountered in practice. Through this process, the cable conduit dataset is expanded to 1145 images. The augmentation techniques are applied uniformly to all images, ensuring no bias towards any specific type of defect image. Consequently, the proportion of foreign matter and misalignment defects remains approximately equal in the augmented dataset. The effect of these data augmentation techniques is illustrated in Figure 5.
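The probabilities listed above can be read as an independent draw per technique for each image. The sketch below only samples which operations would be applied and with what parameters; it does not transform any pixels. The operation names, the treatment of vertical and horizontal flips as two independent 50% draws, and the fixed seed are assumptions made for illustration.

```python
import random


def sample_augmentations(rng: random.Random):
    """Return the list of (operation, parameter) pairs drawn for one image."""
    ops = []
    if rng.random() < 0.5:
        ops.append(("vertical_flip", None))
    if rng.random() < 0.5:
        ops.append(("horizontal_flip", None))
    if rng.random() < 0.7:
        ops.append(("brightness", rng.uniform(0.8, 1.2)))   # scale factor
    if rng.random() < 0.3:
        order = list(range(9))                               # cells of a 3 x 3 grid
        rng.shuffle(order)
        ops.append(("grid_shuffle_3x3", order))
    if rng.random() < 0.2:
        # (contrast factor, saturation factor)
        ops.append(("color_jitter", (rng.uniform(0.8, 1.2), rng.uniform(0.8, 1.2))))
    if rng.random() < 0.1:
        ops.append(("piecewise_affine", None))
    return ops


rng = random.Random(0)   # fixed seed so the sketch is repeatable
plans = [sample_augmentations(rng) for _ in range(10_000)]
share = sum(any(name == "brightness" for name, _ in p) for p in plans) / len(plans)
print(f"brightness adjustment drawn for {share:.1%} of samples")
```

Because every image goes through the same draws regardless of its defect type, the class balance of the original data carries over to the augmented set on average.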

The enhanced dataset is annotated using the LabelImg image annotation software. For each annotated image, a corresponding text file is generated, containing information about the types of targets present in the image along with their bounding box positions and sizes. A total of 1912 valid object labels are obtained through annotation, comprising 724 contaminant labels and 1188 misalignment labels.

The labeled images are then randomly divided into training and validation sets at a ratio of 9:1 for model training and validation purposes. Additionally, a separate test set comprising 80 unlabeled images is collected. Together, these datasets constitute the cable conduit defect recognition dataset, as outlined in Table 1.
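A random 9:1 split of this kind can be sketched as follows. The file stems, the seed, and the floor rounding are assumptions, so the resulting counts may differ slightly from those reported in Table 1.

```python
import random


def split_train_val(items, train_parts=9, val_parts=1, seed=0):
    """Shuffle the items and cut them at the train_parts : val_parts boundary."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_parts // (train_parts + val_parts)
    return shuffled[:n_train], shuffled[n_train:]


image_ids = [f"img_{i:04d}" for i in range(1145)]   # hypothetical file stems
train, val = split_train_val(image_ids)
print(len(train), len(val))
```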

3.3. Experimental Deployment Environment

The model utilized in this paper is built upon the PyTorch deep learning framework. The hardware and software environments for conducting the experiments are as follows: Windows 10 operating system, 13th Gen Intel(R) Core(TM) i5-13600KF @3.5GHz CPU, RTX 4070 12G GPU, PyTorch version 2.1.2, and CUDA version 11.8.

3.4. Evaluation Metrics

To quantitatively assess the model’s performance in this paper, three common target detection evaluation metrics, precision, recall, and mean average precision (mAP), are employed.

As illustrated in Figure 6, true positive (TP) represents the number of samples predicted to be positive cases that are indeed positive cases; false positive (FP) denotes the number of samples predicted to be positive cases that are, in reality, negative cases; true negative (TN) signifies the number of samples predicted to be negative cases that are indeed negative cases; and false negative (FN) indicates the number of samples predicted as negative cases that are, in fact, positive cases.

Precision, defined as the ratio of correct positive predictions (TPs) to all positive detections (TPs plus FPs), measures how many of the model’s detections are actual defects, as depicted in Equation (4).

\mathrm{Precision} = \frac{TP}{TP + FP}

(4)

Recall, defined as the ratio of correct positive predictions to all actual positive samples, quantifies the model’s capability to identify all actual defect samples. A higher recall signifies that the model misses fewer real defects. The calculation is as follows:

\mathrm{Recall} = \frac{TP}{TP + FN}

(5)
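Equations (4) and (5) translate directly into code. The counts below are hypothetical and serve only to show the calculation.

```python
def precision(tp: int, fp: int) -> float:
    """Eq. (4): share of positive detections that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Eq. (5): share of actual positives that are detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed defects.
tp, fp, fn = 90, 10, 30
print(f"precision = {precision(tp, fp):.3f}, recall = {recall(tp, fn):.3f}")
```

Note that TN does not appear in either formula, so neither metric rewards the model for the large number of background regions it correctly ignores.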

The mAP is calculated from precision and recall, as shown in Equations (6) and (7).

AP = \int_{0}^{1} P(r) \, dr

(6)

mAP = \frac{\sum_{i=1}^{n} AP(i)}{n}

(7)

(7)

Here, P(r) is the precision at recall r, AP(i) is the average precision of the i-th class, and n is the number of target classes.
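In practice the integral in Equation (6) is approximated from a finite set of precision and recall points. The sketch below uses all-point interpolation, a common convention in detection benchmarks; this choice is an assumption, since the paper does not state which approximation was used, and the curve values are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under the P-R curve (Eq. 6) using all-point interpolation.

    `recalls` must be sorted in ascending order, with `precisions` aligned to it.
    """
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((mrec[i + 1] - mrec[i]) * mpre[i + 1] for i in range(len(mrec) - 1))


def mean_average_precision(ap_per_class):
    """Eq. (7): mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical curves for the two defect classes.
ap_misalignment = average_precision([0.5, 1.0], [1.0, 0.5])
ap_foreign_object = average_precision([1.0], [1.0])
print(mean_average_precision([ap_misalignment, ap_foreign_object]))
```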

3.5. Model Training and Prediction

Once the platform construction and organization of the cable conduit defect dataset were completed, formal model training commenced. Considering the computational platform’s capabilities, the number of training epochs was set to 200. Following training, the model’s performance was evaluated by running predictions on the test set. Throughout the training process, the loss value decreased consistently as training progressed. Convergence was determined when the validation loss value ceased to decrease, as depicted in Figure 7. The performance of the model is evaluated more comprehensively through the P-R curve and confusion matrix, as shown in Figure 8 and Figure 9. The Precision–Recall (P-R) diagram illustrates the trade-off between the precision and recall of the model at various thresholds, reflecting the model’s performance across different confidence levels. The confusion matrix provides a detailed evaluation of the model’s classification performance by analyzing specific misclassifications using four indicators: true positive, true negative, false positive, and false negative.
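The convergence rule, that the validation loss has ceased to decrease, can be made concrete with a patience-based check. This is only one possible reading: the `patience` and `min_delta` parameters and the loss values are assumptions, as the paper does not specify a numerical criterion.

```python
def first_converged_epoch(val_losses, patience=10, min_delta=1e-4):
    """Return the 1-based epoch at which the validation loss has gone `patience`
    consecutive epochs without improving by more than `min_delta`, else None."""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if best - loss > min_delta:
            best = loss       # new best value, reset the counter
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Hypothetical validation losses that flatten after the fourth epoch.
losses = [1.0, 0.8, 0.6, 0.5, 0.5, 0.5, 0.5]
print(first_converged_epoch(losses, patience=3))  # 7
```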

The loss value curve depicted in Figure 7 illustrates a rapid decline in the model’s loss value within the initial 40 epochs, followed by stabilization after 190 epochs. This smooth decrease in the overall loss value curve indicates the model’s strong convergence performance. As shown in Figure 8 and Figure 9, the improved model demonstrates excellent performance in detecting misalignments and contaminants in cable conduits.

Furthermore, Figure 10 visualizes the prediction results of different models on various test sets. Marked sections in the figure indicate defect category, defect location, and prediction confidence. Defect locations are outlined with boxes of varying colors, followed by their categorization and corresponding confidence levels. These results highlight the improved model’s superior overall prediction efficacy.

3.6. Ablation Experiment Analysis

In this study, enhancements are made to both the backbone network and the detection head component of the original model. To validate the effectiveness of these improvements, ablation experiments are conducted on the three enhancement schemes proposed in this paper. The experimental results detailing the impact of different enhancement strategies on the model’s performance are presented in Table 2.

As shown in Table 2, incorporating ASPP into the YOLOv8 model increases mean average precision by 1.1%, 1.3%, and 0.7% compared to the original model, YOLOv8-Shuffle, and YOLOv8-CBAM, respectively. This result underscores the efficacy of the ASPP module in augmenting the model’s feature extraction capacity for multi-scale targets, thereby improving defect detection performance. Furthermore, integrating the CBAM attention mechanism into the detection head enhances the model’s ability to discern defects within the pipeline by suppressing noise and irrelevant environmental information. In addition, lightweighting the backbone network by replacing C2f with the ShuffleNet V2 base module significantly reduces model size, which confirms the practicality of this modification and makes the model easier to deploy. When all three improvements are applied together, the model surpasses the original model across all metrics, achieving the highest mean average precision of 97.6%.

These experimental findings underscore the effectiveness of the enhanced underground cable conduit defect detection algorithm, offering substantial advancements in detection capabilities. This provides robust support for the realization of more accurate and efficient cable conduit inspections.

3.7. Comparisons of Different Attention Mechanism Modules

To determine which attention mechanism provides the best detection performance in this study, we used YOLOv8 as the baseline network and inserted the SE, CA, and CBAM attention modules at the same location for comparison. The detection results of each module on the cable conduit defect dataset are presented in Table 3. The network incorporating the CBAM module performs best in identifying cable conduit defects, achieving an AP value of 94.2%. The network using the CA module shows moderate overall performance in defect detection, though it increases the number of model parameters. The network using the SE module shows minimal improvement in detection performance over the original YOLOv8. These differences follow from how each module works. SE attends only to the channel dimension and lacks spatial feature information. The CA attention mechanism calculates the attention weight across the entire feature map, resulting in significant computational overhead. In contrast, the CBAM attention mechanism enhances the model’s ability to capture crucial features by modeling channel and spatial attention together. This approach maintains the integrity of the feature map’s positional and spatial information while effectively capturing the positional information of defect features. Consequently, CBAM enables the model to accurately identify defects such as misalignment and foreign objects in the cable conduit, thereby enhancing overall performance. Overall, the results indicate that integrating an attention module into the YOLOv8 network is an effective approach for this defect detection task.

3.8. Comparison of Detection Capabilities of Different Models

To further validate the superiority of our model, we trained the traditional convolutional neural network algorithm (Faster R-CNN), the YOLOv5 algorithm, and the original YOLOv8 algorithm on the cable conduit defect dataset and compared them with our model. The comparative evaluation primarily includes average precision (AP), mean average precision (mAP), and model inference speed (FPS), as presented in Table 4.

Analysis of the results reveals that our improved model surpasses the original models of Faster R-CNN, YOLOv5, and YOLOv8 in both AP and mAP metrics while also exhibiting higher FPS, enabling efficient completion of the recognition task. In conclusion, our enhanced model proves to be highly effective and outperforms the other three algorithm recognition models in identifying defects in cable conduits.

4. Conclusions

Timely inspection of newly installed cable conduits is crucial for detecting misalignment and accumulated foreign matter, which can lead to cable breakage and subsequent safety hazards.

Traditional detection methods reliant on manual image inspection are inefficient and prone to subjective interpretation; hence, this paper proposes an enhanced algorithm based on YOLOv8 for identifying defects in underground cable conduits. The algorithm integrates three key improvements, namely the ASPP module, the CBAM attention mechanism, and the ShuffleNet V2 lightweight module, to train the model for automated defect detection in urban cable infrastructure. Experimental results demonstrate the efficacy of the proposed model, which achieves a mean average precision of 97.6% on the dataset used.

The model’s ability to perform real-time video detection facilitates its practical application in real-world scenarios, offering efficient and precise identification and localization of pipeline defects without relying on manual labor. Nonetheless, the limited scope of the dataset used in this study highlights the need for future research to expand and enhance both the quality and quantity of data, thereby improving the model’s generalizability.

Author Contributions

Conceptualization, F.K., Y.H. and Y.Z.; methodology, F.K.; software, F.K.; validation, F.K., Y.Z. and L.Z.; formal analysis, F.K.; investigation, Y.Z.; resources, Y.Z.; data curation, Y.H.; writing—original draft preparation, F.K., H.Z. and D.D.; writing—review and editing, F.K. and L.Z.; visualization, H.Z.; supervision, Y.Z.; project administration, Y.Z.; funding acquisition, F.K. and Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Wenzhou Tusheng Holding Group Co., Ltd. Science and Technology project (CF058807002022007).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Authors Fanfang Kong and Yi Zhang are employed by Wenzhou Power Supply Company, State Grid Zhejiang Electric Power Co., Ltd. Author Lulin Zhan is employed by Wenzhou Power Construction Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Figure 1. Architecture of the improved YOLOv8 model proposed in this paper.

Figure 2. Detailed overview of the ASPP (Atrous Spatial Pyramid Pooling) architecture.

Figure 3. Schematic diagram of CBAM attention mechanism.

Figure 4. The ShuffleNet V2 and channel shuffle schematic diagram. (a) Basic module; (b) channel shuffle.

Figure 5. Data augmentation.

Figure 6. Evaluation index confusion matrix.

Figure 7. Loss curves during the training process.

Figure 8. P-R curve of the improved YOLOv8.

Figure 9. Confusion matrix of the improved YOLOv8.

Figure 10. Detection visualization results on the cable conduit dataset: (a) Faster R-CNN; (b) YOLOv5; (c) YOLOv8; (d) Improved YOLOv8.

Table 1. Cable conduit defect identification dataset.

Dataset	Number of Images	Contaminant Tags	Misalignment Tags
Training set	1030	650	1072
Validation set	115	74	116
Test set	80	-	-

Table 2. Comparative results of ablation experiments.

Model	Average Precision (AP)	Mean Average Precision (mAP)	Model Size (MB)
YOLOv8	0.931	0.943	6.37
YOLOv8 + ASPP	0.946	0.954	7.21
YOLOv8 + ShuffleNet	0.930	0.941	4.89
YOLOv8 + CBAM	0.942	0.947	6.41
Improved YOLOv8	0.962	0.976	5.87
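
The ablation rows above are easier to compare as changes relative to the baseline. The following is a minimal sketch, not part of the authors' code, that copies the values from Table 2 and reports each variant's gain in percentage points and its relative change in model size. The names `ABLATION` and `deltas_vs_baseline` are hypothetical helpers introduced here for illustration.

```python
# Rows copied from Table 2: (model, AP, mAP, model size in MB).
ABLATION = [
    ("YOLOv8", 0.931, 0.943, 6.37),
    ("YOLOv8 + ASPP", 0.946, 0.954, 7.21),
    ("YOLOv8 + ShuffleNet", 0.930, 0.941, 4.89),
    ("YOLOv8 + CBAM", 0.942, 0.947, 6.41),
    ("Improved YOLOv8", 0.962, 0.976, 5.87),
]


def deltas_vs_baseline(rows):
    """Return per-model changes relative to the first row (the baseline)."""
    _, base_ap, base_map, base_size = rows[0]
    result = {}
    for name, ap, mean_ap, size_mb in rows[1:]:
        result[name] = {
            # Accuracy changes in percentage points.
            "ap_gain_pts": round((ap - base_ap) * 100, 1),
            "map_gain_pts": round((mean_ap - base_map) * 100, 1),
            # Size change as a percentage of the baseline size.
            "size_change_pct": round((size_mb - base_size) / base_size * 100, 1),
        }
    return result


if __name__ == "__main__":
    for model, delta in deltas_vs_baseline(ABLATION).items():
        print(model, delta)
```

Read this way, ASPP and CBAM each raise accuracy while enlarging the model, the ShuffleNet backbone alone shrinks the model by roughly 23% at a cost of 0.1 to 0.2 points, and the combined model gains 3.3 mAP points while remaining about 7.8% smaller than the baseline.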

Table 3. Comparison of detection effects of different attention mechanism modules.

Model	Average Precision (AP)	Mean Average Precision (mAP)	Model Size (MB)
YOLOv8	0.931	0.943	6.37
YOLOv8 + SE	0.934	0.944	6.49
YOLOv8 + CA	0.936	0.945	7.02
YOLOv8 + CBAM	0.942	0.947	6.41

Table 4. Different detection model results.

Model	Average Precision (AP)	Mean Average Precision (mAP)	FPS
Faster R-CNN	0.823	0.833	143.6
YOLOv5	0.935	0.942	160.5
YOLOv8	0.951	0.965	165.5
Improved YOLOv8	0.962	0.976	167.5
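
To relate the FPS column to real-time video detection, throughput can be converted to average per-frame latency and compared with the interval between camera frames. The sketch below does this for the Table 4 values. The 30 frames/s camera rate (`ASSUMED_CAMERA_FPS`) is an assumption made here for illustration, not a figure from the paper, and the helper names are hypothetical.

```python
# FPS values copied from Table 4.
FPS = {
    "Faster R-CNN": 143.6,
    "YOLOv5": 160.5,
    "YOLOv8": 165.5,
    "Improved YOLOv8": 167.5,
}

# Assumption for illustration only: the inspection camera streams at 30 frames/s.
ASSUMED_CAMERA_FPS = 30.0


def latency_ms(fps):
    """Average time spent per frame, in milliseconds."""
    return 1000.0 / fps


def meets_budget(fps, camera_fps=ASSUMED_CAMERA_FPS):
    """True if one frame is processed before the next camera frame arrives."""
    return latency_ms(fps) <= latency_ms(camera_fps)


if __name__ == "__main__":
    for model, fps in FPS.items():
        print(f"{model}: {latency_ms(fps):.2f} ms/frame, "
              f"real-time at {ASSUMED_CAMERA_FPS:.0f} fps: {meets_budget(fps)}")
```

Under that assumption every model in the table fits within the roughly 33 ms frame interval, so the practical difference between them lies mainly in accuracy. Measured throughput depends on the hardware and on pre- and post-processing costs, which this arithmetic does not capture.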

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

