A Transformer-Based Network With Feature Complementary Fusion for Crack Defect Detection

Ma, Mingyang; Yang, Lei; Liu, Yanhong; Yu, Hongnian

Authors

Mingyang Ma

Lei Yang

Yanhong Liu

Hongnian Yu

Abstract

Pavement crack detection is challenging because cracks have intricate texture structures and appear in complex environments. In recent years, advances in deep learning have prompted a surge in Convolutional Neural Network (CNN)-based methods for pavement crack detection. Although CNNs have achieved remarkable results in crack detection, they excel mainly at capturing local details within limited receptive fields, which can be insufficient for grasping global contextual information. Given the intricate nature of crack textures, accurate detection requires both global and local features. To address this issue, a transformer-based network with feature complementary fusion, referred to as TFCF-Net, is introduced, which combines Transformer and CNN architectures. The proposed TFCF-Net uses the Transformer branch as the primary feature encoder, given its strength in extracting global features, while the CNN branch serves as an auxiliary encoding branch that plays a complementary role in local feature extraction. TFCF-Net takes global features as a foundation and iteratively refines them with local features, thus facilitating precise crack detection. This design enables the proposed network to capture both global and local information comprehensively and to fuse the two according to the distinctive characteristics of cracks. To ensure effective fusion of global and local information, an Information Complementary Fusion (ICF) module is presented, which efficiently merges the outputs of both encoding branches. To further optimize the fused information, a multi-dimensional attention (MA) module is proposed and embedded into the network, which enhances the model’s ability to capture long-range dependencies by optimizing information across multiple dimensions. Additionally, to improve the quality of input features on the decoding side, a multi-dimensional attention feature representation (MAFR) module is proposed, which expands the receptive field of the deepest semantic information, enabling the extraction of multi-scale feature representations. This paper rigorously evaluates the proposed TFCF-Net against state-of-the-art (SOTA) models on three publicly available pavement crack datasets. Experimental results demonstrate the superior performance of the proposed TFCF-Net.
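The contrast the abstract draws between local and global features can be illustrated with a toy one-dimensional sketch. This is not the TFCF-Net implementation: the function names, the smoothing kernel, the scalar self-attention, and the gated fusion rule below are all assumptions chosen for illustration, and the hypothetical `fuse` function is only a stand-in for the idea of using global features as a base and refining them with local ones, not the paper's ICF module.

```python
import math


def local_features(x, kernel=(0.25, 0.5, 0.25)):
    """1D convolution with zero padding (assumed kernel).

    Each output depends only on len(kernel) neighbouring inputs,
    i.e. it has a limited receptive field.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def global_features(x):
    """Single-head self-attention over scalar tokens (illustrative only).

    Each output is a softmax-weighted mix of every input position,
    so it carries global context.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        out.append(sum(e / total * v for e, v in zip(exps, x)))
    return out


def fuse(global_feat, local_feat):
    """Hypothetical fusion rule: keep the global feature as the base
    and add a sigmoid-gated local correction to it."""
    fused = []
    for g, l in zip(global_feat, local_feat):
        gate = 1.0 / (1.0 + math.exp(-(g + l)))
        fused.append(g + gate * l)
    return fused


# Toy "crack response" along a line of pixels (made-up values).
signal = [0.0, 0.1, 0.9, 1.0, 0.2, 0.0, 0.0, 0.8]
fused = fuse(global_features(signal), local_features(signal))
```

In this sketch, changing the last element of `signal` leaves `local_features(signal)[0]` unchanged but alters `global_features(signal)[0]`, which is the receptive-field limitation that motivates pairing a CNN branch with a Transformer branch.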

Citation

Ma, M., Yang, L., Liu, Y., & Yu, H. (2024). A Transformer-Based Network With Feature Complementary Fusion for Crack Defect Detection. IEEE Transactions on Intelligent Transportation Systems, 25(11), 16989-17006. https://doi.org/10.1109/tits.2024.3421331

Journal Article Type: Article
Acceptance Date: Jun 26, 2024
Online Publication Date: Jul 12, 2024
Publication Date: 2024
Deposit Date: Aug 8, 2024
Journal: IEEE Transactions on Intelligent Transportation Systems
Print ISSN: 1524-9050
Electronic ISSN: 1558-0016
Publisher: Institute of Electrical and Electronics Engineers
Peer Reviewed: Peer Reviewed
Volume: 25
Issue: 11
Pages: 16989-17006
DOI: https://doi.org/10.1109/tits.2024.3421331
Keywords: CNN, transformer, feature complementary fusion, crack detection