Mingyang Ma
A Transformer-Based Network With Feature Complementary Fusion for Crack Defect Detection
Ma, Mingyang; Yang, Lei; Liu, Yanhong; Yu, Hongnian
Abstract
Pavement crack detection is challenging because of the intricate texture structures of cracks and the complex environments in which they occur. In recent years, advances in deep learning have prompted a surge in Convolutional Neural Network (CNN)-based methods for pavement crack detection. While CNNs have achieved remarkable results in crack detection tasks, they primarily excel at capturing local details with limited receptive fields, which can be insufficient for grasping global contextual information. Given the intricate nature of crack textures, it is imperative to leverage both global and local features for accurate detection. To address this issue, a transformer-based network with feature complementary fusion, referred to as TFCF-Net, is introduced, which combines Transformer and CNN architectures. The proposed TFCF-Net model prioritizes the Transformer branch for feature encoding, considering its strength in extracting global features, while the CNN branch serves as an auxiliary encoding branch that plays a complementary role in local feature extraction. TFCF-Net uses global features as a foundation and iteratively refines them using local features, thus facilitating precise crack detection. This design enables the proposed network to comprehensively capture both global and local information while fusing the two types of information based on the distinctive characteristics of cracks. To ensure effective fusion of global and local information, an Information Complementary Fusion (ICF) module is presented, which efficiently merges the outputs of both encoding branches. To further optimize the fused information, a multi-dimensional attention (MA) module is proposed, which enhances the model’s ability to capture long-range dependencies by optimizing information from multiple dimensions.
Additionally, to improve the quality of input features on the decoding side, a multi-dimensional attention feature representation (MAFR) module is proposed, which expands the receptive field of the deepest semantic information, enabling the extraction of multi-scale feature representations. This paper evaluates the proposed TFCF-Net against state-of-the-art (SOTA) models on three publicly available pavement crack datasets. Experimental results demonstrate the superior performance of the proposed TFCF-Net.
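The idea of using global features as a foundation and iteratively refining them with local features can be illustrated with a small sketch. The abstract does not describe the internals of the ICF module, so the gating rule below and the names `complementary_fuse` and `iterative_refine` are hypothetical and chosen only for illustration; the sketch operates on plain per-channel feature vectors and is not the authors' implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def complementary_fuse(global_feat, local_feat):
    """Refine a global feature vector with a local one, channel by channel.

    Assumed gating rule (not taken from the paper): each channel starts from
    the global value and moves toward the local value by a gate in (0, 1)
    computed from the difference between the two branches.
    """
    if len(global_feat) != len(local_feat):
        raise ValueError("both branches must produce the same number of channels")
    fused = []
    for g, l in zip(global_feat, local_feat):
        gate = sigmoid(l - g)             # how strongly the local branch corrects
        fused.append(g + gate * (l - g))  # global base plus gated local correction
    return fused


def iterative_refine(global_feat, local_stages):
    """Apply the fusion once per encoder stage, reusing the running result."""
    feat = list(global_feat)
    for local_feat in local_stages:
        feat = complementary_fuse(feat, local_feat)
    return feat


if __name__ == "__main__":
    global_feat = [0.0, 1.0, -0.5]                         # stand-in for Transformer-branch output
    local_stages = [[2.0, 1.0, 0.5], [1.0, 1.0, 0.0]]      # stand-ins for CNN-branch stages
    print(iterative_refine(global_feat, local_stages))
```

In this toy rule a channel on which the two branches already agree is left unchanged, and every fused value lies between the global and local inputs, so the global representation stays the base and the local branch only adjusts it.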
Citation
Ma, M., Yang, L., Liu, Y., & Yu, H. (2024). A Transformer-Based Network With Feature Complementary Fusion for Crack Defect Detection. IEEE Transactions on Intelligent Transportation Systems, 25(11), 16989-17006. https://doi.org/10.1109/tits.2024.3421331
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Jun 26, 2024 |
Online Publication Date | Jul 12, 2024 |
Publication Date | 2024 |
Deposit Date | Aug 8, 2024 |
Journal | IEEE Transactions on Intelligent Transportation Systems |
Print ISSN | 1524-9050 |
Electronic ISSN | 1558-0016 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 25 |
Issue | 11 |
Pages | 16989-17006 |
DOI | https://doi.org/10.1109/tits.2024.3421331 |
Keywords | CNN, transformer, feature complementary fusion, crack detection |