Zhenhong Zou
A novel multimodal fusion network based on a joint-coding model for lane line segmentation
Zou, Zhenhong; Zhang, Xinyu; Liu, Huaping; Li, Zhiwei; Hussain, Amir; Li, Jun
Abstract
There has recently been growing interest in utilizing multimodal sensors to achieve robust lane line segmentation. In this paper, we introduce a novel multimodal fusion architecture from an information theory perspective and demonstrate its practical utility using Light Detection and Ranging (LiDAR)-camera fusion networks. In particular, we develop, for the first time, a multimodal fusion network as a joint coding model, where each single node, layer, and pipeline is represented as a channel. Forward propagation thus corresponds to information transmission in the channels, which lets us analyze the effect of different fusion approaches both qualitatively and quantitatively. We argue that the optimal fusion architecture is related to the essential capacity and its allocation based on the source and channel. To test this multimodal fusion hypothesis, we progressively determine a series of multimodal models based on the proposed fusion methods and evaluate them on the KITTI and A2D2 datasets. Our optimal fusion network achieves over 85% lane line accuracy and over 98.7% overall accuracy. The performance gap among the models will inform future research into the development of optimal fusion algorithms for the deep multimodal learning community.
Citation
Zou, Z., Zhang, X., Liu, H., Li, Z., Hussain, A., & Li, J. (2022). A novel multimodal fusion network based on a joint-coding model for lane line segmentation. Information Fusion, 80, 167-178. https://doi.org/10.1016/j.inffus.2021.10.008
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 31, 2021 |
Online Publication Date | Nov 13, 2021 |
Publication Date | 2022-04 |
Deposit Date | Jan 5, 2022 |
Journal | Information Fusion |
Print ISSN | 1566-2535 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 80 |
Pages | 167-178 |
DOI | https://doi.org/10.1016/j.inffus.2021.10.008 |
Keywords | Multimodal fusion, Information theory, Lane line segmentation, Semantic segmentation, Neural Network |
Public URL | http://researchrepository.napier.ac.uk/Output/2830147 |