Research Repository

All Outputs (173)

Deep Learning-Based Receiver Design for IoT Multi-User Uplink 5G-NR System (2024)
Presentation / Conference Contribution
Gupta, A., Bishnu, A., Ratnarajah, T., Adeel, A., Hussain, A., & Sellathurai, M. (2023, December). Deep Learning-Based Receiver Design for IoT Multi-User Uplink 5G-NR System. Presented at GLOBECOM 2023 - 2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia

Designing an efficient receiver for multiple users transmitting orthogonal frequency-division multiplexing signals to the base station remains a challenging interference-limited problem in the 5G new radio (5G-NR) system. This can lead to stagnation of de...

5G-IoT Cloud based Demonstration of Real-Time Audio-Visual Speech Enhancement for Multimodal Hearing-aids (2023)
Presentation / Conference Contribution
Gupta, A., Bishnu, A., Gogate, M., Dashtipour, K., Arslan, T., Adeel, A., Hussain, A., Ratnarajah, T., & Sellathurai, M. (2023, August). 5G-IoT Cloud based Demonstration of Real-Time Audio-Visual Speech Enhancement for Multimodal Hearing-aids. Presented at Interspeech 2023, Dublin, Ireland

Over twenty percent of the world's population suffers from some form of hearing loss, making it one of the most significant public health challenges. Current hearing aids commonly amplify noise while failing to improve speech comprehension in crowde...

Application for Real-time Audio-Visual Speech Enhancement (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, August). Application for Real-time Audio-Visual Speech Enhancement. Presented at Interspeech 2023, Dublin, Ireland

This short paper demonstrates a first-of-its-kind audio-visual (AV) speech enhancement (SE) desktop application that isolates, in real time, the voice of a target speaker from noisy audio input. The deep neural network model integrated in this applic...

Solving the cocktail party problem using Multi-modal Hearing Assistive Technology Prototype (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, December). Solving the cocktail party problem using Multi-modal Hearing Assistive Technology Prototype. Presented at Acoustics 2023, Sydney, Australia

Hearing loss is a major global health problem, affecting over 1.5 billion people. The World Health Organization estimates that 83% of those who could benefit from hearing assistive devices do not use them. The limited adoption of hearin...

Resolving the Decreased Rank Attack in RPL’s IoT Networks (2023)
Presentation / Conference Contribution
Ghaleb, B., Al-Dubai, A., Hussain, A., Romdhani, I., & Jaroucheh, Z. (2023, June). Resolving the Decreased Rank Attack in RPL’s IoT Networks. Presented at 19th Annual International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT 2023), Pafos, Cyprus

The Routing Protocol for Low power and Lossy networks (RPL) has been developed by the Internet Engineering Task Force (IETF) standardization body to serve as a part of the 6LoWPAN (IPv6 over Low-Power Wireless Personal Area Networks) standard, a core...

Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, June). Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

Classical audio-visual (AV) speech enhancement (SE) and separation methods have been successful at operating in constrained environments; however, their speech quality and intelligibility improvements are significantly reduced in unconstrained real-wo...

Frequency-Domain Functional Links For Nonlinear Feedback Cancellation In Hearing Aids (2023)
Presentation / Conference Contribution
Nezamdoust, A., Gogate, M., Dashtipour, K., Hussain, A., & Comminiello, D. (2023, June). Frequency-Domain Functional Links For Nonlinear Feedback Cancellation In Hearing Aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

The problem of feedback cancellation can be seen as a function approximation task, which is often nonlinear in real-world hearing assistive technologies. Nonlinear methods adopted for this task must exhibit outstanding modeling performance and reduce...

Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings (2023)
Presentation / Conference Contribution
Chern, I., Hung, K., Chen, Y., Hussain, T., Gogate, M., Hussain, A., Tsao, Y., & Hou, J. (2023, June). Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via u...

Towards individualised speech enhancement: An SNR preference learning system for multi-modal hearing aids (2023)
Presentation / Conference Contribution
Kirton-Wingate, J., Ahmed, S., Gogate, M., Tsao, Y., & Hussain, A. (2023, June). Towards individualised speech enhancement: An SNR preference learning system for multi-modal hearing aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

Since the advent of deep learning (DL), speech enhancement (SE) models have performed well under a variety of noise conditions. However, such systems may still introduce sonic artefacts, sound unnatural, and restrict a user's ability to hear am...

Live Demonstration: Real-time Multi-modal Hearing Assistive Technology Prototype (2023)
Presentation / Conference Contribution
Gogate, M., Hussain, A., Dashtipour, K., & Hussain, A. (2023). Live Demonstration: Real-time Multi-modal Hearing Assistive Technology Prototype. In IEEE ISCAS 2023 Symposium Proceedings. https://doi.org/10.1109/iscas46773.2023.10182070

Hearing loss affects at least 1.5 billion people globally. The WHO estimates that 83% of people who could benefit from hearing aids (HAs) do not use them. Barriers to HA uptake are multifaceted but include the ineffectiveness of current HA technology in noisy envir...

Live Demonstration: Cloud-based Audio-Visual Speech Enhancement in Multimodal Hearing-aids (2023)
Presentation / Conference Contribution
Bishnu, A., Gupta, A., Gogate, M., Dashtipour, K., Arslan, T., Adeel, A., Hussain, A., Sellathurai, M., & Ratnarajah, T. (2023, May). Live Demonstration: Cloud-based Audio-Visual Speech Enhancement in Multimodal Hearing-aids. Presented at 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, California

Hearing loss is among the most serious public health problems, affecting as much as 20% of the world's population. Even cutting-edge multi-channel audio-only speech enhancement (SE) algorithms used in modern hearing aids face significant hurdles si...

AVSE Challenge: Audio-Visual Speech Enhancement Challenge (2023)
Presentation / Conference Contribution
Aldana Blanco, A. L., Valentini-Botinhao, C., Klejch, O., Gogate, M., Dashtipour, K., Hussain, A., & Bell, P. (2023, January). AVSE Challenge: Audio-Visual Speech Enhancement Challenge. Presented at 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar

Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity of improving speech intelligibility in adverse listening scenarios that are currently too chal...

A Novel Frame Structure for Cloud-Based Audio-Visual Speech Enhancement in Multimodal Hearing-aids (2022)
Presentation / Conference Contribution
Bishnu, A., Gupta, A., Gogate, M., Dashtipour, K., Adeel, A., Hussain, A., Sellathurai, M., & Ratnarajah, T. (2022, October). A Novel Frame Structure for Cloud-Based Audio-Visual Speech Enhancement in Multimodal Hearing-aids. Presented at 2022 IEEE International Conference on E-health Networking, Application & Services (HealthCom), Genoa, Italy

In this paper, we design a first-of-its-kind transceiver (PHY-layer) prototype for cloud-based audio-visual (AV) speech enhancement (SE) that complies with the high-data-rate and low-latency requirements of future multimodal hearing assistive technology. The...

Towards real-time privacy-preserving audio-visual speech enhancement (2022)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2022, September). Towards real-time privacy-preserving audio-visual speech enhancement. Presented at 2nd Symposium on Security and Privacy in Speech Communication, Incheon, Korea

The human auditory cortex in everyday noisy situations is known to exploit aural and visual cues that are contextually combined by the brain’s multi-level integration strategies to selectively suppress the background noise and focus on the target speaker...

A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning (2022)
Presentation / Conference Contribution
Hussain, T., Diyan, M., Gogate, M., Dashtipour, K., Adeel, A., Tsao, Y., & Hussain, A. (2022, July). A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning. Presented at 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, Scotland

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals. Despite improving speech quality, su...

Towards intelligibility-oriented audio-visual speech enhancement (2021)
Presentation / Conference Contribution
Hussain, T., Gogate, M., Dashtipour, K., & Hussain, A. (2021, September). Towards intelligibility-oriented audio-visual speech enhancement. Presented at The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021), Online

Existing deep learning (DL) based approaches are generally optimised to minimise the distance between clean and enhanced speech features. These often result in improved speech quality; however, they suffer from a lack of generalisation and may not deli...

An Attribute Weight Estimation Using Particle Swarm Optimization and Machine Learning Approaches for Customer Churn Prediction (2021)
Presentation / Conference Contribution
Kanwal, S., Rashid, J., Kim, J., Nisar, M. W., Hussain, A., Batool, S., & Kanwal, R. (2021, November). An Attribute Weight Estimation Using Particle Swarm Optimization and Machine Learning Approaches for Customer Churn Prediction. Presented at 2021 International Conference on Innovative Computing (ICIC), Lahore, Pakistan

One of the most challenging problems in the telecommunications industry is customer churn prediction (CCP). Decision-makers and business experts stress that acquiring new clients is more expensive than retaining current ones. From current churn d...

Visual Speech In Real Noisy Environments (VISION): A Novel Benchmark Dataset and Deep Learning-Based Baseline System (2020)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2020, October). Visual Speech In Real Noisy Environments (VISION): A Novel Benchmark Dataset and Deep Learning-Based Baseline System. Presented at Interspeech 2020, Shanghai, China

In this paper, we present VIsual Speech In real nOisy eNvironments (VISION), a first-of-its-kind audio-visual (AV) corpus comprising 2500 utterances from 209 speakers, recorded in real noisy environments including social gatherings, streets, cafeteri...

Deep Neural Network Driven Binaural Audio Visual Speech Separation (2020)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., Bell, P., & Hussain, A. (2020, July). Deep Neural Network Driven Binaural Audio Visual Speech Separation. Presented at 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow

The central auditory pathway exploits the auditory signals and visual information sent by both ears and eyes to segregate speech from multiple competing noise sources and help disambiguate phonological ambiguity. In this study, inspired by this uni...