
Research Repository


All Outputs (19)

5G-IoT Cloud based Demonstration of Real-Time Audio-Visual Speech Enhancement for Multimodal Hearing-aids (2023)
Presentation / Conference Contribution
Gupta, A., Bishnu, A., Gogate, M., Dashtipour, K., Arslan, T., Adeel, A., Hussain, A., Ratnarajah, T., & Sellathurai, M. (2023, August). 5G-IoT Cloud based Demonstration of Real-Time Audio-Visual Speech Enhancement for Multimodal Hearing-aids. Presented at Interspeech 2023, Dublin, Ireland

Over twenty percent of the world's population suffers from some form of hearing loss, making it one of the most significant public health challenges. Current hearing aids commonly amplify noises while failing to improve speech comprehension in crowde...

Application for Real-time Audio-Visual Speech Enhancement (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, August). Application for Real-time Audio-Visual Speech Enhancement. Presented at Interspeech 2023, Dublin, Ireland

This short paper demonstrates a first of its kind audio-visual (AV) speech enhancement (SE) desktop application that isolates, in real-time, the voice of a target speaker from noisy audio input. The deep neural network model integrated in this applic...

A hybrid dependency-based approach for Urdu sentiment analysis (2023)
Journal Article
Sehar, U., Kanwal, S., Allheeib, N. I., Almari, S., Khan, F., Dashtipour, K., Gogate, M., & Khashan, O. A. (2023). A hybrid dependency-based approach for Urdu sentiment analysis. Scientific Reports, 13, Article 22075. https://doi.org/10.1038/s41598-023-48817-8

In the digital age, social media has emerged as a significant platform, generating a vast amount of raw data daily. This data reflects the opinions of individuals from diverse backgrounds, races, cultures, and age groups, spanning a wide range of top...

Intrusion Detection Systems Using Machine Learning (2024)
Book Chapter
Taylor, W., Hussain, A., Gogate, M., Dashtipour, K., & Ahmad, J. (2024). Intrusion Detection Systems Using Machine Learning. In W. Boulila, J. Ahmad, A. Koubaa, M. Driss, & I. Riadh Farah (Eds.), Decision Making and Security Risk Management for IoT Environments (75-98). Springer. https://doi.org/10.1007/978-3-031-47590-0_5

Intrusion detection systems (IDS) have developed and evolved over time to form an important component in network security. The aim of an intrusion detection system is to successfully detect intrusions within a network and to trigger alerts to system...

Fake News in Social Media: Fake News Themes and Intentional Deception in the News and on Social Media (2024)
Book Chapter
Idrees, H., Dashtipour, K., Hussain, T., & Gogate, M. (2024). Fake News in Social Media: Fake News Themes and Intentional Deception in the News and on Social Media. In W. Boulila, J. Ahmad, A. Koubaa, M. Driss, & I. Riadh Farah (Eds.), Decision Making and Security Risk Management for IoT Environments (219-229). Springer. https://doi.org/10.1007/978-3-031-47590-0_11

Since the start of the twenty-first century, online views and clicks have only increased. Within the last twenty years, that growth has become embedded in the use of social media. Within the last ten years, news can spread on social media before it is shown on t...

Solving the cocktail party problem using Multi-modal Hearing Assistive Technology Prototype (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, December). Solving the cocktail party problem using Multi-modal Hearing Assistive Technology Prototype. Presented at Acoustics 2023, Sydney, Australia

Hearing loss is a major global health problem, affecting over 1.5 billion people. According to estimations by the World Health Organization, 83% of those who could benefit from hearing assistive devices do not use them. The limited adoption of hearin...

Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, June). Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

Classical audio-visual (AV) speech enhancement (SE) and separation methods have been successful at operating under constrained environments; however, the speech quality and intelligibility improvement is significantly reduced in unconstrained real-wo...

Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings (2023)
Presentation / Conference Contribution
Chern, I., Hung, K., Chen, Y., Hussain, T., Gogate, M., Hussain, A., Tsao, Y., & Hou, J. (2023, June). Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via u...

Frequency-Domain Functional Links For Nonlinear Feedback Cancellation In Hearing Aids (2023)
Presentation / Conference Contribution
Nezamdoust, A., Gogate, M., Dashtipour, K., Hussain, A., & Comminiello, D. (2023, June). Frequency-Domain Functional Links For Nonlinear Feedback Cancellation In Hearing Aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

The problem of feedback cancellation can be seen as a function approximation task, which often is nonlinear in real-world hearing assistive technologies. Nonlinear methods adopted for this task must exhibit outstanding modeling performance and reduce...

Audio-visual speech enhancement and separation by leveraging multimodal self-supervised embeddings (2023)
Presentation / Conference Contribution
Chern, I., Hung, K., Chen, Y., Hussain, T., Gogate, M., Hussain, A., Tsao, Y., & Hou, J. (2023, June). Audio-visual speech enhancement and separation by leveraging multimodal self-supervised embeddings. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via u...

Sentiment Analysis Meets Explainable Artificial Intelligence: A Survey on Explainable Sentiment Analysis (2023)
Journal Article
Diwali, A., Saeedi, K., Dashtipour, K., Gogate, M., Cambria, E., & Hussain, A. (online). Sentiment Analysis Meets Explainable Artificial Intelligence: A Survey on Explainable Sentiment Analysis. IEEE Transactions on Affective Computing. https://doi.org/10.1109/taffc.2023.3296373

Sentiment analysis can be used to derive knowledge that is connected to emotions and opinions from textual data generated by people. As computer power has grown, and the availability of benchmark datasets has increased, deep learning models based on...

Steel surface defect detection based on self-supervised contrastive representation learning with matching metric (2023)
Journal Article
Hu, X., Yang, J., Jiang, F., Hussain, A., Dashtipour, K., & Gogate, M. (2023). Steel surface defect detection based on self-supervised contrastive representation learning with matching metric. Applied Soft Computing, 145, Article 110578. https://doi.org/10.1016/j.asoc.2023.110578

Defect detection is crucial in the quality control of industrial applications. Existing supervised methods are heavily reliant on large amounts of labeled data. However, labeled data in some specific fields are still scarce, and it requires profe...

Arabic Sentiment Analysis Based on Word Embeddings and Deep Learning (2023)
Journal Article
Elhassan, N., Varone, G., Ahmed, R., Gogate, M., Dashtipour, K., Almoamari, H., El-Affendi, M. A., Al-Tamimi, B. N., Albalwy, F., & Hussain, A. (2023). Arabic Sentiment Analysis Based on Word Embeddings and Deep Learning. Computers, 12(6), Article 126. https://doi.org/10.3390/computers12060126

Social media networks have grown exponentially over the last two decades, providing the opportunity for users of the internet to communicate and exchange ideas on a variety of topics. The outcome is that opinion mining plays a crucial role in analyzi...

Towards individualised speech enhancement: An SNR preference learning system for multi-modal hearing aids (2023)
Presentation / Conference Contribution
Kirton-Wingate, J., Ahmed, S., Gogate, M., Tsao, Y., & Hussain, A. (2023, June). Towards individualised speech enhancement: An SNR preference learning system for multi-modal hearing aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

Since the advent of deep learning (DL), speech enhancement (SE) models have performed well under a variety of noise conditions. However, such systems may still introduce sonic artefacts, sound unnatural, and restrict the ability for a user to hear am...

Live Demonstration: Real-time Multi-modal Hearing Assistive Technology Prototype (2023)
Presentation / Conference Contribution
Gogate, M., Hussain, A., Dashtipour, K., & Hussain, A. (2023). Live Demonstration: Real-time Multi-modal Hearing Assistive Technology Prototype. In IEEE ISCAS 2023 Symposium Proceedings. https://doi.org/10.1109/iscas46773.2023.10182070

Hearing loss affects at least 1.5 billion people globally. The WHO estimates 83% of people who could benefit from hearing aids do not use them. Barriers to HA uptake are multifaceted but include ineffectiveness of current HA technology in noisy envir...

Live Demonstration: Cloud-based Audio-Visual Speech Enhancement in Multimodal Hearing-aids (2023)
Presentation / Conference Contribution
Bishnu, A., Gupta, A., Gogate, M., Dashtipour, K., Arslan, T., Adeel, A., Hussain, A., Sellathurai, M., & Ratnarajah, T. (2023, May). Live Demonstration: Cloud-based Audio-Visual Speech Enhancement in Multimodal Hearing-aids. Presented at 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, California

Hearing loss is among the most serious public health problems, affecting as much as 20% of the worldwide population. Even cutting-edge multi-channel audio-only speech enhancement (SE) algorithms used in modern hearing aids face significant hurdles si...

The P vs. NP Problem and Attempts to Settle It via Perfect Graphs State-of-the-Art Approach (2023)
Presentation / Conference Contribution
Heal, M., Dashtipour, K., & Gogate, M. (2023, March). The P vs. NP Problem and Attempts to Settle It via Perfect Graphs State-of-the-Art Approach. Presented at 2023 Future of Information and Communication Conference (FICC), San Francisco, CA

The P vs. NP problem is a major problem in computer science. It is perhaps the most celebrated outstanding problem in that domain. Its solution would have a tremendous impact on different fields such as mathematics, cryptography, algorithm research,...

A Novel Hierarchical Extreme Machine-Learning-Based Approach for Linear Attenuation Coefficient Forecasting (2023)
Journal Article
Varone, G., Ieracitano, C., Çiftçioğlu, A. Ö., Hussain, T., Gogate, M., Dashtipour, K., Al-Tamimi, B. N., Almoamari, H., Akkurt, I., & Hussain, A. (2023). A Novel Hierarchical Extreme Machine-Learning-Based Approach for Linear Attenuation Coefficient Forecasting. Entropy, 25(2), Article 253. https://doi.org/10.3390/e25020253

The development of reinforced polymer composite materials has had a significant influence on the challenging problem of shielding against high-energy photons, particularly X-rays and γ-rays in industrial and healthcare facilities. Heavy materials’ sh...

AVSE Challenge: Audio-Visual Speech Enhancement Challenge (2023)
Presentation / Conference Contribution
Aldana Blanco, A. L., Valentini-Botinhao, C., Klejch, O., Gogate, M., Dashtipour, K., Hussain, A., & Bell, P. (2023, January). AVSE Challenge: Audio-Visual Speech Enhancement Challenge. Presented at 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar

Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity of improving speech intelligibility in adverse listening scenarios that are currently too chal...