
Research Repository


Outputs (33)

5G-IoT Cloud based Demonstration of Real-Time Audio-Visual Speech Enhancement for Multimodal Hearing-aids (2023)
Presentation / Conference Contribution
Gupta, A., Bishnu, A., Gogate, M., Dashtipour, K., Arslan, T., Adeel, A., Hussain, A., Ratnarajah, T., & Sellathurai, M. (2023, August). 5G-IoT Cloud based Demonstration of Real-Time Audio-Visual Speech Enhancement for Multimodal Hearing-aids. Presented at Interspeech 2023, Dublin, Ireland

Over twenty percent of the world's population suffers from some form of hearing loss, making it one of the most significant public health challenges. Current hearing aids commonly amplify noise while failing to improve speech comprehension in crowded...

Application for Real-time Audio-Visual Speech Enhancement (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, August). Application for Real-time Audio-Visual Speech Enhancement. Presented at Interspeech 2023, Dublin, Ireland

This short paper demonstrates a first-of-its-kind audio-visual (AV) speech enhancement (SE) desktop application that isolates, in real time, the voice of a target speaker from noisy audio input. The deep neural network model integrated in this application...

Solving the cocktail party problem using Multi-modal Hearing Assistive Technology Prototype (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, December). Solving the cocktail party problem using Multi-modal Hearing Assistive Technology Prototype. Presented at Acoustics 2023, Sydney, Australia

Hearing loss is a major global health problem, affecting over 1.5 billion people. According to estimates by the World Health Organization, 83% of those who could benefit from hearing assistive devices do not use them. The limited adoption of hearing...

Audio-visual speech enhancement and separation by leveraging multimodal self-supervised embeddings (2023)
Presentation / Conference Contribution
Chern, I., Hung, K., Chen, Y., Hussain, T., Gogate, M., Hussain, A., Tsao, Y., & Hou, J. (2023, June). Audio-visual speech enhancement and separation by leveraging multimodal self-supervised embeddings. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via...

Frequency-Domain Functional Links For Nonlinear Feedback Cancellation In Hearing Aids (2023)
Presentation / Conference Contribution
Nezamdoust, A., Gogate, M., Dashtipour, K., Hussain, A., & Comminiello, D. (2023, June). Frequency-Domain Functional Links For Nonlinear Feedback Cancellation In Hearing Aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

The problem of feedback cancellation can be seen as a function approximation task, which often is nonlinear in real-world hearing assistive technologies. Nonlinear methods adopted for this task must exhibit outstanding modeling performance and reduce...

Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings (2023)
Presentation / Conference Contribution
Chern, I., Hung, K., Chen, Y., Hussain, T., Gogate, M., Hussain, A., Tsao, Y., & Hou, J. (2023, June). Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via...

Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids (2023)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2023, June). Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

Classical audio-visual (AV) speech enhancement (SE) and separation methods have been successful at operating under constrained environments; however, the speech quality and intelligibility improvement is significantly reduced in unconstrained real-world...

Towards individualised speech enhancement: An SNR preference learning system for multi-modal hearing aids (2023)
Presentation / Conference Contribution
Kirton-Wingate, J., Ahmed, S., Gogate, M., Tsao, Y., & Hussain, A. (2023, June). Towards individualised speech enhancement: An SNR preference learning system for multi-modal hearing aids. Presented at 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Rhodes Island, Greece

Since the advent of deep learning (DL), speech enhancement (SE) models have performed well under a variety of noise conditions. However, such systems may still introduce sonic artefacts, sound unnatural, and restrict the ability of a user to hear ambient...

Live Demonstration: Real-time Multi-modal Hearing Assistive Technology Prototype (2023)
Presentation / Conference Contribution
Gogate, M., Hussain, A., Dashtipour, K., & Hussain, A. (2023). Live Demonstration: Real-time Multi-modal Hearing Assistive Technology Prototype. In IEEE ISCAS 2023 Symposium Proceedings. https://doi.org/10.1109/iscas46773.2023.10182070

Hearing loss affects at least 1.5 billion people globally. The WHO estimates that 83% of people who could benefit from hearing aids do not use them. Barriers to HA uptake are multifaceted but include the ineffectiveness of current HA technology in noisy environments...

Live Demonstration: Cloud-based Audio-Visual Speech Enhancement in Multimodal Hearing-aids (2023)
Presentation / Conference Contribution
Bishnu, A., Gupta, A., Gogate, M., Dashtipour, K., Arslan, T., Adeel, A., Hussain, A., Sellathurai, M., & Ratnarajah, T. (2023, May). Live Demonstration: Cloud-based Audio-Visual Speech Enhancement in Multimodal Hearing-aids. Presented at 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, California

Hearing loss is among the most serious public health problems, affecting as much as 20% of the worldwide population. Even cutting-edge multi-channel audio-only speech enhancement (SE) algorithms used in modern hearing aids face significant hurdles...

The P vs. NP Problem and Attempts to Settle It via Perfect Graphs State-of-the-Art Approach (2023)
Presentation / Conference Contribution
Heal, M., Dashtipour, K., & Gogate, M. (2023, March). The P vs. NP Problem and Attempts to Settle It via Perfect Graphs State-of-the-Art Approach. Presented at 2023 Future of Information and Communication Conference (FICC), San Francisco, CA

The P vs. NP problem is a major problem in computer science. It is perhaps the most celebrated outstanding problem in that domain. Its solution would have a tremendous impact on different fields such as mathematics, cryptography, algorithm research,...

AVSE Challenge: Audio-Visual Speech Enhancement Challenge (2023)
Presentation / Conference Contribution
Aldana Blanco, A. L., Valentini-Botinhao, C., Klejch, O., Gogate, M., Dashtipour, K., Hussain, A., & Bell, P. (2023, January). AVSE Challenge: Audio-Visual Speech Enhancement Challenge. Presented at 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar

Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity of improving speech intelligibility in adverse listening scenarios that are currently too challenging...

Formulations and Algorithms to Find Maximal and Maximum Independent Sets of Graphs (2022)
Presentation / Conference Contribution
Heal, M., Dashtipour, K., & Gogate, M. (2022, December). Formulations and Algorithms to Find Maximal and Maximum Independent Sets of Graphs. Presented at 2022 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, Nevada

We propose four algorithms to find maximal and maximum independent sets of graphs. Two of the algorithms are non-polynomial in time, namely binary programming and non-convex multi-variable polynomial programming algorithms. Two other algorithms run in...

A Novel Frame Structure for Cloud-Based Audio-Visual Speech Enhancement in Multimodal Hearing-aids (2022)
Presentation / Conference Contribution
Bishnu, A., Gupta, A., Gogate, M., Dashtipour, K., Adeel, A., Hussain, A., Sellathurai, M., & Ratnarajah, T. (2022, October). A Novel Frame Structure for Cloud-Based Audio-Visual Speech Enhancement in Multimodal Hearing-aids. Presented at 2022 IEEE International Conference on E-health Networking, Application & Services (HealthCom), Genoa, Italy

In this paper, we design a first of its kind transceiver (PHY layer) prototype for cloud-based audio-visual (AV) speech enhancement (SE) complying with high data rate and low latency requirements of future multimodal hearing assistive technology. The...

Towards real-time privacy-preserving audio-visual speech enhancement (2022)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2022, September). Towards real-time privacy-preserving audio-visual speech enhancement. Presented at 2nd Symposium on Security and Privacy in Speech Communication, Incheon, Korea

The human auditory cortex in everyday noisy situations is known to exploit aural and visual cues that are contextually combined by the brain’s multi-level integration strategies to selectively suppress the background noise and focus on the target speaker...

A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning (2022)
Presentation / Conference Contribution
Hussain, T., Diyan, M., Gogate, M., Dashtipour, K., Adeel, A., Tsao, Y., & Hussain, A. (2022, July). A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning. Presented at 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, Scotland

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals. Despite improving the speech quality, such...

Detecting Alzheimer’s Disease Using Machine Learning Methods (2022)
Presentation / Conference Contribution
Dashtipour, K., Taylor, W., Ansari, S., Zahid, A., Gogate, M., Ahmad, J., Assaleh, K., Arshad, K., Ali Imran, M., & Abbasi, Q. (2021, October). Detecting Alzheimer’s Disease Using Machine Learning Methods. Presented at 16th EAI International Conference, BODYNETS 2021, Online

As the world experiences population growth, the proportion of older people, aged 65 and above, is also growing at a faster rate. As a result, dementia with Alzheimer’s disease is expected to increase rapidly in the next few years. Currently,...

Comparing the Performance of Different Classifiers for Posture Detection (2022)
Presentation / Conference Contribution
Suresh Kumar, S., Dashtipour, K., Gogate, M., Ahmad, J., Assaleh, K., Arshad, K., Imran, M. A., Abbasi, Q., & Ahmad, W. (2021, October). Comparing the Performance of Different Classifiers for Posture Detection. Presented at 16th EAI International Conference, BODYNETS 2021, Online

Human Posture Classification (HPC) is used in many fields such as human-computer interfacing, security surveillance, rehabilitation, remote monitoring, and so on. This paper compares the performance of different classifiers in the detection of 3 postures...

Towards intelligibility-oriented audio-visual speech enhancement (2021)
Presentation / Conference Contribution
Hussain, T., Gogate, M., Dashtipour, K., & Hussain, A. (2021, September). Towards intelligibility-oriented audio-visual speech enhancement. Presented at The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021), Online

Existing deep learning (DL) based approaches are generally optimised to minimise the distance between clean and enhanced speech features. These often result in improved speech quality; however, they suffer from a lack of generalisation and may not deliver...

Visual Speech In Real Noisy Environments (VISION): A Novel Benchmark Dataset and Deep Learning-Based Baseline System. (2020)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., & Hussain, A. (2020, October). Visual Speech In Real Noisy Environments (VISION): A Novel Benchmark Dataset and Deep Learning-Based Baseline System. Presented at Interspeech 2020, Shanghai, China

In this paper, we present VIsual Speech In real nOisy eNvironments (VISION), a first-of-its-kind audio-visual (AV) corpus comprising 2500 utterances from 209 speakers, recorded in real noisy environments including social gatherings, streets, cafeterias...

Deep Neural Network Driven Binaural Audio Visual Speech Separation (2020)
Presentation / Conference Contribution
Gogate, M., Dashtipour, K., Bell, P., & Hussain, A. (2020, July). Deep Neural Network Driven Binaural Audio Visual Speech Separation. Presented at 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow

The central auditory pathway exploits the auditory signals and visual information sent by both ears and eyes to segregate speech from multiple competing noise sources and help disambiguate phonological ambiguity. In this study, inspired by this unique...

Offline Arabic Handwriting Recognition Using Deep Machine Learning: A Review of Recent Advances (2020)
Presentation / Conference Contribution
Ahmed, R., Dashtipour, K., Gogate, M., Raza, A., Zhang, R., Huang, K., Hawalah, A., Adeel, A., & Hussain, A. (2019, July). Offline Arabic Handwriting Recognition Using Deep Machine Learning: A Review of Recent Advances. Presented at 10th International Conference, BICS 2019, Guangzhou, China

In pattern recognition, automatic handwriting recognition (AHWR) is an area of research that has developed rapidly in the last few years. It can play a significant role in a broad spectrum of applications ranging from bank cheque processing to application...

Random Features and Random Neurons for Brain-Inspired Big Data Analytics (2020)
Presentation / Conference Contribution
Gogate, M., Hussain, A., & Huang, K. (2019, November). Random Features and Random Neurons for Brain-Inspired Big Data Analytics. Presented at 2019 International Conference on Data Mining Workshops (ICDMW), Beijing, China

With the explosion of Big Data, fast and frugal reasoning algorithms are increasingly needed to keep up with the size and the pace of user-generated content on the Web. In many real-time applications, it is preferable to be able to process more data...

Statistical Analysis Driven Optimized Deep Learning System for Intrusion Detection (2018)
Presentation / Conference Contribution
Ieracitano, C., Adeel, A., Gogate, M., Dashtipour, K., Morabito, F., Larijani, H., …Hussain, A. (2018). Statistical Analysis Driven Optimized Deep Learning System for Intrusion Detection. https://doi.org/10.1007/978-3-030-00563-4_74

Attackers have developed ever more sophisticated and intelligent ways to hack information and communication technology (ICT) systems. The extent of damage an individual hacker can carry out upon infiltrating a system is well understood. A potentially...

Exploiting Deep Learning for Persian Sentiment Analysis (2018)
Presentation / Conference Contribution
Dashtipour, K., Gogate, M., Adeel, A., Ieracitano, C., Larijani, H., & Hussain, A. (2018, July). Exploiting Deep Learning for Persian Sentiment Analysis. Presented at 9th International Conference, BICS 2018, Xi'an, China

The rise of social media is enabling people to freely express their opinions about products and services. The aim of sentiment analysis is to automatically determine a subject’s sentiment (e.g., positive, negative, or neutral) towards a particular aspect...

A comparative study of Persian sentiment analysis based on different feature combinations (2018)
Presentation / Conference Contribution
Dashtipour, K., Gogate, M., Adeel, A., Hussain, A., Alqarafi, A., & Durrani, T. (2017, July). A comparative study of Persian sentiment analysis based on different feature combinations. Presented at International Conference in Communications, Signal Processing, and Systems, Harbin, China

In recent years, the use of the internet and, correspondingly, the number of online reviews, comments and opinions have increased significantly. It is indeed very difficult for humans to read these opinions and classify them accurately. Consequently, there...

Toward's Arabic multi-modal sentiment analysis (2018)
Presentation / Conference Contribution
Alqarafi, A., Adeel, A., Gogate, M., Dashtipour, K., Hussain, A., & Durrani, T. (2019). Toward's Arabic multi-modal sentiment analysis. https://doi.org/10.1007/978-981-10-6571-2_290

In everyday life, people use the internet to express and share opinions, facts, and sentiments about products and services. In addition, social media applications such as Facebook, Twitter, WhatsApp, Snapchat etc., have become important information sharing...

A novel brain-inspired compression-based optimised multimodal fusion for emotion recognition (2018)
Presentation / Conference Contribution
Gogate, M., Adeel, A., & Hussain, A. (2018). A novel brain-inspired compression-based optimised multimodal fusion for emotion recognition. https://doi.org/10.1109/SSCI.2017.8285377

The curse of dimensionality is a well-established phenomenon. However, the properties of high dimensional data are often poorly understood and overlooked during the process of data modelling and analysis. Similarly, how to optimally fuse different modalities...

Deep learning driven multimodal fusion for automated deception detection (2018)
Presentation / Conference Contribution
Gogate, M., Adeel, A., & Hussain, A. (2018). Deep learning driven multimodal fusion for automated deception detection. https://doi.org/10.1109/SSCI.2017.8285382

According to the American Psychological Association, humans' ability to detect lies is no more accurate than chance. State-of-the-art deception detection methods, such as the polygraph, stem from early theories and have proven to be...

Towards Next-Generation Lip-Reading Driven Hearing-Aids: A preliminary Prototype Demo (2017)
Presentation / Conference Contribution
Adeel, A., Gogate, M., & Hussain, A. (2017, August). Towards Next-Generation Lip-Reading Driven Hearing-Aids: A preliminary Prototype Demo. Presented at 1st International Workshop on Challenges in Hearing Assistive Technology (CHAT 2017), Stockholm, Sweden

Speech enhancement aims to enhance the perceived speech quality and intelligibility in the presence of noise. Classical speech enhancement methods are mainly based on audio-only processing, which often performs poorly in adverse conditions where overwhelming...

Persian Named Entity Recognition (2017)
Presentation / Conference Contribution
Dashtipour, K., Gogate, M., Adeel, A., Algarafi, A., Howard, N., & Hussain, A. (2017). Persian Named Entity Recognition. https://doi.org/10.1109/ICCI-CC.2017.8109733

Named Entity Recognition (NER) is an important natural language processing (NLP) tool for information extraction and retrieval from unstructured texts such as newspapers, blogs and emails. NER involves processing unstructured text for classification...

Complex-valued computational model of hippocampal CA3 recurrent collaterals (2017)
Presentation / Conference Contribution
Shiva, A., Gogate, M., Howard, N., Graham, B., & Hussain, A. (2017). Complex-valued computational model of hippocampal CA3 recurrent collaterals. https://doi.org/10.1109/ICCI-CC.2017.8109745

Complex planes are known to simplify the complexity of real world problems, providing a better comprehension of their functionality and design. The need for complex numbers in both artificial and biological neural networks is equally well established...