Research Repository

All Outputs (27)

Leveraging contextual representations with BiLSTM-based regressor for lexical complexity prediction (2023)
Journal Article
Aziz, A., Hossain, M. A., Chy, A. N., Ullah, M. Z., & Aono, M. (2023). Leveraging contextual representations with BiLSTM-based regressor for lexical complexity prediction. Natural Language Processing Journal, 5, Article 100039. https://doi.org/10.1016/j.nlp.2023.100039

Lexical complexity prediction (LCP) determines the complexity level of words or phrases in a sentence. LCP has a significant impact on the enhancement of language translations, readability assessment, and text generation. However, the domain-specific...

Selective Query Processing: A Risk-Sensitive Selection of Search Configurations (2023)
Journal Article
Mothe, J., & Ullah, M. Z. (2024). Selective Query Processing: A Risk-Sensitive Selection of Search Configurations. ACM Transactions on Information Systems, 42(1). https://doi.org/10.1145/3608474

In information retrieval systems, search parameters are optimized to ensure high effectiveness based on a set of past searches and these optimized parameters are then used as the system configuration for all subsequent queries. A better approach, how...

Comparison of machine learning models for early depression detection from users’ posts (2022)
Book Chapter
Mothe, J., Ramiandrisoa, F., & Ullah, M. Z. (2022). Comparison of machine learning models for early depression detection from users’ posts. In F. Crestani, D. E. Losada, & J. Parapar (Eds.), Early Detection of Mental Health Disorders by Social Media Monitoring: The First Five Years of the eRisk Project (111-139). Springer. https://doi.org/10.1007/978-3-031-04431-1_5

With around 300 million people worldwide suffering from depression, the detection of this disorder is crucial and a challenge for individual and public health. As with many diseases, early detection means better medical management; the use of social...

Instruments and Tools to Identify Radical Textual Content (2022)
Journal Article
Mothe, J., Ullah, M. Z., Okon, G., Schweer, T., Juršėnas, A., & Mandravickaitė, J. (2022). Instruments and Tools to Identify Radical Textual Content. Information, 13(4), Article 193. https://doi.org/10.3390/info13040193

The Internet and social networks are increasingly becoming a medium of extremist propaganda. On homepages, in forums or chats, extremists spread their ideologies and world views, which are often contrary to the basic liberal democratic values of the E...

Query expansion for microblog retrieval focusing on an ensemble of features (2019)
Journal Article
Chy, A. N., Ullah, M. Z., & Aono, M. (2019). Query expansion for microblog retrieval focusing on an ensemble of features. Journal of Information Processing, 27, 61-76. https://doi.org/10.2197/ipsjjip.27.61

In microblog search, vocabulary mismatch is a persisting problem due to the brevity of tweets and frequent use of unconventional abbreviations. One way of alleviating this problem is to reformulate the query via query expansion. However, finding good...

Learning to adaptively rank document retrieval system configurations (2018)
Journal Article
Deveaud, R., Mothe, J., Ullah, M. Z., & Nie, J. (2019). Learning to adaptively rank document retrieval system configurations. ACM Transactions on Information Systems, 37(1), Article 3. https://doi.org/10.1145/3231937

Modern Information Retrieval (IR) systems have become more and more complex, involving a large number of parameters. For example, a system may choose from a set of possible retrieval models (BM25, language model, etc.), or various query expansion par...

Microblog Retrieval Using Ensemble of Feature Sets through Supervised Feature Selection (2017)
Journal Article
Chy, A. N., Ullah, M. Z., & Aono, M. (2017). Microblog Retrieval Using Ensemble of Feature Sets through Supervised Feature Selection. IEICE Transactions on Information and Systems, 100(4), 793-806. https://doi.org/10.1587/transinf.2016DAP0032

Microblog, especially Twitter, has become an integral part of our daily life for searching for the latest news and event information. Due to the short length characteristics of tweets and frequent use of unconventional abbreviations, content-relevance based...

A bipartite graph-based ranking approach to query subtopics diversification focused on word embedding features (2016)
Journal Article
Ullah, M. Z., & Aono, M. (2016). A bipartite graph-based ranking approach to query subtopics diversification focused on word embedding features. IEICE Transactions on Information and Systems, 99(12), 3090-3100. https://doi.org/10.1587/transinf.2016EDP7190

Web search queries are usually vague, ambiguous, or tend to have multiple intents. Users have different search intents while issuing the same query. Understanding the intents through mining subtopics underlying a query has gained much interest in rec...

Estimating a Ranked List of Human Genetic Diseases by Associating Phenotype-Gene with Gene-Disease Bipartite Graphs (2015)
Journal Article
Ullah, M. Z., Aono, M., & Seddiqui, M. H. (2015). Estimating a Ranked List of Human Genetic Diseases by Associating Phenotype-Gene with Gene-Disease Bipartite Graphs. ACM Transactions on Intelligent Systems and Technology, 6(4), Article 56. https://doi.org/10.1145/2700487

With vast amounts of medical knowledge available on the Internet, it is becoming increasingly practical to help doctors in clinical diagnostics by suggesting plausible diseases predicted by applying data and text mining technologies. Recently, Genome...

Combining Word Embedding Interactions and LETOR Feature Evidences for Supervised QPP
Presentation / Conference Contribution
Datta, S., Ganguly, D., Mothe, J., & Ullah, M. Z. (2023, April). Combining Word Embedding Interactions and LETOR Feature Evidences for Supervised QPP. Presented at 45th European Conference on Information Retrieval (ECIR), Dublin, Ireland

In information retrieval, query performance prediction aims to predict whether a search engine is likely to succeed in retrieving potentially relevant documents to a user's query. This problem is usually cast into a regression problem where a machine...

An ML Model for Predicting Information Check-Worthiness using a Variety of Features
Presentation / Conference Contribution
Ullah, M. Z. (2020, February). An ML Model for Predicting Information Check-Worthiness using a Variety of Features. Presented at Workshop on Machine Learning for Trend and Weak Signal Detection in Social Networks and Social Media, Toulouse, France

In this communication, we introduce the important problem of information check-worthiness. We present the method we developed to automatically answer this problem. This method makes use of an elaborated information representation that combines the “i...

InnEO'Space PhD: Preparing Young Researchers for a successful career on Earth Observation applications
Presentation / Conference Contribution
Mothe, J., Bayer, A., Castello, V., Ciaccio, V., Del Frate, F., De Santis, D., Ivanovici, M., Lehuerou Kerisel, A., Necşoi, D., Nzeh Ndong, A., Neptune, N., Perier-Camby, M., Recchioni, M., Ullah, M. Z., & Voinea, M. (2021, September). InnEO'Space PhD: Preparing Young Researchers for a successful career on Earth Observation applications. Presented at International Conference on Innovation in Aviation & Space to the Satisfaction of the European Citizens (11th EASN 2021), Salerno

The InnEO'Space PhD project is preparing young researchers for a successful career by developing modernised and transferable PhD courses and learning resources based on innovation skills and employers' needs as well as in-depth knowledge of high stakes a...

Defining an Optimal Configuration Set for Selective Search Strategy - A Risk-Sensitive Approach
Presentation / Conference Contribution
Mothe, J., & Ullah, M. Z. (2021, November). Defining an Optimal Configuration Set for Selective Search Strategy - A Risk-Sensitive Approach. Presented at 30th ACM International Conference on Information & Knowledge Management, Queensland, Australia

A search engine generally applies a single search strategy to any user query. The search combines many component processes (e.g., indexing, query expansion, search-weighting model, document ranking) and their hyperparameters, whose values are optimiz...

Exploiting various word embedding models for query expansion in microblog
Presentation / Conference Contribution
Ahmed, S., Chy, A. N., & Ullah, M. Z. (2020, December). Exploiting various word embedding models for query expansion in microblog. Presented at 2020 IEEE 8th R10 Humanitarian Technology Conference (R10-HTC), Kuching, Malaysia

Microblogs, especially Twitter, make it easier to communicate with others in a real-time manner and are treated as a valuable information source. With the increasing number of tweets, it would be fascinating to be able to extract essential information...

Prediction and Visual Intelligence for Security Information: The PREVISION H2020 Project
Presentation / Conference Contribution
Demestichas, K., Hoang, T. B. N., Mothe, J., Teste, O., & Ullah, M. Z. (2020, July). Prediction and Visual Intelligence for Security Information: The PREVISION H2020 Project. Presented at CIRCLE'20, Samatan, France

This paper presents the ongoing work within the PREVISION H2020 project. The mission of PREVISION is to empower the analysts and investigators of agencies with tools and solutions not commercially available today, to handle and capitalize on the massive...

Forward and backward feature selection for query performance prediction
Presentation / Conference Contribution
Déjean, S., Ionescu, R. T., Mothe, J., & Ullah, M. Z. (2020, March). Forward and backward feature selection for query performance prediction. Presented at 35th Annual ACM Symposium on Applied Computing, Brno, Czech Republic

The goal of query performance prediction (QPP) is to automatically estimate the effectiveness of a search result for any given query, without relevance judgements. Post-retrieval features have been shown to be more effective for this task while being...

Studying the variability of system setting effectiveness by data analytics and visualization
Presentation / Conference Contribution
Déjean, S., Mothe, J., & Ullah, M. Z. (2019, September). Studying the variability of system setting effectiveness by data analytics and visualization. Presented at Experimental IR Meets Multilinguality, Multimodality, and Interaction: 10th International Conference of the CLEF Association (CLEF 2019), Lugano, Switzerland

Search engines differ in their modules and parameters; defining the optimal system setting is all the more challenging because of the complexity of a retrieval stream. The main goal of this study is to determine which are the most important system comp...

Information nutritional label and word embedding to estimate information check-worthiness
Presentation / Conference Contribution
Lespagnol, C., Mothe, J., & Ullah, M. Z. (2019, July). Information nutritional label and word embedding to estimate information check-worthiness. Presented at 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris

Automatic fact-checking is an important challenge nowadays since anyone can write about anything and spread it in social media, no matter the information quality. In this paper, we revisit the information check-worthiness problem and propose a method...

Query performance prediction focused on summarized letor features
Presentation / Conference Contribution
Chifu, A.-G., Laporte, L., Mothe, J., & Ullah, M. Z. (2018, July). Query performance prediction focused on summarized letor features. Presented at SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA

Query performance prediction (QPP) aims at automatically estimating the information retrieval system effectiveness for any user's query. Previous work has investigated several types of pre- and post-retrieval query performance predictors; the latter...

Query performance prediction and effectiveness evaluation without relevance judgments: Two sides of the same coin
Presentation / Conference Contribution
Mizzaro, S., Mothe, J., Roitero, K., & Ullah, M. Z. (2018, July). Query performance prediction and effectiveness evaluation without relevance judgments: Two sides of the same coin. Presented at SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA

Some methods have been developed for automatic effectiveness evaluation without relevance judgments. We propose to use those methods, and their combination based on a machine learning approach, for query performance prediction. Moreover, since predic...