Research Repository

All Outputs (26)

Combining Word Embedding Interactions and LETOR Feature Evidences for Supervised QPP (2023)
Conference Proceeding
Datta, S., Ganguly, D., Mothe, J., & Ullah, M. Z. (2023). Combining Word Embedding Interactions and LETOR Feature Evidences for Supervised QPP. In Proceedings of the QPP++ 2023: Query Performance Prediction and Its Evaluation in New Tasks Workshop co-located with The 45th European Conference on Information Retrieval (ECIR) (7-12)

In information retrieval, query performance prediction aims to predict whether a search engine is likely to succeed in retrieving potentially relevant documents to a user's query. This problem is usually cast into a regression problem where a machine...

Can we predict QPP? An approach based on multivariate outliers (2023)
Conference Proceeding
Ullah, M. Z. (in press). Can we predict QPP? An approach based on multivariate outliers.

Query performance prediction (QPP) aims to predict the success and failure of a search engine on a collection of queries and documents. State of the art predictors can enable this prediction with a degree of accuracy; however, it is far from being pe...

Leveraging contextual representations with BiLSTM-based regressor for lexical complexity prediction (2023)
Journal Article
Aziz, A., Hossain, M. A., Chy, A. N., Ullah, M. Z., & Aono, M. (2023). Leveraging contextual representations with BiLSTM-based regressor for lexical complexity prediction. Natural Language Processing Journal, 5, Article 100039. https://doi.org/10.1016/j.nlp.2023.100039

Lexical complexity prediction (LCP) determines the complexity level of words or phrases in a sentence. LCP has a significant impact on the enhancement of language translations, readability assessment, and text generation. However, the domain-specific...

Selective Query Processing: A Risk-Sensitive Selection of Search Configurations (2023)
Journal Article
Mothe, J., & Ullah, M. Z. (2024). Selective Query Processing: A Risk-Sensitive Selection of Search Configurations. ACM Transactions on Information Systems, 42(1). https://doi.org/10.1145/3608474

In information retrieval systems, search parameters are optimized to ensure high effectiveness based on a set of past searches and these optimized parameters are then used as the system configuration for all subsequent queries. A better approach, how...

InnEO'Space PhD: Preparing Young Researchers for a successful career on Earth Observation applications (2022)
Conference Proceeding
Mothe, J., Bayer, A., Castello, V., Ciaccio, V., Del Frate, F., De Santis, D., … Voinea, M. (2022). InnEO'Space PhD: Preparing Young Researchers for a successful career on Earth Observation applications. In International Conference on Innovation in Aviation & Space to the Satisfaction of the European Citizens (11th EASN 2021) (012084). https://doi.org/10.1088/1757-899X/1226/1/012084

The InnEO'Space PhD project is preparing young researchers for a successful career by developing modernised and transferable PhD courses and learning resources based on innovation skills and employers' needs as well as in-depth knowledge of high stakes a...

Comparison of machine learning models for early depression detection from users’ posts (2022)
Book Chapter
Mothe, J., Ramiandrisoa, F., & Ullah, M. Z. (2022). Comparison of machine learning models for early depression detection from users’ posts. In F. Crestani, D. E. Losada, & J. Parapar (Eds.), Early Detection of Mental Health Disorders by Social Media Monitoring: The First Five Years of the eRisk Project (111-139). Cham: Springer. https://doi.org/10.1007/978-3-031-04431-1_5

With around 300 million people worldwide suffering from depression, the detection of this disorder is crucial and a challenge for individual and public health. As with many diseases, early detection means better medical management; the use of social...

Instruments and Tools to Identify Radical Textual Content (2022)
Journal Article
Mothe, J., Ullah, M. Z., Okon, G., Schweer, T., Juršėnas, A., & Mandravickaitė, J. (2022). Instruments and Tools to Identify Radical Textual Content. Information, 13(4), Article 193. https://doi.org/10.3390/info13040193

The Internet and social networks are increasingly becoming a medium for extremist propaganda. On homepages, in forums or chats, extremists spread their ideologies and world views, which are often contrary to the basic liberal democratic values of the E...

Defining an Optimal Configuration Set for Selective Search Strategy - A Risk-Sensitive Approach (2021)
Conference Proceeding
Mothe, J., & Ullah, M. Z. (2021). Defining an Optimal Configuration Set for Selective Search Strategy - A Risk-Sensitive Approach. In CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management (1335-1345). https://doi.org/10.1145/3459637.3482422

A search engine generally applies a single search strategy to any user query. The search combines many component processes (e.g., indexing, query expansion, search-weighting model, document ranking) and their hyperparameters, whose values are optimiz...

Exploiting various word embedding models for query expansion in microblog (2020)
Conference Proceeding
Ahmed, S., Chy, A. N., & Ullah, M. Z. (2020). Exploiting various word embedding models for query expansion in microblog. In 2020 IEEE 8th R10 Humanitarian Technology Conference (R10-HTC). https://doi.org/10.1109/R10-HTC49770.2020.9357016

Microblogs, especially Twitter, make it easier to communicate with others in real time and are treated as a valuable information source. With the increasing number of tweets, it would be fascinating to be able to extract essential information...

An ML Model for Predicting Information Check-Worthiness using a Variety of Features (2020)
Conference Proceeding
Ullah, M. Z. (2020). An ML Model for Predicting Information Check-Worthiness using a Variety of Features. In Proceedings of the Workshop on Machine Learning for Trend and Weak Signal Detection in Social Networks and Social Media (56-61)

In this communication, we introduce the important problem of information check-worthiness. We present the method we developed to automatically answer this problem. This method makes use of an elaborated information representation that combines the “i...

Forward and backward feature selection for query performance prediction (2020)
Conference Proceeding
Déjean, S., Ionescu, R. T., Mothe, J., & Ullah, M. Z. (2020). Forward and backward feature selection for query performance prediction. In SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing (690-697). https://doi.org/10.1145/3341105.3373904

The goal of query performance prediction (QPP) is to automatically estimate the effectiveness of a search result for any given query, without relevance judgements. Post-retrieval features have been shown to be more effective for this task while being...

Query expansion for microblog retrieval focusing on an ensemble of features (2019)
Journal Article
Chy, A. N., Ullah, M. Z., & Aono, M. (2019). Query expansion for microblog retrieval focusing on an ensemble of features. Journal of Information Processing, 27, 61-76. https://doi.org/10.2197/ipsjjip.27.61

In microblog search, vocabulary mismatch is a persistent problem due to the brevity of tweets and frequent use of unconventional abbreviations. One way of alleviating this problem is to reformulate the query via query expansion. However, finding good...

Studying the variability of system setting effectiveness by data analytics and visualization (2019)
Conference Proceeding
Déjean, S., Mothe, J., & Ullah, M. Z. (2019). Studying the variability of system setting effectiveness by data analytics and visualization. In Experimental IR Meets Multilinguality, Multimodality, and Interaction: 10th International Conference of the CLEF Association, CLEF 2019, Lugano, Switzerland, September 9–12, 2019, Proceedings (62-74). https://doi.org/10.1007/978-3-030-28577-7_3

Search engines differ in their modules and parameters; defining the optimal system setting is all the more challenging because of the complexity of a retrieval stream. The main goal of this study is to determine which are the most important system comp...

Information nutritional label and word embedding to estimate information check-worthiness (2019)
Conference Proceeding
Lespagnol, C., Mothe, J., & Ullah, M. Z. (2019). Information nutritional label and word embedding to estimate information check-worthiness. In SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (941-944). https://doi.org/10.1145/3331184.3331298

Automatic fact-checking is an important challenge nowadays since anyone can write about anything and spread it in social media, no matter the information quality. In this paper, we revisit the information check-worthiness problem and propose a method...

Learning to adaptively rank document retrieval system configurations (2018)
Journal Article
Deveaud, R., Mothe, J., Ullah, M. Z., & Nie, J. (2019). Learning to adaptively rank document retrieval system configurations. ACM Transactions on Information Systems, 37(1), Article 3. https://doi.org/10.1145/3231937

Modern Information Retrieval (IR) systems have become more and more complex, involving a large number of parameters. For example, a system may choose from a set of possible retrieval models (BM25, language model, etc.), or various query expansion par...

Query performance prediction and effectiveness evaluation without relevance judgments: Two sides of the same coin (2018)
Conference Proceeding
Mizzaro, S., Mothe, J., Roitero, K., & Ullah, M. Z. (2018). Query performance prediction and effectiveness evaluation without relevance judgments: Two sides of the same coin. In SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (1233-1236). https://doi.org/10.1145/3209978.3210146

Some methods have been developed for automatic effectiveness evaluation without relevance judgments. We propose to use those methods, and their combination based on a machine learning approach, for query performance prediction. Moreover, since predic...

Query performance prediction focused on summarized letor features (2018)
Conference Proceeding
Chifu, A., Laporte, L., Mothe, J., & Ullah, M. Z. (2018). Query performance prediction focused on summarized letor features. In SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (1177-1180). https://doi.org/10.1145/3209978.3210121

Query performance prediction (QPP) aims at automatically estimating the effectiveness of an information retrieval system for any user's query. Previous work has investigated several types of pre- and post-retrieval query performance predictors; the latter...

IRIT-QFR: IRIT Query Feature Resource (2017)
Conference Proceeding
Molina, S., Mothe, J., Roques, D., Tanguy, L., & Ullah, M. Z. (2017). IRIT-QFR: IRIT Query Feature Resource. In Experimental IR Meets Multilinguality, Multimodality, and Interaction: 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11–14, 2017, Proceedings (69-81). https://doi.org/10.1007/978-3-319-65813-1_6

In this paper, we present a resource that consists of query features associated with TREC adhoc collections. We developed two types of query features: linguistic features that can be calculated from the query itself, prior to any search although som...

Microblog Retrieval Using Ensemble of Feature Sets through Supervised Feature Selection (2017)
Journal Article
Chy, A. N., Ullah, M. Z., & Aono, M. (2017). Microblog Retrieval Using Ensemble of Feature Sets through Supervised Feature Selection. IEICE Transactions on Information and Systems, 100(4), 793-806. https://doi.org/10.1587/transinf.2016DAP0032

Microblogs, especially Twitter, have become an integral part of our daily life for searching for the latest news and event information. Due to the short length of tweets and frequent use of unconventional abbreviations, content-relevance based...

A bipartite graph-based ranking approach to query subtopics diversification focused on word embedding features (2016)
Journal Article
Ullah, M. Z., & Aono, M. (2016). A bipartite graph-based ranking approach to query subtopics diversification focused on word embedding features. IEICE Transactions on Information and Systems, 99(12), 3090-3100. https://doi.org/10.1587/transinf.2016EDP7190

Web search queries are usually vague, ambiguous, or tend to have multiple intents. Users have different search intents while issuing the same query. Understanding the intents through mining subtopics underlying a query has gained much interest in rec...