Abdul Aziz
Leveraging contextual representations with BiLSTM-based regressor for lexical complexity prediction
Aziz, Abdul; Hossain, Md. Akram; Chy, Abu Nowshed; Ullah, Md. Zia; Aono, Masaki
Abstract
Lexical complexity prediction (LCP) determines the complexity level of words or phrases in a sentence. LCP has a significant impact on language translation, readability assessment, and text generation. However, domain-specific technical words, complex grammatical structures, polysemy, and inter-word relationships and dependencies make it challenging to determine the complexity of words or phrases. In this paper, we propose an integrated transformer regressor model, named ITRM-LCP, that estimates the lexical complexity of words and phrases using diverse contextual features extracted from several transformer models. The transformer models are fine-tuned on text-pair data. A bidirectional LSTM-based regressor module is then placed on top of each transformer to learn long-term dependencies and estimate complexity scores. The scores predicted by the modules are aggregated to determine the final complexity score. We evaluate the proposed model on two benchmark datasets from shared tasks. Experimental results show that ITRM-LCP obtains improvements of 10.2% and 8.2% on the news and Wikipedia corpora of the CWI-2018 dataset, respectively, compared with the top-performing systems (DAT, CAMB, and TMU). Additionally, ITRM-LCP surpasses state-of-the-art LCP systems (DeepBlueAI, JUST-BLUE) by 1.5% and 1.34% on the single-word and multi-word LCP tasks, respectively, defined in the SemEval LCP-2021 task.
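The final aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the per-module scores are combined or how the text pairs are laid out, so the unweighted mean, the (target, sentence) ordering, and all names below (`make_text_pair`, `aggregate_scores`, `model_a` and so on) are assumptions made for illustration only, not the authors' implementation. The transformer and BiLSTM components are not modelled here; their outputs are stood in for by hypothetical score lists.

```python
from statistics import fmean
from typing import Dict, List, Tuple


def make_text_pair(sentence: str, target: str) -> Tuple[str, str]:
    """Pair the target word or phrase with its sentence.

    Assumption: the (target, sentence) ordering is illustrative only.
    """
    return (target, sentence)


def aggregate_scores(per_model_scores: Dict[str, List[float]]) -> List[float]:
    """Combine the per-example scores predicted by each regressor module.

    Assumption: an unweighted arithmetic mean across modules.
    """
    columns = list(per_model_scores.values())
    if not columns:
        raise ValueError("need predictions from at least one module")
    n_examples = len(columns[0])
    if any(len(col) != n_examples for col in columns):
        raise ValueError("all modules must score the same examples")
    # zip(*columns) groups the scores that different modules gave to one example.
    return [fmean(scores) for scores in zip(*columns)]


# Hypothetical predictions from three regressor modules for two examples.
predictions = {
    "model_a": [0.20, 0.60],
    "model_b": [0.30, 0.50],
    "model_c": [0.25, 0.70],
}
final_scores = aggregate_scores(predictions)
```

A weighted mean or a learned combiner would fit the same interface; the unweighted mean is simply the smallest choice that matches the word "aggregated".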
Citation
Aziz, A., Hossain, M. A., Chy, A. N., Ullah, M. Z., & Aono, M. (2023). Leveraging contextual representations with BiLSTM-based regressor for lexical complexity prediction. Natural Language Processing Journal, 5, Article 100039. https://doi.org/10.1016/j.nlp.2023.100039
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 26, 2023 |
Online Publication Date | Nov 3, 2023 |
Publication Date | 2023-12 |
Deposit Date | Nov 6, 2023 |
Publicly Available Date | Nov 6, 2023 |
Print ISSN | 2949-7191 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 5 |
Article Number | 100039 |
DOI | https://doi.org/10.1016/j.nlp.2023.100039 |
Keywords | Lexical complexity prediction, Lexical simplification, Sentence-pair regression, Transformer models |
Public URL | http://researchrepository.napier.ac.uk/Output/3370140 |
Files
Leveraging contextual representations with BiLSTM-based regressor for lexical complexity prediction
(1.5 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by-nc-nd/4.0/