Research Repository

NgramPOS: A Bigram-based Linguistic and Statistical Feature Process Model for Unstructured Text Classification

Yazdania, Sepideh; Tan, Zhiyuan; Kakavand, Mohsen; Lau, Sian

Authors

Sepideh Yazdania

Mohsen Kakavand

Sian Lau



Abstract

Research in the financial domain has shown that the sentiment of stock news has a profound impact on trading volume, volatility, stock prices and firm earnings. With the ever-growing number of social networking and online marketing sites, the reviews obtained from them act as an important source for further analysis and improved decision making. These reviews are mostly unstructured by nature and thus need processing, such as clustering or classification, to assign polarity categories such as positive and negative and so extract meaningful information for future use. Accordingly, in this study we investigate the use of Natural Language Processing (NLP) to improve sentiment classification performance when evaluating the information content of financial news as an instrument for investment decision systems.
Since the proposed feature extraction approach is based on the occurrence frequency of words, low-frequency linguistic features that could be critical in sentiment classification are typically ignored. In this research, therefore, we attempt to improve current sentiment analysis approaches for financial news classification by taking low-frequency, informative linguistic expressions into account. Our proposed combination of low- and high-frequency linguistic expressions contributes a novel set of features for text sentiment analysis and classification. The experimental results show that an optimal Ngram feature selection (a combination of optimal unigram and bigram features) yields higher sentiment classification accuracy than other types of feature sets.
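
As a rough illustration of the idea described above, the following Python sketch combines unigram and bigram counts and keeps a feature either because it is frequent or because it appears in a list of informative low-frequency expressions. It is not the authors' NgramPOS implementation: the function names, the `min_count` threshold, the `keep` set and the sample headlines are all assumptions made for this example, and the whitespace tokenizer stands in for whatever linguistic preprocessing the paper uses.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text):
    """Count unigrams and bigrams in one document (naive whitespace tokenizer)."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, 1) + ngrams(tokens, 2))


def select_features(docs, min_count=2, keep=frozenset()):
    """Keep n-grams that are frequent across the corpus, plus any n-gram
    listed in `keep` (hypothetical hand-picked low-frequency expressions)."""
    totals = Counter()
    for doc in docs:
        totals.update(extract_features(doc))
    return {g for g, c in totals.items() if c >= min_count or g in keep}


# Hypothetical sample headlines, for illustration only.
docs = ["profits rise sharply", "profits fall", "shares rise sharply"]
selected = select_features(docs, min_count=2, keep={"profits fall"})
```

Here the bigram "profits fall" occurs only once and would be dropped by a pure frequency cut-off, but it survives because it is listed in `keep`. The selected feature set could then be fed to any standard classifier.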

Citation

Yazdania, S., Tan, Z., Kakavand, M., & Lau, S. (2022). NgramPOS: A Bigram-based Linguistic and Statistical Feature Process Model for Unstructured Text Classification. Wireless Networks, 28, 1251-1261. https://doi.org/10.1007/s11276-018-01909-0

Journal Article Type Conference Paper
Acceptance Date Jul 3, 2018
Online Publication Date Dec 11, 2018
Publication Date 2022-04
Deposit Date Jul 17, 2018
Publicly Available Date Aug 24, 2018
Journal Wireless Networks
Print ISSN 1022-0038
Publisher BMC
Peer Reviewed Peer Reviewed
Volume 28
Pages 1251-1261
DOI https://doi.org/10.1007/s11276-018-01909-0
Keywords Unstructured Text Classification, Bigram-based Linguistic and Statistical Feature
Public URL http://researchrepository.napier.ac.uk/Output/1250784
