Rizun, N.; Taranenko, Y.; Waloszek, W. Improving the Accuracy in Sentiment Classification in the Light of Modelling the Latent Semantic Relations. Information 2018, 9, 307.
Abstract
This research presents a methodology for improving the accuracy of text classification by modelling latent semantic relations (LSR). The methodology aims to overcome the limitations of discriminant and probabilistic methods for revealing LSR and to tailor the text classification process toward more accurate recognition of text tonality. This aim is to be achieved by using knowledge about the text’s hierarchical semantic context in the form of a corpora-based hierarchical sentiment dictionary. The main scientific contribution of this research is the following set of approaches for improving the qualitative characteristics of the text classification process: (1) combining the discriminant and probabilistic methods, which reduces the influence of each method’s limitations on the process of revealing LSR; (2) treating each document as a complex structure, which allows documents to be evaluated as a whole through the separate classification of topically complete textual components (paragraphs); (3) taking into account the features of argumentative documents (reviews), which allows the author’s subjective evaluation of text tonality to be used in developing the text classification methodology. The tonality expressed by the review’s author has a significant, but not critical, effect on the qualitative indicators of sentiment recognition.
Keywords
text classification; topic modelling; latent semantic analysis; latent Dirichlet allocation; hierarchical sentiment dictionary; contextually-oriented hierarchical corpus; text tonality; evaluation
Subject
Computer Science and Mathematics, Information Systems
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.