Generalized approach to sentiment analysis of short text messages in natural language processing

Evrenii Viktorovich Polyakov; Leonid Sergeevich Voskov; Pavel Sergeevich Abramov; Sergey Viktorovich Polyakov

doi:10.31799/1684-8853-2020-1-2-14

Keywords:

natural language processing, machine learning, deep learning, vectorization, modeling, pre-processing, automatic machine learning, transfer learning

Abstract

Introduction: Sentiment analysis is a complex problem whose solution depends heavily on the context, the subject domain, and the amount of text data available. An analysis of publications shows that authors often do not use the full range of possible data transformations and their combinations. Using only a subset of the transformations limits the options for developing high-quality classification models.
Purpose: To develop and explore a generalized approach to model building that passes sequentially through the stages of exploratory data analysis, obtaining a baseline solution, vectorization, preprocessing, hyperparameter optimization, and modeling.
Results: Comparative experiments that applied the generalized approach to classical machine learning and deep learning algorithms for sentiment analysis of short text messages showed that classification quality grows from one stage to the next. For classical algorithms the increase was insignificant, but for deep learning it averaged 8% per stage. Additional studies showed that automatic machine learning built on classical classification algorithms is comparable in quality to manual model development but takes much longer. Transfer learning has a small but positive effect on classification quality.
Practical relevance: The proposed sequential approach can significantly improve the quality of models developed for natural language processing problems.
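
The staged workflow named in the abstract (baseline, vectorization, preprocessing, hyperparameter optimization, modeling) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy messages, the function names (`tokenize_raw`, `tokenize_clean`, `train_nb`, `predict_nb`, `evaluate`, `run_stages`), the choice of a multinomial Naive Bayes classifier, and the smoothing grid are all assumptions made for illustration, and exploratory data analysis is omitted. The sketch only shows how a validation score can be recorded after each stage so that the contribution of every step stays visible.

```python
import math
import re
from collections import Counter

# Hypothetical toy data, not taken from the paper: 1 = positive, 0 = negative.
TRAIN = [
    ("Great phone, love it!", 1),
    ("Love this, works great", 1),
    ("Really good service", 1),
    ("Terrible battery, hate it", 0),
    ("Bad screen. Awful support", 0),
    ("Hate the update, really bad", 0),
]
VALID = [
    ("GREAT phone!", 1),
    ("Awful battery.", 0),
    ("good service", 1),
    ("hate this BAD screen", 0),
]


def tokenize_raw(text):
    """Vectorization only: whitespace tokens, no cleaning."""
    return text.split()


def tokenize_clean(text):
    """Preprocessing: lowercase and keep only runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def train_nb(data, tokenize, alpha):
    """Collect the counts needed by multinomial Naive Bayes."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in data:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab, alpha


def predict_nb(model, tokenize, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, class_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # tokens unseen in training are ignored
                score += math.log(
                    (word_counts[label][token] + alpha)
                    / (total_words + alpha * len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


def evaluate(train, valid, tokenize, alpha):
    """Train on `train` and return accuracy on `valid`."""
    model = train_nb(train, tokenize, alpha)
    hits = sum(predict_nb(model, tokenize, text) == label for text, label in valid)
    return hits / len(valid)


def run_stages(train, valid, alphas=(0.1, 1.0, 10.0)):
    """Record validation accuracy after each stage of the workflow."""
    results = {}
    # Baseline: always predict the most frequent training label.
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    results["baseline"] = sum(label == majority for _, label in valid) / len(valid)
    # Vectorization: bag of words over raw tokens, default smoothing.
    results["vectorization"] = evaluate(train, valid, tokenize_raw, 1.0)
    # Preprocessing: same model, cleaned tokens.
    results["preprocessing"] = evaluate(train, valid, tokenize_clean, 1.0)
    # Hyperparameter optimization: small grid over the smoothing strength.
    # A real study would tune with cross-validation or a separate test set.
    results["tuned"] = max(evaluate(train, valid, tokenize_clean, a) for a in alphas)
    return results


if __name__ == "__main__":
    for stage, score in run_stages(TRAIN, VALID).items():
        print(f"{stage}: {score:.2f}")
```

The toy data were chosen so that the scores do not decrease from stage to stage. They say nothing about the magnitudes reported in the paper, which come from the authors' own experiments.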

Information processing and control
