Improving the sentiment classification of stock tweets
[Thesis]
Li, Sheng
University of Birmingham
2014
Thesis (Ph.D.)
This research focuses on improving the accuracy of stock tweet sentiment classification by incorporating the linguistic features of stock tweets. Stock prediction based on social media data has been popular in recent years, but no previous study has provided a comprehensive account of the linguistic features of stock tweets. As a result, sentiment classification of stock tweets with simple statistical models has reached a bottleneck. This research therefore analysed the linguistic features of stock tweets and used them to train four machine learning classifiers. All four improved, and the best achieved a 9.7% improvement over the baseline model. The main contributions of this research are fivefold: (a) it provides an in-depth linguistic analysis of stock tweets; (b) it gives a clear and comprehensive definition of stock tweets; (c) it provides a simple but effective way to identify stock tweets automatically; (d) it provides a simple but effective method of generating a localised sentiment keyword list; and (e) it demonstrates a significant improvement in the accuracy of stock tweet sentiment classification.
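As an illustration only, the following minimal Python sketch shows one way stock tweets might be identified automatically and turned into simple features for a classifier. It is not the method used in the thesis. The cashtag pattern (a dollar sign followed by one to five letters), the function names `is_stock_tweet` and `extract_features`, and the example keyword sets are all assumptions introduced here.

```python
import re

# Assumption: a "cashtag" is "$" followed by 1-5 letters (e.g. $AAPL).
# The thesis's own definition of a stock tweet may differ.
CASHTAG = re.compile(r"(?<![A-Za-z0-9_])\$[A-Za-z]{1,5}\b")


def is_stock_tweet(text):
    """Return True if the text contains at least one cashtag."""
    return CASHTAG.search(text) is not None


def extract_features(text, bullish, bearish):
    """Count cashtags and sentiment keywords in a tweet.

    `bullish` and `bearish` are hypothetical keyword sets standing in
    for a localised sentiment keyword list.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "n_cashtags": len(CASHTAG.findall(text)),
        "n_bullish": sum(t in bullish for t in tokens),
        "n_bearish": sum(t in bearish for t in tokens),
    }


# Hypothetical keyword sets, for illustration only.
BULLISH = {"bullish", "breakout", "long"}
BEARISH = {"bearish", "short"}

features = extract_features(
    "Long $AAPL, looks bullish, breakout soon", BULLISH, BEARISH
)
```

Feature dictionaries of this kind could be combined with a bag-of-words representation and passed to any standard classifier; which features actually help is an empirical question that the thesis addresses.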
HG Finance; HM Sociology; P Philology. Linguistics