Incorporating conditional random fields and active learning to improve sentiment identification

Kunpeng Zhang; Yusheng Xie; Yi Yang; Aaron Sun; Hengchang Liu; Alok Choudhary

doi:10.1016/j.neunet.2014.04.005

Neural Netw. 2014 Oct:58:60-7. doi: 10.1016/j.neunet.2014.04.005. Epub 2014 May 10.

Authors

Kunpeng Zhang¹, Yusheng Xie², Yi Yang³, Aaron Sun⁴, Hengchang Liu⁵, Alok Choudhary⁶

Affiliations

¹ University of Illinois at Chicago, Chicago, IL, USA. Electronic address: kzhang6@uic.edu.
² Northwestern University, Evanston, IL, USA. Electronic address: yushengxie2011@u.northwestern.edu.
³ Northwestern University, Evanston, IL, USA. Electronic address: yya518@eecs.northwestern.edu.
⁴ Cloud Research Lab, Samsung Research America, San Jose, CA, USA. Electronic address: a.sun@samsung.com.
⁵ University of Science and Technology of China, Hefei, China. Electronic address: hcliu@ustc.edu.cn.
⁶ Northwestern University, Evanston, IL, USA. Electronic address: choudhar@eecs.northwestern.edu.

PMID: 24856246

Abstract

Many machine learning, statistical, and computational linguistic methods have been developed to identify the sentiment of sentences in documents, yielding promising results. However, most state-of-the-art methods focus on individual sentences and ignore the impact of context on the meaning of a sentence. In this paper, we propose a method based on conditional random fields that incorporates sentence structure and context information, in addition to syntactic information, to improve sentiment identification. We also investigate how human interaction affects the accuracy of sentiment labeling when training data are limited. We propose and evaluate two different active learning strategies for labeling sentiment data. Our experiments with the proposed approach demonstrate a 5%-15% improvement in accuracy on Amazon customer reviews compared to existing supervised learning and rule-based methods.

Keywords: Active learning; Conditional random fields; Customer reviews; Sentiment analysis.
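
The abstract does not say which two active learning strategies the authors evaluate. As an illustration of the general idea only, here is a minimal Python sketch of uncertainty sampling, a common generic strategy in which a human annotator is asked to label the sentences the current model is least sure about. The function names, the three-label scheme (positive, negative, neutral), and the probability values are all hypothetical and are not taken from the paper.

```python
import math


def least_confidence(probs):
    """Uncertainty score: 1 minus the probability of the most likely label."""
    return 1.0 - max(probs)


def entropy(probs):
    """Uncertainty score: Shannon entropy of the label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(predictions, k, score=least_confidence):
    """Return the ids of the k sentences with the highest uncertainty score.

    predictions: dict mapping a sentence id to a list of label probabilities
    produced by the current model (hypothetical input format).
    """
    ranked = sorted(predictions,
                    key=lambda sid: score(predictions[sid]),
                    reverse=True)
    return ranked[:k]


# Hypothetical model outputs over (positive, negative, neutral).
predictions = {
    "s1": [0.90, 0.05, 0.05],
    "s2": [0.40, 0.35, 0.25],
    "s3": [0.55, 0.40, 0.05],
    "s4": [0.34, 0.33, 0.33],
}

# Sentences to send to a human annotator in this round.
queried = select_for_labeling(predictions, k=2)
```

In a full loop, the newly labeled sentences would be added to the training set, the model retrained, and the selection repeated until the labeling budget is spent. Swapping `score=entropy` into the call changes the uncertainty measure without changing the rest of the loop.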

MeSH terms

Artificial Intelligence*
Humans
Linguistics*
Pattern Recognition, Visual*
Problem-Based Learning / methods*