Semi-supervised word polarity identification in resource-lean languages

Neural Netw. 2014 Oct:58:50-9. doi: 10.1016/j.neunet.2014.05.018. Epub 2014 Jun 4.

Abstract

Sentiment words, as fundamental constituents of subjective sentences, have a substantial effect on the analysis of opinions, emotions and beliefs. Most proposed methods for identifying the semantic orientation of words exploit rich linguistic resources such as WordNet, subjectivity corpora, or polarity-tagged words. The shortage of such resources in resource-lean languages degrades the performance of word polarity identification in these languages. In this paper, we present a method which exploits a language with rich subjectivity analysis resources (English) to identify the polarity of words in a resource-lean foreign language. The English WordNet and a sparse foreign WordNet infrastructure are used to create a heterogeneous, multilingual and weighted semantic network. To identify the semantic orientation of foreign words, a random-walk-based method is applied to the semantic network along with a set of automatically weighted English positive and negative seeds. In a post-processing phase, synonym and antonym relations in the foreign WordNet are used to filter the random walk results. Our experiments on English and Persian show that the proposed method can outperform state-of-the-art word polarity identification methods in both languages.
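The seed-based propagation described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a generic random walk with restart, a toy hand-built network, and hypothetical node names (`en:good`, `fa:khub`, and so on) and edge weights chosen only for illustration. The synonym/antonym post-processing filter is omitted.

```python
from typing import Dict

Graph = Dict[str, Dict[str, float]]


def add_edge(graph: Graph, a: str, b: str, weight: float) -> None:
    """Add an undirected weighted edge between two words."""
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})[a] = weight


def random_walk_with_restart(graph: Graph, seeds: Dict[str, float],
                             restart: float = 0.15,
                             iterations: int = 50) -> Dict[str, float]:
    """Approximate visiting probabilities of a walk that restarts at the seeds.

    Seed weights are normalised into the restart distribution. The restart
    probability and iteration count are illustrative assumptions.
    """
    total = sum(seeds.values())
    restart_dist = {n: seeds.get(n, 0.0) / total for n in graph}
    scores = dict(restart_dist)
    for _ in range(iterations):
        nxt = {n: restart * restart_dist[n] for n in graph}
        for node, neighbours in graph.items():
            out_weight = sum(neighbours.values())
            if out_weight == 0:
                continue  # isolated node: nothing to pass on
            for nb, w in neighbours.items():
                nxt[nb] += (1 - restart) * scores[node] * w / out_weight
        scores = nxt
    return scores


def polarity(graph: Graph, pos_seeds: Dict[str, float],
             neg_seeds: Dict[str, float]) -> Dict[str, float]:
    """Score each word as (proximity to positive seeds) - (proximity to negative seeds)."""
    pos = random_walk_with_restart(graph, pos_seeds)
    neg = random_walk_with_restart(graph, neg_seeds)
    return {n: pos[n] - neg[n] for n in graph}


# Toy multilingual network (hypothetical words and weights).
graph: Graph = {}
add_edge(graph, "en:good", "en:great", 1.0)   # English synonym link
add_edge(graph, "en:bad", "en:awful", 1.0)    # English synonym link
add_edge(graph, "fa:khub", "en:good", 0.5)    # cross-lingual link
add_edge(graph, "fa:bad", "en:bad", 0.5)      # cross-lingual link

# Weighted English seeds; the foreign words carry no labels.
scores = polarity(graph, {"en:good": 1.0}, {"en:bad": 1.0})
for word in sorted(scores):
    print(f"{word}: {scores[word]:+.3f}")
```

In this sketch the foreign words receive a sign only through their weighted links to English seeds, which mirrors the idea of borrowing polarity from a resource-rich language; how the edge and seed weights are actually derived is specific to the paper and not reproduced here.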

Keywords: Random walk model; Semi-supervised polarity identification; Sentiment lexicon.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Culture*
  • Humans
  • Language*
  • Linguistics / methods*
  • Semantics*