Refining Automatically Extracted Knowledge Bases Using Crowdsourcing

Chunhua Li; Pengpeng Zhao; Victor S Sheng; Xuefeng Xian; Jian Wu; Zhiming Cui

doi:10.1155/2017/4092135

Comput Intell Neurosci. 2017:2017:4092135. doi: 10.1155/2017/4092135. Epub 2017 May 14.

Authors

Chunhua Li¹, Pengpeng Zhao¹, Victor S Sheng², Xuefeng Xian³, Jian Wu¹, Zhiming Cui¹

Affiliations

¹ School of Computer Science and Technology, Soochow University, Suzhou 215006, China.
² Computer Science Department, University of Central Arkansas, Conway, AR, USA.
³ College of Computer Engineering, Suzhou Vocational University, Suzhou 215104, China.

Abstract

Machine-constructed knowledge bases often contain noisy and inaccurate facts. Significant work exists on automated algorithms for knowledge base refinement. These automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. Because human labelling is costly, an important research challenge is how to use limited human resources to maximize the quality improvement of a knowledge base. To address this problem, we first introduce the concept of semantic constraints, which can be used to detect potential errors and perform inference among candidate facts. Then, based on these semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts for crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods at a reasonable crowdsourcing cost.

MeSH terms

Algorithms*
Automation
Crowdsourcing* / economics
Knowledge Bases*
Semantics