
Acquiring Broad Commonsense Knowledge for Sentiment Analysis Using Human Computation

Marina Boia
2016
EPFL thesis
Abstract

While artificial intelligence is successful in many applications that cover specific domains, a large gap to human performance remains for many commonsense problems. Automated sentiment analysis is a typical example: while there are techniques that reasonably aggregate sentiments from texts in specific domains, such as online reviews of a particular product category, more general models perform poorly. We argue that sentiment analysis can be covered more broadly by extending models with commonsense knowledge acquired at scale, using human computation. We study two sentiment analysis problems. We start with document-level sentiment classification, which aims to determine whether a text as a whole expresses a positive or a negative sentiment. We hypothesize that extending classifiers to include the polarities of sentiment words in context can help them scale to broad domains. We also study fine-grained opinion extraction, which aims to pinpoint individual opinions in a text, along with their targets. We hypothesize that extraction models can benefit from broad fine-grained annotations to boost their performance on unfamiliar domains. Selecting sentiment words in context and annotating texts with opinions and targets are tasks that require commonsense knowledge shared by all the speakers of a language. We show how these tasks can be effectively solved through human computation. We illustrate how to define small tasks that can be solved by many independent workers so that the results form a single coherent knowledge base. We also show how to recruit, train, and engage workers, and then how to perform effective quality control to obtain sufficiently high-quality knowledge. We show how the resulting knowledge can be effectively integrated into models that scale to broad domains and also perform well in unfamiliar domains.

We engage workers through both enjoyment and payment, by designing our tasks as games played for money. We recruit workers on a paid crowdsourcing platform where we can reach a large pool of active workers. This is an effective recipe for acquiring sentiment knowledge in English, a language that is known by the vast majority of workers on the platform. To acquire sentiment knowledge for other languages, which have received comparatively little attention, we argue that we need to design tasks that appeal to voluntary workers outside the crowdsourcing platform, based on enjoyment alone. However, recruiting and engaging volunteers has been more of an art than a problem that can be solved systematically. We show that combining online advertisement with games, an approach that has recently been shown to work well for acquiring expert knowledge, gives an effective recipe for attracting and engaging volunteers to provide good-quality sentiment knowledge for texts in French. Our solutions could point the way toward using human computation to broaden the competence of artificial intelligence systems in other domains as well.
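
The abstract does not say how answers from independent workers are combined or filtered. As a rough illustration only, the sketch below assumes simple majority voting with a minimum agreement ratio. The function names, the 0.6 threshold, and the example data are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def aggregate_labels(answers, min_agreement=0.6):
    """Return the majority label for one item, or None if agreement is too low.

    Assumption: plain majority voting; the thesis may use a different scheme.
    """
    if not answers:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count / len(answers) >= min_agreement else None


def build_knowledge_base(worker_answers, min_agreement=0.6):
    """Keep only the items on which enough workers agree."""
    kb = {}
    for item, answers in worker_answers.items():
        label = aggregate_labels(answers, min_agreement)
        if label is not None:
            kb[item] = label
    return kb


# Hypothetical data: (sentiment word, context) pairs, each labeled by
# several independent workers.
example_answers = {
    ("cold", "beer"): ["positive", "positive", "positive"],
    ("cold", "pizza"): ["negative", "negative", "positive"],
    ("long", "battery life"): ["positive", "negative"],
}

knowledge_base = build_knowledge_base(example_answers)
```

Here the two "cold" entries are kept with opposite polarities, which is the kind of context dependence the abstract refers to, while the evenly split item is dropped because it falls below the assumed agreement threshold.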

