Retrieving information from archived meetings is a new domain of information retrieval that has received increasing attention in the past few years. Search in spontaneous spoken conversations has been recognized as more difficult than text-based document retrieval because meeting discussions contain two levels of information: the content itself, i.e. what topics are discussed, but also the argumentation process, i.e. what conflicts are resolved and what decisions are made. To capture the richness of information in meetings, current research focuses on recording meetings in Smart-Rooms, transcribing meeting discussions into text, and annotating the discussions with higher-level semantic structures to allow for efficient access to the data. However, it is not yet clear what type of user interface is best suited for searching and browsing such archived, annotated meetings. Content-based retrieval with keyword search is too naive because it does not take into account the semantic annotations on the data. The objective of this thesis is to assess the feasibility and usefulness of a natural language interface to meeting archives that allows users to ask complex questions about meetings and retrieve episodes of meeting discussions based on semantic annotations. The particular issues that we address are: the need for argumentative annotation to answer questions about meetings; the linguistic and domain-specific natural language understanding techniques required to interpret such questions; and the use of visual overviews of meeting annotations to guide users in formulating questions. To meet the outlined objectives, we have annotated meetings with argumentative structure and built a prototype of a natural language understanding engine that interprets questions based on those annotations. Further, we have performed two sets of user experiments to study what questions users ask when faced with a natural language interface to annotated meeting archives.
For this, we used a simulation method called Wizard of Oz, which enables users to express questions in their own terms without being influenced by limitations in speech recognition technology. Our experimental results show that it is technically feasible to annotate meetings and implement a deep-linguistic NLU engine for questions about meetings, but in practice users do not consistently take advantage of these features. Instead, they often search for keywords in meetings. When visual overviews of the available annotations are provided, users refer to those annotations in their questions, but the questions themselves remain simple. Users search with a breadth-first approach, asking a sequence of simple questions instead of a single complex one. We conclude that natural language interfaces to meeting archives are useful, but that more experimental work is needed to find ways to encourage users to take advantage of the expressive power of natural language when asking questions about meetings.