Publication

Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Andrei Popescu-Belis, Majid Yazdani
2013
Journal paper

Abstract

We propose a method for computing semantic relatedness between words or texts by using knowledge from hypertext encyclopedias such as Wikipedia. A network of concepts is built by filtering the encyclopedia's articles, each concept corresponding to an article. Two types of weighted links between concepts are considered: one based on hyperlinks between the texts of the articles, and another one based on the lexical similarity between them. We propose and implement an efficient random walk algorithm that computes the distance between nodes, and then between sets of nodes, using the visiting probability from one (set of) node(s) to another. Moreover, to make the algorithm tractable, we propose and validate empirically two truncation methods, and then use an embedding space to learn an approximation of visiting probability. To evaluate the proposed distance, we apply our method to four important tasks in natural language processing: word similarity, document similarity, document clustering and classification, and ranking in information retrieval. The performance of the method is state-of-the-art or close to it for each task, thus demonstrating the generality of the knowledge resource. Moreover, using both hyperlinks and lexical similarity links improves the scores with respect to a method using only one of them, because hyperlinks bring additional real-world knowledge not captured by lexical similarity. (C) 2012 Elsevier B.V. All rights reserved.
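To make the notion of visiting probability more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small row-stochastic transition matrix given as nested lists, and it truncates the walk at a fixed number of steps, which is only one plausible reading of the truncation mentioned in the abstract. The function name `truncated_visiting_probability` and the toy graph are hypothetical and chosen purely for illustration.

```python
def truncated_visiting_probability(transitions, targets, max_steps):
    """Return, for every node, the probability that a random walk starting
    there reaches any node in `targets` within `max_steps` steps.

    transitions: row-stochastic matrix as a list of lists (assumed input format).
    targets: iterable of node indices treated as the destination set.
    max_steps: truncation length (an illustrative assumption).
    """
    n = len(transitions)
    targets = set(targets)
    # With zero steps allowed, only the target nodes have "visited" the set.
    prob = [1.0 if i in targets else 0.0 for i in range(n)]
    for _ in range(max_steps):
        # Target nodes stay at 1 (the walk stops on arrival); every other
        # node averages the previous values of its successors.
        prob = [
            1.0 if i in targets
            else sum(p * prob[k] for k, p in enumerate(transitions[i]))
            for i in range(n)
        ]
    return prob


# Toy graph (hypothetical): 0 -> 1 always; 1 -> 0 or 2 with equal
# probability; 2 loops on itself.
toy_transitions = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]

two_steps = truncated_visiting_probability(toy_transitions, {2}, 2)
three_steps = truncated_visiting_probability(toy_transitions, {2}, 3)
print(two_steps)    # [0.5, 0.5, 1.0]
print(three_steps)  # [0.5, 0.75, 1.0]
```

In this sketch a set of target nodes is handled the same way as a single node, which mirrors the abstract's move from distances between nodes to distances between sets of nodes. How the resulting probabilities are turned into a distance, and how the hyperlink and lexical similarity weights enter the transition matrix, are left open here because the abstract does not specify them.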

Official source

https://infoscience.epfl.ch/record/192707?ln=en




