Large-scale models for learning fixed-dimensional cross-lingual sentence representations like LASER (Artetxe and Schwenk, 2019b) lead to significant improvement in performance on downstream tasks. However, further increases and modifications based on such ...
Most Natural Language Processing (NLP) algorithms involve the use of distributed vector representations of linguistic units (primarily words and sentences), also known as embeddings, in one way or another. These embeddings come in two flavours, namely, sta ...
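To make the two flavours concrete, the following is a minimal sketch of the static variety under toy assumptions: each word maps to a single fixed vector regardless of context, and vectors are compared with cosine similarity. The lookup table `static_embeddings` and its values are invented for illustration, not taken from any pre-trained model.

```python
# Minimal sketch of "static" embeddings: one fixed vector per word type.
# The vectors below are toy values, not from a real pre-trained model.
import numpy as np

static_embeddings = {
    "bank":  np.array([0.8, 0.1, 0.3]),   # same vector in "river bank" and "bank loan"
    "river": np.array([0.7, 0.2, 0.4]),
    "money": np.array([0.1, 0.9, 0.2]),
}

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity, the standard way to compare embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# With static embeddings these similarities are the same numbers in every
# sentence; contextual embeddings would instead produce a different vector
# for "bank" depending on its surrounding words.
print(cosine(static_embeddings["bank"], static_embeddings["river"]))
print(cosine(static_embeddings["bank"], static_embeddings["money"]))
```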
Subword modeling for zero-resource languages aims to learn low-level representations of speech audio without using transcriptions or other resources from the target language (such as text corpora or pronunciation dictionaries). A good representation should ...
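As a hedged illustration of the kind of low-level, transcription-free input representation typically used in this setting, the sketch below computes frame-level MFCC features from a synthetic waveform with librosa; the tone and the parameter choices are placeholders, not a prescription from any particular zero-resource system.

```python
# Frame-level MFCC features computed directly from a waveform, with no
# transcriptions involved; the synthetic tone stands in for a real recording.
import numpy as np
import librosa

sr = 16000
t = np.linspace(0, 1.0, sr, endpoint=False)
waveform = 0.1 * np.sin(2 * np.pi * 220.0 * t)   # placeholder audio signal

# 13-dimensional MFCCs per frame: a purely acoustic, text-free representation.
mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=13)
print(mfcc.shape)   # (13, number_of_frames)
```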
The extremely high recognition accuracy achieved by modern convolutional neural network (CNN) based face recognition (FR) systems has contributed significantly to the adoption of such systems in a variety of applications, from mundane activities like unlo ...
Current state-of-the-art models for sentiment analysis make use of word order either explicitly, by pre-training on a language modeling objective, or implicitly, by using recurrent neural networks (RNNs) or convolutional networks (CNNs). This is a problem for ...
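A minimal sketch of the contrast being drawn, under toy assumptions: a representation that ignores word order (an average of word vectors) cannot separate two sentences whose words are identical but whose ordering flips the sentiment, whereas order-aware models such as RNNs or CNNs can. The vectors in `toy_vectors` are invented for illustration.

```python
# An order-insensitive average of (toy) word vectors assigns the exact same
# representation to two orderings with opposite sentiment.
import numpy as np

toy_vectors = {
    "not":    np.array([0.1, -0.3]),
    "good":   np.array([0.9,  0.8]),
    "really": np.array([0.2,  0.1]),
    "bad":    np.array([-0.8, -0.9]),
}

def bag_of_words(sentence: list[str]) -> np.ndarray:
    """Average word vectors; throws away all ordering information."""
    return np.mean([toy_vectors[w] for w in sentence], axis=0)

a = bag_of_words(["not", "good", "really", "bad"])
b = bag_of_words(["not", "bad", "really", "good"])
print(np.allclose(a, b))   # True: the two orderings are indistinguishable
```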
Different senses of source words must often be rendered by different words in the target language when performing machine translation (MT). Selecting the correct translation of a polysemous word can be done based on its context of use. However, state-of-th ...
Keyphrase extraction is the task of automatically selecting a small set of phrases that best describe a given free text document. Keyphrases can be used for indexing, searching, aggregating and summarizing text documents, serving many automatic as well as ...
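As a hedged sketch of the task itself (not of any particular system), the snippet below selects candidate keyphrases from a document under deliberately simple assumptions: candidates are stopword-free unigrams and bigrams, scored by their frequency in the document. Real extractors use richer candidate generation and scoring, such as TF-IDF or graph-based ranking.

```python
# Frequency-based keyphrase selection over unigram/bigram candidates.
from collections import Counter
import re

STOPWORDS = {"the", "of", "a", "an", "and", "for", "to", "in", "is", "that"}

def extract_keyphrases(text: str, top_k: int = 5) -> list[str]:
    tokens = re.findall(r"[a-z]+", text.lower())
    candidates = Counter()
    for n in (1, 2):                       # unigram and bigram candidates
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(w in STOPWORDS for w in gram):
                candidates[" ".join(gram)] += 1
    return [phrase for phrase, _ in candidates.most_common(top_k)]

doc = ("Keyphrase extraction selects phrases that describe a document. "
       "Extracted keyphrases support indexing and searching of documents.")
print(extract_keyphrases(doc))
```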
We propose a novel, semi-supervised approach towards domain taxonomy induction from an input vocabulary of seed terms. Unlike all previous approaches, which typically extract direct hypernym edges for terms, our approach utilizes a novel probabilistic fram ...
Machine-readable semantic knowledge in the form of taxonomies (i.e., a collection of is-a edges) has proved to be beneficial in an array of NLP tasks including inference, textual entailment, question answering and information extraction. Such widespread ut ...
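To make the "collection of is-a edges" view concrete, here is a minimal sketch that stores hypothetical taxonomy edges as a child-to-parents map and answers transitive is-a queries, the kind of lookup that supports simple inference; the edges themselves are illustrative, not drawn from any existing resource.

```python
# A taxonomy as a set of is-a edges, with a transitive ancestor lookup.
from collections import deque

is_a_edges = {
    "dog": {"canine"},
    "canine": {"mammal"},
    "mammal": {"animal"},
    "cat": {"feline"},
    "feline": {"mammal"},
}

def is_ancestor(term: str, candidate: str) -> bool:
    """Return True if `candidate` is reachable from `term` via is-a edges."""
    queue, seen = deque([term]), set()
    while queue:
        current = queue.popleft()
        for parent in is_a_edges.get(current, ()):
            if parent == candidate:
                return True
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False

print(is_ancestor("dog", "animal"))   # True: dog is-a canine is-a mammal is-a animal
print(is_ancestor("dog", "feline"))   # False
```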
The notion of similarity between texts is fundamental to many applications of Natural Language Processing. For example, this notion is particularly useful for applications designed for the management of information in large textual databases, such as ...
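One common, simplified way to operationalise such a notion of text similarity is sketched below: each text becomes a bag-of-words count vector and two texts are compared with cosine similarity. The tokenisation and the absence of term weighting are simplifying assumptions made for illustration.

```python
# Bag-of-words count vectors compared with cosine similarity.
import math
import re
from collections import Counter

def bow(text: str) -> Counter:
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

query = "management of information in textual databases"
document = "information management for large text databases"
print(round(cosine_similarity(bow(query), bow(document)), 3))
```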