Most Natural Language Processing (NLP) algorithms rely, in one way or another, on distributed vector representations of linguistic units (primarily words and sentences), also known as embeddings. These embeddings come in two flavours: static (non-contextual) and contextual. In a static embedding, the vector representation of a word is independent of its context, whereas in a contextual embedding the representation incorporates additional information from the word's surrounding context.

Recently, advances in deep learning applied to contextual embeddings have allowed them to outperform their static counterparts. However, this gain in performance over static embeddings has come at the cost of reduced computational efficiency, in terms of both computational resources and training and inference times, a relative lack of interpretability, and a higher environmental cost. Consequently, static embedding models, despite not being as expressive and powerful as contextual embedding models, remain relevant in NLP research.

In this thesis, we propose improvements to the current state-of-the-art static word embedding and sentence embedding models in three settings. First, we propose an improved algorithm for learning word and sentence embeddings from raw text, modifying the Word2Vec training objective and adding n-grams to the training procedure to incorporate local contextual information; this yields improved unsupervised static word and sentence embeddings. Our second major contribution is learning cross-lingual static word and sentence representations from parallel bilingual data, where two corpora are aligned sentence-wise. The resulting word and sentence embeddings outperform other bag-of-words bilingual embeddings on cross-lingual sentence retrieval and monolingual word similarity tasks, while staying competitive on cross-lingual word translation tasks. In our last major contribution, we harness the expressive power of contextual embedding models by distilling static word embeddings from them, so that improved word representations can be used for computationally light tasks. This lets us exploit the semantic information captured by contextual embedding models while maintaining computational efficiency at inference time.
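To make the first contribution concrete, the sketch below illustrates the general family of objectives it builds on: a Sent2Vec-style model in which a sentence embedding is the average of its unigram and hashed-bigram vectors, trained with a Word2Vec-style negative-sampling loss. This is a minimal sketch, not the thesis implementation; the vocabulary, dimensions, toy corpus, and bucket count are all illustrative, and real systems add subsampling, learning-rate decay, and a stable n-gram hash.

```python
# Minimal Sent2Vec-style sketch: a sentence vector is the mean of its
# unigram and hashed-bigram vectors, trained to predict each word from
# the rest of the sentence via negative sampling. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
word2id = {w: i for i, w in enumerate(VOCAB)}

DIM = 16
N_BUCKETS = 64  # hash buckets for bigrams (assumed size)
src = rng.normal(scale=0.1, size=(len(VOCAB) + N_BUCKETS, DIM))
tgt = np.zeros((len(VOCAB), DIM))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def features(tokens):
    """Unigram ids plus hashed bigram ids for one sentence.
    Note: Python's str hash is randomized per process; a real
    implementation would use a stable hash such as FNV."""
    ids = [word2id[t] for t in tokens]
    for a, b in zip(tokens, tokens[1:]):
        ids.append(len(VOCAB) + hash((a, b)) % N_BUCKETS)
    return ids

def train_sentence(tokens, lr=0.05, n_neg=3):
    for pos, word in enumerate(tokens):
        target = word2id[word]
        # Context = the whole sentence minus the predicted word.
        ctx = features(tokens[:pos] + tokens[pos + 1:])
        if not ctx:
            continue
        h = src[ctx].mean(axis=0)  # sentence representation
        # One positive target plus n_neg sampled negatives.
        samples = [(target, 1.0)] + [
            (int(rng.integers(len(VOCAB))), 0.0) for _ in range(n_neg)
        ]
        grad_h = np.zeros(DIM)
        for t, label in samples:
            g = sigmoid(h @ tgt[t]) - label  # logistic-loss gradient
            grad_h += g * tgt[t]
            tgt[t] -= lr * g * h
        src[ctx] -= lr * grad_h / len(ctx)

for _ in range(200):
    train_sentence(["the", "cat", "sat", "on", "the", "mat"])
    train_sentence(["the", "dog", "ran"])

# Inference: a sentence embedding is the mean of its n-gram vectors.
sent_vec = src[features(["the", "cat", "sat"])].mean(axis=0)
print(sent_vec.shape)  # (16,)
```

The same bag-of-features formulation extends naturally to the bilingual setting: for a sentence-aligned pair, each word can be predicted from the features of the aligned sentence in the other language, pulling both vocabularies into one shared space.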
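The third contribution, distillation, can likewise be sketched. One common recipe, shown below as a plausible baseline rather than the thesis's actual procedure, derives a static vector for each word by averaging its contextual vectors over all of its occurrences in a corpus; the model name and two-sentence corpus are placeholders.

```python
# Hedged sketch: distill static word vectors from a contextual model by
# averaging each word's contextual representations over its occurrences.
from collections import defaultdict
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder model
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

corpus = [  # placeholder corpus; whitespace-only tokenization assumed
    "the bank approved the loan",
    "they sat on the river bank",
]

sums = defaultdict(lambda: torch.zeros(model.config.hidden_size))
counts = defaultdict(int)

with torch.no_grad():
    for sentence in corpus:
        enc = tokenizer(sentence, return_tensors="pt")
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
        words = sentence.split()
        # word_ids() maps each subword position back to its source word;
        # special tokens ([CLS], [SEP]) map to None and are skipped.
        for pos, wid in enumerate(enc.word_ids()):
            if wid is not None:
                sums[words[wid]] += hidden[pos]
                counts[words[wid]] += 1

static = {w: sums[w] / counts[w] for w in sums}  # distilled static vectors
print(static["bank"].shape)  # torch.Size([768])
```

The averaged vectors keep a single representation per word, so downstream lookup costs match those of any static embedding table while still reflecting the contextual model's semantics.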