Publication

How Comparable are Parallel Corpora? Measuring the Distribution of General Vocabulary and Connectives

Andrei Popescu-Belis, Thomas Meyer
2011
Conference paper

Abstract

In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We compare results obtained using a general measure of lexical similarity based on χ² and by counting the number of discourse connectives. We argue that discourse connectives provide a more sensitive measure, revealing differences that are not visible with the general measure. We also provide evidence for the existence of specific characteristics defining translated texts as opposed to nontranslated ones, due to a universal tendency for explicitation.

Official source

https://infoscience.epfl.ch/record/167407?ln=en
