Concept

Statistical machine translation

Statistical machine translation (SMT) was a machine translation approach, that superseded the previous, rule-based approach because it required explicit description of each and every linguistic rule, which was costly, and which often did not generalize to other languages. Since 2003, the statistical approach itself has been gradually superseded by the deep learning-based neural network approach. The first ideas of statistical machine translation were introduced by Warren Weaver in 1949, including the ideas of applying Claude Shannon's information theory. Statistical machine translation was re-introduced in the late 1980s and early 1990s by researchers at IBM's Thomas J. Watson Research Center The idea behind statistical machine translation comes from information theory. A document is translated according to the probability distribution that a string in the target language (for example, English) is the translation of a string in the source language (for example, French). The problem of modeling the probability distribution has been approached in a number of ways. One approach which lends itself well to computer implementation is to apply Bayes Theorem, that is , where the translation model is the probability that the source string is the translation of the target string, and the language model is the probability of seeing that target language string. This decomposition is attractive as it splits the problem into two subproblems. Finding the best translation is done by picking up the one that gives the highest probability: For a rigorous implementation of this one would have to perform an exhaustive search by going through all strings in the native language. Performing the search efficiently is the work of a machine translation decoder that uses the foreign string, heuristics and other methods to limit the search space and at the same time keeping acceptable quality. This trade-off between quality and time usage can also be found in speech recognition.

Official source

https://en.wikipedia.org/wiki/Statistical_machine_translation

About this result

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Statistical machine translation

Graph Chatbot

Chat with Graph Search

Regularization Techniques for Low-Resource Machine Translation

Monotonicity Reasoning in the Age of Neural Foundation Models

BMAT: An open-source BIDS managing and analysis tool

Regularization Techniques for Low-Resource Machine Translation

BMAT: An open-source BIDS managing and analysis tool

Monotonicity Reasoning in the Age of Neural Foundation Models