**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur Graph Search.

Publication# A BAYESIAN APPROACH TO INTER-TASK FUSION FOR SPEAKER RECOGNITION

Résumé

In i-vector based speaker recognition systems, back-end classifiers are trained to factor out nuisance information and retain only the speaker identity. As a result, variabilities arising due to gender, language and accent ( among many others) are suppressed. Inter-task fusion, in which such metadata information obtained from automatic systems is used, has been shown to improve speaker recognition performance. In this paper, we explore a Bayesian approach towards inter-task fusion. Speaker similarity score for a test recording is obtained by marginalizing the posterior probability of a speaker. Gender and language probabilities for the test audio are combined with speaker posteriors to obtain a final speaker score. The proposed approach is demonstrated for speaker verification and speaker identification tasks on the NIST SRE 2008 dataset. Relative improvements of up to 10% and 8% are obtained when fusing gender and language information, respectively.

Source officielle

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Concepts associés (25)

Publications associées (44)

Bayesian probability

Bayesian probability (ˈbeɪziən or ˈbeɪʒən ) is an interpretation of the concept of probability, in which, instead of frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification of a personal belief. The Bayesian interpretation of probability can be seen as an extension of propositional logic that enables reasoning with hypotheses; that is, with propositions whose truth or falsity is unknown.

Speaker recognition

Speaker recognition is the identification of a person from characteristics of voices. It is used to answer the question "Who is speaking?" The term voice recognition can refer to speaker recognition or speech recognition. Speaker verification (also called speaker authentication) contrasts with identification, and speaker recognition differs from speaker diarisation (recognizing when the same speaker is speaking).

Inductive probability

Inductive probability attempts to give the probability of future events based on past events. It is the basis for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a source of knowledge about the world. There are three sources of knowledge: inference, communication, and deduction. Communication relays information found using other methods. Deduction establishes new facts based on existing facts. Inference establishes new facts from data. Its basis is Bayes' theorem.

In i-vector based speaker recognition systems, back-end classifiers are trained to factor out nuisance information and retain only the speaker identity. As a result, variabilities arising due to gender, language and accent ( among many others) are suppress ...

Volkan Cevher, Paul Thierry Yves Rolland, Ali Kavis

We study the fundamental problem of learning an unknown, smooth probability function via pointwise Bernoulli tests. We provide a scalable algorithm for efficiently solving this problem with rigorous guarantees. In particular, we prove the convergence rate ...

2019Ian Smith, Alain Nussbaumer, Sai Ganesh Sarvotham Pai

Accurate measurement-data interpretation leads to increased understanding of structural behavior and enhanced asset-management decision making. In this paper, four data-interpretation methodologies, residual minimization, traditional Bayesian model updatin ...

2018