This paper investigates an approach that maximizes the joint posterior probability of the pronounced word and the speaker identity given the observed data. This probability can be expressed as a product of the posterior probability of the pronounced word ...
In many signals, such as speech, bio-signals, and protein chains, there is a dependency between consecutive vectors. As the dependency is limited in duration, such data can be called Piecewise-Dependent Data (PDD). In clustering it is frequently needed to m ...
Methods to improve noise robustness of speech recognition systems often result in degradation of recognition performance for clean speech. Recently proposed Phase AutoCorrelation (PAC) \cite{ikbal03,ikbal03a} based features, showing noticeable improvement ...
This paper investigates the use of multiple pronunciations modeling for User-Customized Password Speaker Verification (UCP-SV). The main characteristic of the UCP-SV is that the system does not have any a priori knowledge about the password used by t ...
Accessing, organizing, and manipulating home videos present technical challenges due to their unrestricted content and lack of storyline. In this paper, we present a methodology to discover cluster structure in home videos, which uses video shots as the un ...
This paper presents an overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. This system is based on our recently proposed offline audio segmentation techniques. As the data arrives ...
Most commonly used criteria for speaker change detection, like log likelihood ratio (LLR) and Bayesian information criterion (BIC), have an adjustable threshold/penalty parameter to make speaker change decisions. These parameters robust to different acoustic con ...
In this paper we present a new approach towards high performance speech/music segmentation on realistic tasks related to the automatic transcription of broadcast news. In the approach presented here, the local probability density function (PDF) estimators ...
A new approach is presented for clustering the speakers from unlabeled and unsegmented conversation, when the number of speakers is unknown. In this approach, each speaker is modeled by a Self-Organizing Map (SOM). For estimation of the number of clusters ...