A vocoder (ˈvoʊkoʊdər, a portmanteau of voice and encoder) is a category of speech coding that analyzes and synthesizes the human voice signal for audio data compression, multiplexing, voice encryption or voice transformation.
The vocoder was invented in 1938 by Homer Dudley at Bell Labs as a means of synthesizing human speech. This work was developed into the channel vocoder which was used as a voice codec for telecommunications for speech coding to conserve bandwidth in transmission.
By encrypting the control signals, voice transmission can be secured against interception. Its primary use in this fashion is for secure radio communication. The advantage of this method of encryption is that none of the original signal is sent, only envelopes of the bandpass filters. The receiving unit needs to be set up in the same filter configuration to re-synthesize a version of the original signal spectrum.
The vocoder has also been used extensively as an electronic musical instrument. The decoder portion of the vocoder, called a voder, can be used independently for speech synthesis.
The human voice consists of sounds generated by the opening and closing of the glottis by the vocal cords, which produces a periodic waveform with many harmonics. This basic sound is then filtered by the nose and throat (a complicated resonant piping system) to produce differences in harmonic content (formants) in a controlled way, creating the wide variety of sounds used in speech. There is another set of sounds, known as the unvoiced and plosive sounds, which are created or modified by the mouth in different fashions.
The vocoder examines speech by measuring how its spectral characteristics change over time. This results in a series of signals representing these frequencies at any particular time as the user speaks. In simple terms, the signal is split into a number of frequency bands (the larger this number, the more accurate the analysis) and the level of signal present at each frequency band gives the instantaneous representation of the spectral energy content.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
The goal of this course is to introduce the engineering students state-of-the-art speech and audio coding techniques with an emphasis on the integration of knowledge about sound production and auditor
Mettre en pratique les bases de la programmation vues au semestre précédent. Développer un logiciel structuré. Méthode de debug d'un logiciel. Introduction à la programmation scientifique. Introductio
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database.
Synth-pop (short for synthesizer pop; also called techno-pop) is a genre of new wave and pop music that first became prominent in the late 1970s and features the synthesizer as the dominant musical instrument. It was prefigured in the 1960s and early 1970s by the use of synthesizers in progressive rock, electronic, art rock, disco, and particularly the Krautrock of bands like Kraftwerk. It arose as a distinct genre in Japan and the United Kingdom in the post-punk era as part of the new wave movement of the late 1970s to the mid-1980s.
Linear predictive coding (LPC) is a method used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital signal of speech in compressed form, using the information of a linear predictive model. LPC is the most widely used method in speech coding and speech synthesis. It is a powerful speech analysis technique, and a useful method for encoding good quality speech at a low bit rate.
Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...
Generating melody from lyrics to compose a song has been a very interesting research topic in the area of artificial intelligence and music, which tries to predict generative music relationship between lyrics and melody. In this demonstration paper, by exp ...
The respiratory system is an integral part of human speech production. As a consequence, there is a close relation between respiration and speech signal, and the produced speech signal carries breathing pattern related information. Speech can also be gener ...