Phonetics is a branch of linguistics that studies how humans produce and perceive sounds, or in the case of sign languages, the equivalent aspects of sign. Linguists who specialize in studying the physical properties of speech are phoneticians. The field of phonetics is traditionally divided into three sub-disciplines based on the research questions involved such as how humans plan and execute movements to produce speech (articulatory phonetics), how various movements affect the properties of the resulting sound (acoustic phonetics), or how humans convert sound waves to linguistic information (auditory phonetics). Traditionally, the minimal linguistic unit of phonetics is the phone—a speech sound in a language which differs from the phonological unit of phoneme; the phoneme is an abstract categorization of phones, and it is also defined as the smallest unit that discerns meaning between sounds in any given language.
Phonetics deals with two aspects of human speech: production—the ways humans make sounds—and perception—the way speech is understood. The communicative modality of a language describes the method by which a language produces and perceives languages. Languages with oral-aural modalities such as English produce speech orally (using the mouth) and perceive speech aurally (using the ears). Sign languages, such as Australian Sign Language (Auslan) and American Sign Language (ASL), have a manual-visual modality, producing speech manually (using the hands) and perceiving speech visually (using the eyes). ASL and some other sign languages have in addition a manual-manual dialect for use in tactile signing by deafblind speakers where signs are produced with the hands and perceived with the hands as well.
Language production consists of several interdependent processes which transform a non-linguistic message into a spoken or signed linguistic signal. After identifying a message to be linguistically encoded, a speaker must select the individual words—known as lexical items—to represent that message in a process called lexical selection.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
In speech science and phonetics, a formant is the broad spectral maximum that results from an acoustic resonance of the human vocal tract. In acoustics, a formant is usually defined as a broad peak, or local maximum, in the spectrum. For harmonic sounds, with this definition, the formant frequency is sometimes taken as that of the harmonic that is most augmented by a resonance. The difference between these two definitions resides in whether "formants" characterise the production mechanisms of a sound or the produced sound itself.
In articulatory phonetics, the manner of articulation is the configuration and interaction of the articulators (speech organs such as the tongue, lips, and palate) when making a speech sound. One parameter of manner is stricture, that is, how closely the speech organs approach one another. Others include those involved in the r-like sounds (taps and trills), and the sibilancy of fricatives.
In phonetics and linguistics, a phone is any distinct speech sound or gesture, regardless of whether the exact sound is critical to the meanings of words. In contrast, a phoneme is a speech sound in a given language that, if swapped with another phoneme, could change one word to another. Phones are absolute and are not specific to any language, but phonemes can be discussed only in reference to specific languages. For example, the English words kid and kit end with two distinct phonemes, /d/ and /t/, and swapping one for the other would change one word into a different word.
In the literature, the task of dysarthric speech intelligibility assessment has been approached through development of different low-level feature representations, subspace modeling, phone confidence estimation or measurement of automatic speech recognitio ...
Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...
EPFL2023
,
Voice activity detection (VAD) is an important pre-processing step for speech technology applications. The task consists of deriving segment boundaries of audio signals which contain voicing information. In recent years, it has been shown that voice source ...