Transform codingTransform coding is a type of data compression for "natural" data like audio signals or photographic s. The transformation is typically lossless (perfectly reversible) on its own but is used to enable better (more targeted) quantization, which then results in a lower quality copy of the original input (lossy compression). In transform coding, knowledge of the application is used to choose information to discard, thereby lowering its bandwidth. The remaining information can then be compressed via a variety of methods.
Modified discrete cosine transformThe modified discrete cosine transform (MDCT) is a transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid artifacts stemming from the block boundaries.
Data compression ratioData compression ratio, also known as compression power, is a measurement of the relative reduction in size of data representation produced by a data compression algorithm. It is typically expressed as the division of uncompressed size by compressed size. Data compression ratio is defined as the ratio between the uncompressed size and compressed size: Thus, a representation that compresses a file's storage size from 10 MB to 2 MB has a compression ratio of 10/2 = 5, often notated as an explicit ratio, 5:1 (read "five" to "one"), or as an implicit ratio, 5/1.
Transparency (data compression)In data compression and psychoacoustics, transparency is the result of lossy data compression accurate enough that the compressed result is perceptually indistinguishable from the uncompressed input, i.e. perceptually lossless. A transparency threshold is a given value at which transparency is reached. It is commonly used to describe compressed data bitrates. For example, the transparency threshold for MP3 to linear PCM audio is said to be between 175 and 245 kbit/s, at 44.
Shannon's source coding theoremIn information theory, Shannon's source coding theorem (or noiseless coding theorem) establishes the limits to possible data compression, and the operational meaning of the Shannon entropy. Named after Claude Shannon, the source coding theorem shows that (in the limit, as the length of a stream of independent and identically-distributed random variable (i.i.d.) data tends to infinity) it is impossible to compress the data such that the code rate (average number of bits per symbol) is less than the Shannon entropy of the source, without it being virtually certain that information will be lost.
Generation lossGeneration loss is the loss of quality between subsequent copies or transcodes of data. Anything that reduces the quality of the representation when copying, and would cause further reduction in quality on making a copy of the copy, can be considered a form of generation loss. File size increases are a common result of generation loss, as the introduction of artifacts may actually increase the entropy of the data through each generation.
Rate–distortion theoryRate–distortion theory is a major branch of information theory which provides the theoretical foundations for lossy data compression; it addresses the problem of determining the minimal number of bits per symbol, as measured by the rate R, that should be communicated over a channel, so that the source (input signal) can be approximately reconstructed at the receiver (output signal) without exceeding an expected distortion D. Rate–distortion theory gives an analytical expression for how much compression can be achieved using lossy compression methods.
Discrete wavelet transformIn numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information (location in time). Haar wavelet The first DWT was invented by Hungarian mathematician Alfréd Haar. For an input represented by a list of numbers, the Haar wavelet transform may be considered to pair up input values, storing the difference and passing the sum.
ITU-TThe International Telecommunication Union Telecommunication Standardization Sector (ITU-T) is one of the three Sectors (branches) of the International Telecommunication Union (ITU). It is responsible for coordinating standards for telecommunications and Information Communication Technology, such as X.509 for cybersecurity, Y.3172 and Y.3173 for machine learning, and H.264/MPEG-4 AVC for video compression, between its Member States, Private Sector Members, and Academia Members.
Variable bitrateVariable bitrate (VBR) is a term used in telecommunications and computing that relates to the bitrate used in sound or video encoding. As opposed to constant bitrate (CBR), VBR files vary the amount of output data per time segment. VBR allows a higher bitrate (and therefore more storage space) to be allocated to the more complex segments of media files while less space is allocated to less complex segments. The average of these rates can be calculated to produce an average bitrate for the file.