**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Concept# Rate–distortion theory

Résumé

Rate–distortion theory is a major branch of information theory which provides the theoretical foundations for lossy data compression; it addresses the problem of determining the minimal number of bits per symbol, as measured by the rate R, that should be communicated over a channel, so that the source (input signal) can be approximately reconstructed at the receiver (output signal) without exceeding an expected distortion D.
Introduction
Rate–distortion theory gives an analytical expression for how much compression can be achieved using lossy compression methods. Many of the existing audio, speech, image, and video compression techniques have transforms, quantization, and bit-rate allocation procedures that capitalize on the general shape of rate–distortion functions.
Rate–distortion theory was created by Claude Shannon in his foundational work on information theory.
In rate–distortion theory, the rate is usually understood as the number of bits per data sample to be st

Source officielle

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Publications associées

Chargement

Personnes associées

Chargement

Unités associées

Chargement

Concepts associés

Chargement

Cours associés

Chargement

Séances de cours associées

Chargement

Personnes associées (21)

Concepts associés (10)

Publications associées (100)

Compression de données

La compression de données ou codage de source est l'opération informatique consistant à transformer une suite de bits A en une suite de bits B plus courte pouvant restituer les mêmes informations, ou

Lossy compression

In information technology, lossy compression or irreversible compression is the class of data compression methods that uses inexact approximations and partial data discarding to represent the conten

Théorie de l'information

La théorie de l'information, sans précision, est le nom usuel désignant la théorie de l'information de Shannon, qui est une théorie utilisant les probabilités pour quantifier le contenu moyen en inf

Chargement

Chargement

Chargement

Unités associées (8)

Cours associés (8)

COM-404: Information theory and coding

The mathematical principles of communication that govern the compression and transmission of data and the design of efficient methods of doing so.

EE-550: Image and video processing

This course covers fundamental notions in image and video processing, as well as covers most popular tools used, such as edge detection, motion estimation, segmentation, and compression. It is composed of lectures, laboratory sessions, and mini-projects.

MICRO-512: Image processing II

Study of advanced image processing; mathematical imaging. Development of image-processing software and prototyping in JAVA; application to real-world examples in industrial vision and biomedical imaging.

With recent progress in computing, algorithmics and telecommunications, 3D models are increasingly used in various multimedia applications. Examples include visualization, gaming, entertainment and virtual reality. In the multimedia domain 3D models have been traditionally represented as polygonal meshes. This piecewise planar representation can be thought of as the analogy of bitmap images for 3D surfaces. As bitmap images, they enjoy great flexibility and are particularly well suited to describing information captured from the real world, through, for instance, scanning processes. They suffer, however, from the same shortcomings, namely limited resolution and large storage size. The compression of polygonal meshes has been a very active field of research in the last decade and rather efficient compression algorithms have been proposed in the literature that greatly mitigate the high storage costs. However, such a low level description of a 3D shape has a bounded performance. More efficient compression should be reachable through the use of higher level primitives. This idea has been explored to a great extent in the context of model based coding of visual information. In such an approach, when compressing the visual information a higher level representation (e.g., 3D model of a talking head) is obtained through analysis methods. This can be seen as an inverse projection problem. Once this task is fullled, the resulting parameters of the model are coded instead of the original information. It is believed that if the analysis module is efficient enough, the total cost of coding (in a rate distortion sense) will be greatly reduced. The relatively poor performance and high complexity of currently available analysis methods (except for specific cases where a priori knowledge about the nature of the objects is available), has refrained a large deployment of coding techniques based on such an approach. Progress in computer graphics has however changed this situation. In fact, nowadays, an increasing number of pictures, video and 3D content are generated by synthesis processing rather than coming from a capture device such as a camera or a scanner. This means that the underlying model in the synthesis stage can be used for their efficient coding without the need for a complex analysis module. In other words it would be a mistake to attempt to compress a low level description (e.g., a polygonal mesh) when a higher level one is available from the synthesis process (e.g., a parametric surface). This is, however, what is usually done in the multimedia domain, where higher level 3D model descriptions are converted to polygonal meshes, if anything by the lack of standard coded formats for the former. On a parallel but related path, the way we consume audio-visual information is changing. As opposed to recent past and a large part of today's applications, interactivity is becoming a key element in the way we consume information. In the context of interest in this dissertation, this means that when coding visual information (an image or a video for instance), previously obvious considerations such as decision on sampling parameters are not so obvious anymore. In fact, as in an interactive environment the effective display resolution can be controlled by the user through zooming, there is no clear optimal setting for the sampling period. This means that because of interactivity, the representation used to code the scene should allow the display of objects in a variety of resolutions, and ideally up to infinity. One way to resolve this problem would be by extensive over-sampling. But this approach is unrealistic and too expensive to implement in many situations. The alternative would be to use a resolution independent representation. In the realm of 3D modeling, such representations are usually available when the models are created by an artist on a computer. The scope of this dissertation is precisely the compression of 3D models in higher level forms. The direct coding in such a form should yield improved rate-distortion performance while providing a large degree of resolution independence. There has not been, so far, any major attempt to efficiently compress these representations, such as parametric surfaces. This thesis proposes a solution to overcome this gap. A variety of higher level 3D representations exist, of which parametric surfaces are a popular choice among designers. Within parametric surfaces, Non-Uniform Rational B-Splines (NURBS) enjoy great popularity as a wide range of NURBS based modeling tools are readily available. Recently, NURBS has been included in the Virtual Reality Modeling Language (VRML) and its next generation descendant eXtensible 3D (X3D). The nice properties of NURBS and their widespread use has lead us to choose them as the form we use for the coded representation. The primary goal of this dissertation is the definition of a system for coding 3D NURBS models with guaranteed distortion. The basis of the system is entropy coded differential pulse coded modulation (DPCM). In the case of NURBS, guaranteeing the distortion is not trivial, as some of its parameters (e.g., knots) have a complicated influence on the overall surface distortion. To this end, a detailed distortion analysis is performed. In particular, previously unknown relations between the distortion of knots and the resulting surface distortion are demonstrated. Compression efficiency is pursued at every stage and simple yet efficient entropy coder realizations are defined. The special case of degenerate and closed surfaces with duplicate control points is addressed and an efficient yet simple coding is proposed to compress the duplicate relationships. Encoder aspects are also analyzed. Optimal predictors are found that perform well across a wide class of models. Simplification techniques are also considered for improved compression efficiency at negligible distortion cost. Transmission over error prone channels is also considered and an error resilient extension defined. The data stream is partitioned by independently coding small groups of surfaces and inserting the necessary resynchronization markers. Simple strategies for achieving the desired level of protection are proposed. The same extension also serves the purpose of random access and on-the-fly reordering of the data stream.

Wireless sensor networks have emerged as versatile tools to monitor the evolution of physical fields over large areas by means of self-powered and low-cost sensing devices. Although their development was originally motivated by military applications, they are nowadays used in numerous civilian application areas. In many sensor network scenarios, including the ones studied in this thesis, the goal of data acquisition by means of sensor nodes is to reproduce at a base station the observed field under a fidelity constraint, using limited communication resources. Due to the scarcity of the energy resources, the physical nature of the observed fields, the constraints on spatial sampling, the high cost of inter-node communication and the reliance on a shared communication medium, wireless sensor networks raise challenging questions in the areas of multidimensional sampling, multiterminal source coding and network communications. In this thesis we address the issue of how to efficiently represent the data collected by the sensor nodes. This data has some inherent structure that is determined by the laws of physics. Since the design of efficient sampling geometries and source coding schemes requires a thorough understanding of the physical properties of the observed phenomenon, the first part of our work is devoted to setting up a mathematical model for physical fields that is amenable to the tools of information theory while applying to most physical phenomena of interest. The question regarding the sufficiency of a discrete representation of an analog field is addressed in two steps. First, we prove a multidimensional sampling theorem for homogeneous random fields with compactly supported spectral measures. While, under appropriate conditions, a discrete-space and -time representation of the analog field is sufficient, discretizing the field's amplitude, also referred to as source coding, implies an unavoidable loss of information, which may be expressed in terms of a rate distortion function. In the second step we study various source coding schemes, differing by the amount of required inter-node communication and the complexity of the involved maps. We derive lower and upper bounds for the rate distortion function of the multiterminal source coding scheme, which is of particular interest in sensor network engineering as it makes use of the spatio-temporal structure of the collected data without requiring inter-node communication. In the second part of the thesis we study some applications of sensor networks. In particular, we consider the acquisition of acoustic waves, the monitoring of temperature distributions and the compression of data sequences generated by random walks. For the setup of sound field acquisition, we show that, under the assumption of spectral whiteness of the sound field, the afore-mentioned bounds coincide, thus establishing the multiterminal rate distortion function for this setup.

In most watermarking systems, masking models, inherited from data compression algorithms, are used to preserve fidelity by controlling the perceived distortion resulting from adding the watermark to the original signal. So far, little attention has been paid to the consequences of using such models on a key design parameter: the robustness of the watermark to intentional attacks. The goal of this paper is to demonstrate that by considering fidelity alone, key information on the location and strength of the watermark may become available to an attacker; the latter can exploit such knowledge to build an effective mask attack. First, defining a theoretical framework in which analytical expressions for masking and watermarking are laid, a relation between the decrease of the detection statistic and the introduced perceptual distortion is found for the mask attack. The latter is compared to the Wiener filter attack. Then, considering masking models widely used in watermarking, experiments on both simulated and real data (audio and images) demonstrate how knowledge on the mask enables to greatly reduce the detection statistic, even for small perceptual distortion costs. The critical tradeoff between robustness and distortion is further discussed, and conclusions on the use of masking models in watermarking drawn.

2005Séances de cours associées (8)