Publication: Compression of 3D models with NURBS

Abstract

With recent progress in computing, algorithmics and telecommunications, 3D models are increasingly used in various multimedia applications. Examples include visualization, gaming, entertainment and virtual reality. In the multimedia domain, 3D models have traditionally been represented as polygonal meshes. This piecewise planar representation can be thought of as the analogue of bitmap images for 3D surfaces. Like bitmap images, polygonal meshes enjoy great flexibility and are particularly well suited to describing information captured from the real world, for instance through scanning processes. They suffer, however, from the same shortcomings, namely limited resolution and large storage size. The compression of polygonal meshes has been a very active field of research in the last decade, and rather efficient compression algorithms have been proposed in the literature that greatly mitigate the high storage costs. However, such a low level description of a 3D shape has a bounded performance. More efficient compression should be reachable through the use of higher level primitives. This idea has been explored to a great extent in the context of model based coding of visual information. In such an approach, when compressing the visual information, a higher level representation (e.g., a 3D model of a talking head) is obtained through analysis methods. This can be seen as an inverse projection problem. Once this task is fulfilled, the resulting parameters of the model are coded instead of the original information. It is believed that if the analysis module is efficient enough, the total cost of coding (in a rate distortion sense) will be greatly reduced. The relatively poor performance and high complexity of currently available analysis methods (except for specific cases where a priori knowledge about the nature of the objects is available) have prevented a large deployment of coding techniques based on such an approach. Progress in computer graphics has, however, changed this situation. 
In fact, nowadays, an increasing number of pictures, video and 3D content are generated by synthesis processing rather than coming from a capture device such as a camera or a scanner. This means that the underlying model in the synthesis stage can be used for their efficient coding without the need for a complex analysis module. In other words, it would be a mistake to attempt to compress a low level description (e.g., a polygonal mesh) when a higher level one is available from the synthesis process (e.g., a parametric surface). This is, however, what is usually done in the multimedia domain, where higher level 3D model descriptions are converted to polygonal meshes, if only for lack of standard coded formats for the former. On a parallel but related path, the way we consume audio-visual information is changing. As opposed to the recent past and a large part of today's applications, interactivity is becoming a key element in the way we consume information. In the context of interest in this dissertation, this means that when coding visual information (an image or a video for instance), previously obvious considerations such as the choice of sampling parameters are not so obvious anymore. In fact, since in an interactive environment the effective display resolution can be controlled by the user through zooming, there is no clear optimal setting for the sampling period. This means that because of interactivity, the representation used to code the scene should allow the display of objects in a variety of resolutions, and ideally up to infinity. One way to resolve this problem would be by extensive over-sampling. But this approach is unrealistic and too expensive to implement in many situations. The alternative would be to use a resolution independent representation. In the realm of 3D modeling, such representations are usually available when the models are created by an artist on a computer. 
The scope of this dissertation is precisely the compression of 3D models in higher level forms. The direct coding in such a form should yield improved rate-distortion performance while providing a large degree of resolution independence. There has not been, so far, any major attempt to efficiently compress these representations, such as parametric surfaces. This thesis proposes a solution to fill this gap. A variety of higher level 3D representations exist, of which parametric surfaces are a popular choice among designers. Within parametric surfaces, Non-Uniform Rational B-Splines (NURBS) enjoy great popularity, as a wide range of NURBS based modeling tools are readily available. Recently, NURBS have been included in the Virtual Reality Modeling Language (VRML) and its next generation descendant eXtensible 3D (X3D). The nice properties of NURBS and their widespread use have led us to choose them as the form we use for the coded representation. The primary goal of this dissertation is the definition of a system for coding 3D NURBS models with guaranteed distortion. The basis of the system is entropy coded differential pulse code modulation (DPCM). In the case of NURBS, guaranteeing the distortion is not trivial, as some of its parameters (e.g., knots) have a complicated influence on the overall surface distortion. To this end, a detailed distortion analysis is performed. In particular, previously unknown relations between the distortion of knots and the resulting surface distortion are demonstrated. Compression efficiency is pursued at every stage and simple yet efficient entropy coder realizations are defined. The special case of degenerate and closed surfaces with duplicate control points is addressed and an efficient yet simple coding is proposed to compress the duplicate relationships. Encoder aspects are also analyzed. Optimal predictors are found that perform well across a wide class of models. 
Simplification techniques are also considered for improved compression efficiency at negligible distortion cost. Transmission over error prone channels is also considered and an error resilient extension defined. The data stream is partitioned by independently coding small groups of surfaces and inserting the necessary resynchronization markers. Simple strategies for achieving the desired level of protection are proposed. The same extension also serves the purpose of random access and on-the-fly reordering of the data stream.
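The DPCM stage mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's coder: the predictor (previous reconstructed value), the uniform quantizer and the function names `dpcm_encode` and `dpcm_decode` are assumptions made for illustration, the entropy coding of the symbols is omitted, and the sketch only bounds the error on one coordinate sequence (for example control point coordinates), not the knot-related surface distortion analyzed in the dissertation.

```python
from typing import List, Sequence


def dpcm_encode(values: Sequence[float], max_error: float) -> List[int]:
    """Closed-loop DPCM: quantize each prediction residual to an integer symbol."""
    step = 2.0 * max_error        # uniform quantizer, rounding error <= step / 2
    symbols: List[int] = []
    prediction = 0.0              # assumed predictor: previous reconstructed value
    for v in values:
        q = round((v - prediction) / step)
        symbols.append(q)
        prediction += q * step    # mirror the decoder so errors do not accumulate
    return symbols


def dpcm_decode(symbols: Sequence[int], max_error: float) -> List[float]:
    """Rebuild the sequence by accumulating the dequantized residuals."""
    step = 2.0 * max_error
    reconstructed: List[float] = []
    prediction = 0.0
    for q in symbols:
        prediction += q * step
        reconstructed.append(prediction)
    return reconstructed


if __name__ == "__main__":
    coords = [0.0, 0.11, 0.19, 0.32, 0.30]   # made-up sample coordinates
    symbols = dpcm_encode(coords, max_error=0.01)
    print(symbols)
    print(dpcm_decode(symbols, max_error=0.01))
```

Because the encoder predicts from the reconstructed value rather than from the original one, the quantization error of one sample does not propagate to the next, so each decoded value stays within `max_error` of its original (up to floating point rounding). The small integer symbols are what an entropy coder would then compress.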

Related concepts (26)

Parameter

A parameter, generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element

Computer graphics

Computer graphics is the field of computer-assisted creation of digital images. This activity is related to the graphic arts. The most common courses of study go through public or private schools ...

NURBS

Non-uniform rational B-splines, more commonly known by their acronym NURBS (for Non-Uniform Rational Basis Splines), are a generalization of B-splines because ...

Related publications (12)

The thesis studies the optimization of a specific type of computer graphic representation: polygon-based, textured models. More precisely, we focus on meshes having 4-8 connectivity. We study a progressive and adaptive representation for textured 4-8 meshes suitable for transmission. Our results are valid for 4-8 meshes built from matrices of amplitudes, or given as approximations of a subdivision surface. In the latter case, the models can have arbitrary topology. In order to clarify our goals, we first describe a transmission system for computer graphics models (Chapter 1). Then, we review approximation techniques (Chapter 2) and study the computational properties of 4-8 meshes (Chapter 3). We provide an efficient method to store and access our dataset (Chapter 4). We address the problem of 4-8 mesh simplification and give an efficient θ(n log n) algorithm to compute progressive and adaptive representations of 4-8 meshes using global error (Chapter 5). We study the joint optimization of mesh and texture (Chapter 6). Finally, we conclude and give future research directions (Chapter 7).

Computer Graphics came into the medical world especially after the arrival of 3D medical imaging. Computer Graphics techniques are already integrated in the diagnosis procedure by means of the visual three-dimensional analysis of computed tomography, magnetic resonance and even ultrasound data. The representations they provide, nevertheless, are static pictures of the patient's body, lacking in functional information. We believe that the next step in computer assisted diagnosis and surgery planning depends on the development of functional 3D models of the human body. It is in this context that we propose a model of articulations based on biomechanics. Such a model can simulate the joint functionality in order to allow for a number of medical applications. It was developed focusing on the following requirements: it must be at the same time simple enough to be implemented on a computer, and realistic enough to allow for medical applications; it must be visual in order for applications to be able to explore the joint in a 3D simulation environment. We therefore propose to combine kinematical motion for the parts that can be considered as rigid, such as bones, and physical simulation of the soft tissues. We also deal with the interaction between the different elements of the joint, and for that we propose a specific contact management model. Our kinematical skeleton is based on anatomy. Special considerations have been taken to include anatomical features like axis displacements, range of motion control, and joint coupling. Once a 3D model of the skeleton is built, it can be simulated by data coming from motion capture or can be specified by a specialist, a clinician for instance. Our deformation model is an extension of the classical mass-spring systems. A spherical volume is considered around mass points, and mechanical properties of real materials can be used to parameterize the model. Viscoelasticity, anisotropy and non-linearity of the tissues are simulated. 
In particular, we proposed a method to configure the mass-spring matrix such that the objects behave according to a predefined Young's modulus. A contact management model is also proposed to deal with the geometric interactions between the elements inside the joint. After having tested several approaches, we proposed a new method for collision detection which measures in constant time the signed distance to the closest point for each point of two meshes subject to collide. We also proposed a method for collision response which acts directly on the surface geometry, in a way that the physical behavior relies on the propagation of reaction forces produced inside the tissue. Finally, we proposed a 3D model of a joint combining the three elements: anatomical skeleton motion, biomechanical soft tissue deformation, and contact management. On top of that, we built a virtual hip joint and implemented a set of medical application prototypes. Such applications allow for assessment of stress distribution on the articular surfaces, range of motion estimation based on ligament constraint, ligament elasticity estimation from clinically measured range of motion, and pre- and post-operative evaluation of stress distribution. Although our model provides physicians with a number of useful variables for diagnosis and surgery planning, it should be improved for effective clinical use. Validation has been done partially. However, a global clinical validation is necessary. Patient specific data are still difficult to obtain, especially individualized mechanical properties of tissues. The characterization of material properties in our soft tissue model can also be improved by including control over the shear modulus.

Musical and audio signals in general form a major part of the large amount of data exchange taking place in our information-based society. Transmission of high quality audio signals through narrow-band channels, such as the Internet, requires refined methods for modeling and coding sound. The first important step is the development of new analysis techniques able to discriminate between sound components according to effective perceptual criteria. Our ultimate goal is to develop an optimal representation in a psychoacoustical sense, providing minimum rate and minimum "perceptual distortion" at the same time. One of the most challenging aspects of this task is the definition of a good model for the representation of the different components of sound. Musical and speech signals contain both deterministic and stochastic components. In voiced sounds the deterministic part provides the pitch and the global timbre: it is in a sense the fundamental structure of a sound and can be easily represented by means of a very restricted set of parameters. The stochastic part contains what we might call the "life of a sound", that is an ensemble of microfluctuations with respect to an electronic-like/non-evolving sound as well as noise due to the physical excitation system. The reproduction of the latter is of fundamental importance to perceive a sound like a natural one. We faced this challenge by developing a new sound analysis/synthesis method called Fractal Additive Synthesis (FAS). The first step was the definition of a new class of wavelet transforms, namely the Harmonic-Band Wavelet Transform (HBWT). This transform is based on a cascade of Modified Discrete Cosine Transform (MDCT) and Wavelet Transforms (WT). By means of the HBWT, we are able to separate the stochastic from the deterministic components of sound and to treat them separately. The second step was the definition of a model for the stochastic components. 
The spectra of voiced musical sounds have non-zero energy in the sidebands of the spectral peaks. These sidebands contain information relative to the stochastic components. The effect of these components is that the waveform of what we call a pseudo-periodic signal, i.e. the stationary part of voiced sounds, changes slightly from period to period. Our work is based on the experimentally verified assumption that the energy distribution of a sideband of a voiced sound spectrum is mostly shaped like powers of the inverse of the distance from the closest partial. The power spectrum of these pseudo-periodic processes is then modeled by means of a superposition of modulated 1/f components, i.e., by means of what we define as a pseudo-periodic 1/f-like process. The time-scale character of the wavelet transform is well adapted to the self-similar behavior of 1/f processes. The wavelet analysis of 1/f noise yields a set of very loosely correlated coefficients that, to a first approximation, can be well modeled by white noise in the synthesis. The fractal properties of the 1/f noise also motivated our choice of the name Fractal Additive Synthesis. The next step was the definition of a model for the deterministic components of voiced sounds, consistent with the HBWT analysis/synthesis method. The model is in some respects inspired by the sinusoidal models. The two models provide a complete method for the analysis and resynthesis of voiced sounds in the perspective of structured audio (SA) sound representations. For the stationary part of voiced sounds, compression ratios in the range of 10-15:1 are easily achievable. Even better results in terms of data compression can be obtained by taking psychoacoustic criteria into consideration. A psychoacoustic based selection of perceptually relevant parameters was implemented and tested. Compression ratios of 20-30:1, depending on the musical instrument, were achieved. 
An extension of the method based on a pitch-synchronous version of the HBWT with perfect reconstruction time-varying cosine-modulated filter banks was also studied. This makes the method able to handle, for instance, the slight pitch deviations or the vibrato of a musical tone or more relevant changes of pitch as in a glissando. Finally, the method has been successfully extended to non-harmonic sounds by the introduction and definition of an optimization procedure for the design of non-perfect reconstruction cosine-modulated filter banks with inharmonic band subdivisions. These extensions make FAS more flexible and suitable to analyze, encode, process and resynthesize a large class of musical sounds. The final result of this work is the development of a method for modeling in a flexible way both the stochastic and the deterministic parts of sounds at a very refined perceptual level and with a minimum amount of parameters controlling the synthesis process. In the context of SA the method provides a sound analysis/synthesis tool able to encode and to resynthesize sounds at low rate, while maintaining their natural timbre dynamics for high quality reproduction.