
Publication

# Towards image denoising in the latent space of learning-based compression

Abstract

In recent years, learning-based image compression has demonstrated similar or superior performance compared to conventional approaches in terms of compression efficiency and visual quality. Typically, learning-based image compression takes advantage of autoencoders, which are architectures consisting of two main parts: a multi-layer neural network encoder and its dual decoder. The encoder maps the input image, represented in the pixel domain, to a compact representation, also known as the latent space. The decoder, in turn, reconstructs the original image in the pixel domain from its latent representation as accurately as possible. Traditionally, image processing algorithms, and in particular image denoising, are applied to images in the pixel domain before compression, and in some cases as a post-processing stage after decompression. In this context, combining denoising operations with the autoencoder might reduce the computational cost while achieving similar accuracy. In this paper, the idea of combining the image denoising task with compression is examined. In particular, the integration of denoising convolutional layers in the decoder of a learning-based compression network is investigated. Results show that, while the rate-distortion performance of the method is slightly reduced, a gain in computational complexity can be achieved.
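The encoder/decoder split and the idea of folding denoising into the decoder can be sketched with toy stand-ins. This is not the paper's network: a 2x2 average pooling plays the encoder, nearest-neighbour upsampling plays the decoder, and a hypothetical 3x3 smoothing convolution stands in for the learned denoising layers applied to the latent representation:

```python
import numpy as np

def conv2d_valid(x, k):
    # plain "valid" 2-D correlation, used below as the denoising layer
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def encoder(img):
    # toy analysis transform: 2x2 average pooling stands in for the
    # strided convolutions of a learned encoder
    H, W = img.shape
    return img.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

def decoder(latent, denoise=True):
    if denoise:
        # hypothetical denoising stage folded into the decoder: a 3x3
        # smoothing convolution applied to the latent representation
        kernel = np.ones((3, 3)) / 9.0
        latent = conv2d_valid(np.pad(latent, 1, mode="edge"), kernel)
    # toy synthesis transform: nearest-neighbour upsampling
    return np.kron(latent, np.ones((2, 2)))

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))   # smooth test image
noisy = clean + 0.1 * rng.standard_normal((16, 16))   # additive Gaussian noise

mse = lambda a, b: float(np.mean((a - b) ** 2))
rec_plain = decoder(encoder(noisy), denoise=False)
rec_denoised = decoder(encoder(noisy), denoise=True)
print(mse(clean, rec_plain), mse(clean, rec_denoised))
```

On this smooth test image, the reconstruction through the denoising path comes out closer to the clean signal, since the smoothing in the latent space costs no extra pass in the pixel domain.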


Related concepts (14)

Data compression

Data compression, or source coding, is the computing operation of transforming a bit sequence A into a shorter bit sequence B that can restore the same information, or …
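As a minimal illustration of the definition above, a run-length coder losslessly maps a bit string A to a shorter description B and back (a toy scheme, not a practical compressor):

```python
def rle_encode(bits):
    # run-length encoding: each run of identical symbols in A becomes
    # a (symbol, run length) pair in the shorter description B
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs

def rle_decode(runs):
    # decompression restores exactly the same information (lossless)
    return "".join(symbol * count for symbol, count in runs)

a = "0000000011110000"
b = rle_encode(a)
assert rle_decode(b) == a   # B restores A exactly
```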

Systems architecture

The architecture of a system is a conceptual model of a system that describes its external and internal properties and the way they map onto its elements, their relations, and the princ…

Digital image processing

Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing.

Related publications (3)

Efficient representation of geometrical information in images is very important in many image processing areas, including compression, denoising and feature extraction. However, the design of transforms that can capture these geometrical features and represent them with a sparse description is very challenging. Recently, the separable wavelet transform achieved great success, providing a computationally simple tool and allowing for a sparse representation of images. However, in spite of this success, the efficiency of the representation is limited by the spatial isotropy of the wavelet basis functions built in the horizontal and vertical directions, as well as by the lack of directionality. One-dimensional discontinuities in images (edges and contours), which are very important elements in visual perception, intersect too many wavelet basis functions, leading to a non-sparse representation. To efficiently capture these anisotropic geometrical structures, characterized by many more directions than the horizontal and vertical ones, more flexible multi-directional and anisotropic transforms are required. We present a new lattice-based, perfect-reconstruction and critically sampled anisotropic multi-directional wavelet transform. The transform retains the separable filtering, subsampling and simplicity of computation and filter design of the standard two-dimensional wavelet transform, unlike some other existing directional transform constructions (e.g., curvelets, contourlets or edgelets). The corresponding anisotropic basis functions, which we call directionlets, have directional vanishing moments along any two directions with rational slopes. Furthermore, we show that this novel transform provides an efficient tool for non-linear approximation of images, achieving a mean-square error decay of O(N^-1.55) which, while slower than the optimal rate O(N^-2), is much better than the O(N^-1) achieved with wavelets, at similar complexity.

Owing to critical sampling, directionlets can easily be applied to image compression, since it is possible to use Lagrange optimization, as opposed to the case of overcomplete expansions. Compression algorithms based on directionlets outperform methods based on the standard wavelet transform, achieving better numerical results and visual quality of the reconstructed images. Moreover, we have adapted image denoising algorithms to be used in conjunction with an undecimated version of directionlets, obtaining results that are competitive with current state-of-the-art image denoising methods while having lower computational complexity.
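The problem the abstract describes can be reproduced numerically: one level of the separable 2-D Haar transform yields few significant detail coefficients for an axis-aligned edge but many more for a diagonal one. This is only an illustrative sketch (Haar and the test images are arbitrary choices, not the directionlets construction itself):

```python
import numpy as np

def haar2d(x):
    # one level of the separable 2-D Haar transform:
    # filter and subsample along columns, then along rows
    def step(a):
        lo = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
        hi = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
        return lo, hi
    lo, hi = step(x)
    ll, lh = step(lo.T)
    hl, hh = step(hi.T)
    return ll.T, lh.T, hl.T, hh.T

def significant(img, tol=1e-6):
    # number of non-negligible detail coefficients
    _, lh, hl, hh = haar2d(img)
    return int(sum(np.sum(np.abs(band) > tol) for band in (lh, hl, hh)))

n = 32
rows, cols = np.indices((n, n))
horizontal = (rows <= n // 2).astype(float)  # edge aligned with the basis
diagonal = (rows < cols).astype(float)       # edge at 45 degrees

# the diagonal edge intersects far more basis functions
print(significant(horizontal), significant(diagonal))
```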

Vladan Velisavljevic, Martin Vetterli

The standard separable 2-D wavelet transform (WT) has recently achieved great success in image processing because it provides a sparse representation of smooth images. However, it fails to efficiently capture 1-D discontinuities, like edges or contours. These features, being elongated and characterized by geometrical regularity along different directions, intersect many basis functions and generate many large-magnitude wavelet coefficients. Since contours are very important elements in the visual perception of images, preserving a good reconstruction of these directional features is fundamental to the visual quality of compressed images. In our previous work, we proposed a construction of critically sampled perfect-reconstruction transforms, called directionlets, with directional vanishing moments imposed in the corresponding basis functions along different directions. In this paper, we show how to design and implement a novel, efficient space-frequency quantization (SFQ) compression algorithm using directionlets. Our new compression method outperforms the standard SFQ in a rate-distortion sense, both in terms of mean-square error and visual quality, especially in the low-rate compression regime. We also show that our compression method does not increase the order of computational complexity compared to the standard SFQ algorithm.
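The keep-or-zero flavor of space-frequency quantization can be caricatured with a per-coefficient Lagrangian test: keep a quantized coefficient only when the distortion it saves exceeds λ times its rate cost. The rate model and data below are invented for illustration and are not the SFQ algorithm itself:

```python
import numpy as np

def sfq_like(coeffs, step, lam):
    # toy Lagrangian keep-or-zero decision per coefficient, i.e. an
    # approximate minimization of D + lam * R
    q = np.round(coeffs / step)
    rate = np.log2(np.abs(q) + 1) + 1          # hypothetical rate model
    d_keep = (coeffs - q * step) ** 2          # distortion if kept
    d_zero = coeffs ** 2                       # distortion if zeroed
    keep = (d_zero - d_keep) > lam * rate
    q[~keep] = 0
    return q * step, float(np.sum(rate[keep]))

rng = np.random.default_rng(1)
c = rng.laplace(scale=2.0, size=1000)          # wavelet-like sparse data
for lam in (0.1, 1.0, 10.0):
    rec, bits = sfq_like(c, step=0.5, lam=lam)
    print(lam, round(bits), float(np.mean((c - rec) ** 2)))
```

Sweeping λ trades rate for distortion: a larger λ zeroes more coefficients, spending fewer bits at the price of a higher mean-square error.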

2007

Digital images are becoming increasingly successful thanks to the development of, and facilitated access to, systems permitting their generation (e.g., cameras, scanners, imaging software). A digital image basically corresponds to a 2-D discrete set of regularly spaced samples, called pixels, where each pixel contains the light intensity information (e.g., luminance, chrominance) of a very localized spatial region of the image. In the case of natural images, pixel values are acquired through one or several arrays of MOS semiconductors (Charge-Coupled Devices, CCDs), each generating an electrical signal proportional to the incoming light intensity. The initial purposes of digital images were storage on a dedicated medium (e.g., a camera's memory, a computer's hard drive, a CD-ROM), occasional transmission, and final display on a screen or in print. With such a narrow scope, the principal goal of image processing and coding tools was to cope with storage and transmission bandwidth limitations through efficient compression algorithms that reduce the size of the image representation. However, with recent developments in the computing, algorithmic and telecommunication domains, many new applications (e.g., web publishing, remote browsing) have arisen. They generally require additional and enhanced features (e.g., progressive decoding, random access, region-of-interest support, robustness to transmission errors) and have motivated the creation of a new generation of coding algorithms which, besides their good compression performance, offer many other useful features. Hence, digital images are almost never represented as a simple set of pixel values (i.e., a raw representation) but, instead, in a specific compact form (i.e., a compressed or coded representation), chosen according to the features it brings to the considered application.
A compressed version of an image is obtained by removing as much spatial, visual and statistical redundancy as possible, using appropriate coding methods, while keeping an acceptable visual quality. Noting that natural images have most of their energy concentrated in low-frequency components, recent coding algorithms generally first decompose the image into a specific frequency domain (DCT, DWT, etc.). The goal is to obtain a representation where few coefficients are sufficient to reconstruct the image with good quality. The precision of the transformed coefficients is then generally reduced by quantization, in order to make them more compressible by an entropy coder, which aims at removing the statistical redundancy of the quantization indexes. The final compressed representation, called a codestream, is usually obtained by a rate-allocation process that tries to achieve the best trade-off between the compression ratio and the reconstructed image quality. JPEG 2000, the new still image coding standard developed by the Joint Photographic Experts Group (JPEG), is based on these state-of-the-art compression techniques, but is also designed to fulfill many requirements of recent applications. It only normalizes the decoding algorithm and consequently leaves some freedom for designing encoders optimized for specific features, or for building extensions that take advantage of the specifics of its compressed domain. The success of the JPEG 2000 standard will depend not only on its intrinsic performance, but also on its ability to comply with the specific demands of current and future imaging applications.
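The decompose/quantize/entropy-code chain described above can be sketched end to end. An 8-point DCT is used here for brevity (JPEG 2000 itself uses a wavelet transform), and a first-order entropy estimate stands in for a real entropy coder:

```python
import numpy as np

# orthonormal 8-point DCT-II matrix: the frequency-domain decomposition
N = 8
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
D = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
D[0] /= np.sqrt(2.0)

def entropy_bits(symbols):
    # first-order entropy: a lower bound on an entropy coder's output
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)) * symbols.size)

x = np.outer(np.linspace(0, 1, N), np.linspace(0, 1, N)) * 255  # smooth block
coeffs = D @ x @ D.T          # 1. decompose into the frequency domain
q = np.round(coeffs / 16)     # 2. quantize (step 16): mostly zeros
rec = D.T @ (q * 16) @ D      # dequantize + inverse transform
print("bits ~", entropy_bits(q.astype(int).ravel()),
      "mse:", float(np.mean((x - rec) ** 2)))
```

On this smooth block, the transform concentrates the energy in a handful of low-frequency coefficients, so quantization leaves many zeros for the entropy coder to exploit, at a bounded reconstruction error.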
Thus, among the major concerns of image content providers are the security of the transmission over networks and of the image itself: with current facilities to instantly access on-line digital libraries from anywhere in the world, to perfectly copy and to easily modify the content of an image, solutions must be found to permit, on the one hand, Intellectual Property Rights (IPR) protection and image integrity verification and, on the other hand, the development of dedicated tools favoring the exchange and purchase of images over communication networks. It is worth pointing out that some cryptography-based solutions to these problems already exist. However, they need to be adapted to the digital image and its compressed-domain representation, in order to take full advantage of their specifics and to avoid restricting the fields of application. The JPEG 2000 coding algorithm is mainly based on the Discrete Wavelet Transform (DWT), embedded scalar quantization and adaptive arithmetic coding. From a terminology point of view, this means that the compressed-domain representation can indifferently refer to wavelet coefficients, quantized wavelet coefficients, bit streams (i.e., entropy-coded groups of quantization indexes) or the codestream (i.e., the aggregation of bit streams and headers containing the necessary decoding information). The choice of the appropriate compressed domain actually depends on the considered application. The quality of a JPEG 2000 image, at a given compression ratio, mainly depends on the rate-allocation procedure used at the encoder side. Such a procedure operates on entropy-coded quantization indexes and favors the groups of quantization indexes (i.e., code-blocks) offering the best rate-distortion trade-offs. However, these do not necessarily correspond to the most interesting parts of the image from an end-observer's point of view.
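The rate-allocation procedure can be sketched with a toy Lagrangian allocator over hypothetical code-block truncation points; the (rate, distortion) numbers below are invented, whereas a real encoder derives them from the embedded bit streams:

```python
# toy post-compression rate-distortion optimization in the spirit of a
# JPEG 2000 encoder: every code-block offers a few truncation points
# (rate, distortion) and, for a given lambda, independently picks the
# point minimizing D + lambda * R; sweeping lambda traces the trade-off
def allocate(blocks, lam):
    total_rate, total_dist = 0.0, 0.0
    for points in blocks:
        r, d = min(points, key=lambda p: p[1] + lam * p[0])
        total_rate += r
        total_dist += d
    return total_rate, total_dist

# hypothetical (rate in bits, distortion) truncation points per code-block
blocks = [
    [(0, 100.0), (50, 40.0), (120, 10.0), (300, 1.0)],
    [(0, 20.0), (30, 8.0), (90, 2.0)],
    [(0, 400.0), (80, 120.0), (200, 15.0)],
]
for lam in (0.01, 0.5, 5.0):
    print(lam, allocate(blocks, lam))
```

A small λ buys quality with many bits; a large λ truncates aggressively. Note that the criterion is purely numerical: nothing makes the surviving bits coincide with the regions an end-observer cares about, which is precisely the motivation for ROI coding.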
Hence, the standard provides ways to define Regions Of Interest (ROI), which are prioritized during the encoding process so as to exhibit a higher quality than the rest of the image (i.e., the background) at any decoding time. This feature applies either in the quantized wavelet domain or at the bit-stream level, but its parameters are generally not very explicit for a standard end-user and only provide rough control of the decoded ROI quality. Consequently, the first objective of this thesis is to create, and also extend, compressed-domain tools for controlling the quality of a JPEG 2000 ROI. In the meantime, several image processing techniques, such as watermarking, are applied directly in the spatial domain or in a specific transform space defined from the spatial domain. However, since digital images are preferably available in a compressed/encoded format, which we assume to be JPEG 2000 in this thesis, applying them first implies decompressing the image, then applying the considered processing task and finally re-encoding the resulting image. Such a scheme has two main drawbacks. First, it generally implies time and complexity overheads compared to equivalent methods (if they exist) in the JPEG 2000 compressed domain; these overheads become important when the scheme is repeatedly applied to multiple images. Second, since encoding and decoding operations are generally lossy, the introduced distortion can become non-negligible whenever several processing tasks are repeated on the same image. These observations lead to the second objective of this thesis, which is to adapt or create a watermarking algorithm dedicated to the JPEG 2000 compressed domain.
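The ROI prioritization can be illustrated with a minimal sketch in the spirit of the standard's max-shift method, assuming integer coefficient magnitudes and simple bitplane truncation (real ROI coding in JPEG 2000 is more involved): scaling ROI coefficients by 2**shift before bitplane coding ensures their bitplanes are decoded first at any truncation depth.

```python
import numpy as np

rng = np.random.default_rng(0)
coeffs = rng.integers(0, 256, size=64)   # coefficient magnitudes (< 2**8)
roi = np.zeros(64, dtype=bool)
roi[:16] = True                          # first 16 samples form the ROI

shift = 8
scaled = coeffs.astype(np.int64) << (shift * roi)   # ROI shifted up

def truncate(x, keep_planes, total_planes=16):
    # keep only the 'keep_planes' most significant bitplanes,
    # mimicking a truncated embedded codestream
    drop = total_planes - keep_planes
    return (x >> drop) << drop

decoded = truncate(scaled, keep_planes=8) >> (shift * roi)  # undo the shift
err = (coeffs - decoded) ** 2
print("ROI mse:", float(err[roi].mean()),
      "background mse:", float(err[~roi].mean()))
```

With half the bitplanes decoded, the ROI is reconstructed exactly while the background is still entirely missing, which is the prioritization behavior the standard's ROI mechanism aims for.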
Finally, there are imaging algorithms, such as authentication and access control, that already exist and could be directly applied to JPEG 2000 images but, because they do not take into account the specifics of the coding algorithm, they either decrease compression performance or remove useful features of the encoded representation (scalability, random access, etc.). This leads to the third objective of this thesis, which is to adapt, create and combine authentication and access control algorithms with JPEG 2000 coding and decoding. Thus, the common goal of the three objectives described above is the deep integration of selected processing and security algorithms into a JPEG 2000 codec, in order to provide, with minimum complexity, JPEG 2000-compliant codestreams and a unified framework for many imaging applications.