**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Publication# Successive Refinement to Caching for Dynamic Content

Résumé

To reduce the network load during peak hours, servers deliver partial data to users during the off-peak time of the network before the actual requests are known, which is known as caching. This paper studies a single user caching problem in which the file contents are subject to dynamic modifications with respect to a certain probability distribution. To cope with the dynamical nature of the file contents, a successive refinement approach to caching is presented: partial information of the original data is cached first and then if there is a modification, a refinement to the previously cached data is delivered to the user. Given a fixed cache memory, there is a tension between the rates of two cache descriptions. The problem of optimal caching strategies is formulated through a successive Gray-Wyner network, the optimal rate region of which is characterized. Some lower and upper bounds on the performance of optimal caching strategies are developed and shown to actually yield closed form solutions for certain classes of file contents.

Official source

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Concepts associés

Chargement

Publications associées

Chargement

Concepts associés (13)

Mémoire cache

Une mémoire cache ou antémémoire est, en informatique, une mémoire qui enregistre temporairement des copies de données provenant d'une source, afin de diminuer le temps d'un accès ultérieur (en lect

Cache de processeur

Un cache de processeur est une antémémoire matérielle utilisée par l'unité centrale de traitement (CPU) d'un ordinateur pour réduire le coût moyen (temps ou énergie) de l’accès aux données de la mémo

Request for comments

vignette|Steve Crocker, auteur de la , titrée « Logiciel hôte ».
Les Requests for comments (RFC), littéralement « demandes de commentaires », sont une série numérotée de documents décrivant les aspec

Publications associées (23)

Chargement

Chargement

Chargement

This thesis looks at efficient information processing for two network applications: content delivery with caching and collecting summary statistics in wireless sensor networks. Both applications are studied under the same paradigm: function computation over networks, where distributed source nodes cooperatively communicate some functions of individual observations to one or multiple destinations. One approach that always works is to convey all observations and then let the destinations compute the desired functions by themselves. However, if the available communication resources are limited, then revealing less unwanted information becomes critical. Centered on this goal, this thesis develops new coding schemes using information-theoretic tools.
The first part of this thesis focuses on content delivery with caching. Caching is a technique that facilitates reallocation of communication resources in order to avoid network congestion during peak-traffic times. An information-theoretic model, termed sequential coding for computing, is proposed to analyze the potential gains offered by the caching technique. For the single-user case, the proposed framework succeeds in verifying the optimality of some simple caching strategies and in providing guidance towards optimal caching strategies. For the two-user case, five representative subproblems are considered, which draw connections with classic source coding problems including the Gray-Wyner system, successive refinement, and the Kaspi/Heegard-Berger problem. Afterwards, the problem of distributed computing with successive refinement is considered. It is shown that if full data recovery is required in the second stage of successive refinement, then any information acquired in the first stage will be useful later in the second stage.
The second part of this thesis looks at the collection of summary statistics in wireless sensor networks. Summary statistics include arithmetic mean, median, standard deviation, etc, and they belong to the class of symmetric functions. This thesis develops arithmetic computation coding in order to efficiently perform in-network computation for weighted arithmetic sums and symmetric functions. The developed arithmetic computation coding increases the achievable computation rate from $\Theta((\log L)/L)$ to $\Theta(1/\log L)$, where $L$ is the number of sensors. Finally, this thesis demonstrates that interaction among sensors is beneficial for computation of type-threshold functions, e.g., the maximum and the indicator function, and that a non-vanishing computation rate is achievable.

This thesis is devoted to information-theoretic aspects of community detection. The importance of community detection is due to the massive amount of scientific data today that describes relationships between items from a network, e.g., a social network. Items from this network can be inherently partitioned into a known number of communities, but the partition can only be inferred from the data.
To estimate the underlying partition, data scientists can apply any type of advanced statistical techniques; but the data could be very noisy, or the number of data is inadequate. A fundamental question here is about the possibility of weak recovery: does the data contain a sufficient amount of information that enables us to produce a non-trivial estimate of the partition?
For the purpose of mathematical analysis, the above problem can be formulated as Bayesian inference on generative models. These models, including the stochastic block model (SBM) and censored block model (CBM), consider a random graph generated based on a hidden partition that divides the nodes in the graph into labelled groups. In the SBM, nodes are connected with a probability depending on the labels of the endpoints. Whereas, in the CBM, hidden variables are measured through a noisy channel, and the measurement outcomes form a weighted graph. In both models, inference is the task of recovering the hidden partition from the observed graph. The criteria for weak recovery can be studied via an information-theoretic quantity called mutual information. Once the asymptotic mutual information is computed, phase transitions for the weak recovery can be located.
This thesis pertains to rigorous derivations of single-letter variational expressions for the asymptotic mutual information for models in community detection. These variational expressions, known as the replica predictions, come from heuristic methods of statistical physics. We present our development of new rigorous methods for confirming the replica predictions. These methods are based on extending the recently introduced adaptive interpolation method.
We prove the replica prediction for the SBM in the dense-graph regime with two groups of asymmetric size. The existing proofs in the literature are indirect, as they involve mapping the model to an external problem whose mutual information is determined by a combination of methods. Here, on the contrary, we provide a self-contained and direct proof.
Next, we extend this method to sparse models. Before this thesis, adaptive interpolation was known for providing a conceptually simple proof for replica predictions for dense graphs. Whereas, for a sparse graph, the replica prediction involves a more complicated variational expression, and rigorous confirmations are often lacking or obtained by rather complicated methods. Therefore, we focus on a simple version of CBM on sparse graphs, where hidden variables are measured through a binary erasure channel, for which we fully prove the replica prediction by the adaptive interpolation.
The key for extending the adaptive interpolation to a broader class of sparse models is a concentration result for the so-called "multi-overlaps". This concentration forms the basis of the replica "symmetric" prediction. We prove this concentration result for a related sparse model in the context of physics. This provides inspiration for further development of the adaptive interpolation.

Michael Christoph Gastpar, Erixhen Sula

In the problem of coded caching for media delivery, two separate coding opportunities have been identified. The first opportunity is a multi-user advantage and crucially hinges on a public broadcast link in the delivery phase. This has been explored in a plethora of works. The second opportunity has received far less attention and concerns similarities between files in the database. Here, the paradigm is to cache "the similarity" between the files. Upon the request, the encoder refines this by providing the specific details for the requested files. Extending Gray and Wyner's work (1974), it follows that the right measure of file similarity is Wyner's Common Information and its generalizations. The present paper surveys and extends the role of Wyner's Common Information in caching. As a novel result, explicit solutions are found for the Gaussian case under mean-squared error, both for the caching problem as well as for the network considered by Gray and Wyner. Our solution leverages and extends the recent technique of factorization of convex envelopes.