Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Digital media have become an integral part of modern lives. Whether surfing the web, making a wireless phone call, watching satellite TV, or listening to digital music, a large part of our professional and leisure time is filled with all things digital. The replacement of analog media by their digital counterparts and the explosion of Internet use has had a perhaps unintended consequence. Whereas analog media were previously replaced by digital media mostly only to preserve quality, the existence of high speed computer networks makes digital media available to potentially anyone, anywhere, and at any time. This possibility is the basis for modern scientific and economic developments centered around the distribution of digital data to a worldwide audience. The success of web sites like Apple's iTunes store or YouTube is rooted in the marriage of digital data and the Internet. Reliable transport of digital media to heterogeneous clients becomes thus a central and at time critical issue. Receivers can be anywhere and they tray be connected to networks with widely differing fidelities. In this paper we will give a soft introduction into a new method for solving the data distribution problem. We take four fundamental data transmission problems as examples: delivery of data from one sender to one receiver over a long distance, delivery of data from one sender to multiple receivers, delivery of the same data from multiple senders to one receiver, and finally, delivery of data from many senders to many receivers. Examples of such data transmission scenarios are abundant: the first one is encountered whenever a large piece of data is downloaded from a distant location; satellite data distribution, or distribution of data to mobile receivers is a prune example of the second scenario. The application space for the third example is emerging, and includes scenarios like disaster recovery: data is replicated across multiple servers and accessed simultaneously from these servers. A prime example for the fourth scenario is the popular peer-to-peer data distribution. We argue that current data transmission protocols are not adequate to solve these data distribution problems, and hence lack the ability to solve some of today's and many of tomorrow's data delivery problems. This is because these transmission protocols were designed at a time when the Internet was still in its infancy, and the problem of bulk data distribution was not high on the agenda,. We then introduce fountain codes and show brow they can be used to solve all of these data transmission problems at the same time. For a given piece of content, a fountain code produces a potentially limitless stream of data such that any subset of this data of size essentially equal to the original content is sufficient to recover the original data. Just like the case of filling a glass of water under a fountain where it does not matter which particular drops fill the glass, with a fountain code it does not matter which particular pieces of output data are received, as long as their cumulative size is right. We introduce a very simple, but inefficient, fountain code and refine it to LT-codes, the first class of efficient fountain codes, and then to Raptor codes, the state-of-the-art in this field. We discuss tools that allow us to design these fountains, and analyze their performance. We also briefly discuss Raptor codes that are standardized for various data transmission scenarios.
Alcherio Martinoli, Chiara Ercolani, Lixuan Tang, Ankita Arun Humne