Current online applications, such as search engines, social networks, or file sharing services, execute across a distributed network of machines. They provide non-stop services to their users despite failures in the underlying network. To achieve such a hi ...
EPFL2019
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Large efforts have been devoted in the past to understanding size effect in shear failures of members without transverse reinforcement. Experimental works have demonstrated that increasing the size reduces the nominal shear strength provided that the failu ...
We propose a novel, minimally intrusive approach to adding fault tolerance to existing complex scientific simulation codes, used for addressing a broad range of time-dependent problems on the next generation of supercomputers. Exascale systems have the pot ...
Distributed skyline computation is important for a wide range of domains, from distributed and web-based systems to ISP-network monitoring and distributed databases. The problem is particularly challenging in dynamic distributed settings, where the goal is ...
Clock-SI is a fully distributed protocol that implements snapshot isolation (SI) for partitioned data stores. It derives snapshot and commit timestamps from loosely synchronized clocks, rather than from a centralized timestamp authority as used in current ...
Modern data storage systems are extremely large and consist of several tens or hundreds of nodes. In such systems, node failures are daily events, and safeguarding data from them poses a serious design challenge. The focus of this thesis is on the data rel ...
State-machine replication (SMR) is a software technique for tolerating failures and for providing high availability in large-scale systems, through the use of commodity hardware. A replicated state-machine comprises a number of replicas, each of which runs ...
Many atomic broadcast algorithms have been published in the last 20 years. Token-based algorithms represent a large class of these algorithms. Interestingly, all the token-based atomic broadcast algorithms rely on a group membership service and none of the ...
The use of transactions in distributed systems dates back to the 70's. The last decade has also seen the proliferation of transactional systems. In the existing transactional systems, many protocols employ a centralized approach in executing a distributed ...
Huge scale machine learning problems are nowadays tackled by distributed optimization algorithms, i.e. algorithms that leverage the compute power of many devices for training. The communication overhead is a key bottleneck that hinders perfect scalability. ...