Related publications (41)

Reliable Microsecond-Scale Distributed Computing

Athanasios Xygkis

The landscape of computing is changing, thanks to the advent of modern networking equipment that allows machines to exchange information in as little as one microsecond. Such advancement has enabled microsecond-scale distributed computing, where entire dis ...
EPFL, 2023

Controlled physics-informed data generation for deep learning-based remaining useful life prediction under unseen operation conditions

Olga Fink, Jian Zhou

Limited availability of representative time-to-failure (TTF) trajectories either limits the performance of deep learning (DL)-based approaches on remaining useful life (RUL) prediction in practice or even precludes their application. Generating synthetic d ...
2023

Recommendation on Live-Streaming Platforms: Dynamic Availability and Repeat Consumption

Karl Aberer, Jérémie Rappaz

Live-streaming platforms broadcast user-generated video in real-time. Recommendation on these platforms shares similarities with traditional settings, such as a large volume of heterogeneous content and highly skewed interaction distributions. However, sev ...
ASSOC COMPUTING MACHINERY, 2021

Efficient Protocols for Enforcing Causal Consistency in Geo-Replicated Key-Value Data Stores

Kristina Spirovska

Modern large-scale data platforms manage colossal amounts of data, generated by the ever-increasing number of concurrent users. Geo-replicated and sharded key-value data stores play a central role when building such platforms. As the strongest consistency m ...
EPFL, 2020

Fast General Distributed Transactions with Opacity

Aleksandar Dragojevic, Stanko Novakovic, Georgios Chatzopoulos

Transactions can simplify distributed applications by hiding data distribution, concurrency, and failures from the application developer. Ideally the developer would see the abstraction of a single large machine that runs transactions sequentially and neve ...
ASSOC COMPUTING MACHINERY, 2019

A Minimally Intrusive Low-Memory Approach to Resilience for Existing Transient Solvers

Allan Svejstrup Nielsen

We propose a novel, minimally intrusive approach to adding fault tolerance to existing complex scientific simulation codes, used for addressing a broad range of time-dependent problems on the next generation of supercomputers. Exascale systems have the pot ...
2019

Increasing Availability in Distributed Storage Systems via Clustering

Michael Christoph Gastpar, Saeid Sahraei

We introduce the Fixed Cluster Repair System (FCRS) as a novel architecture for Distributed Storage Systems (DSS) that achieves a small repair bandwidth while guaranteeing a high availability. Specifically, we partition the set of servers in a DSS into s c ...
2018

Experimental Validation of the Suitability of Virtualization-Based Replication for Fault Tolerance in Real-Time Control of Electric Grids

Jean-Yves Le Boudec, Wajeb Saab, Jalal Mostafa, Seyed Alireza Sanaee Kohroudi

Real-time control systems (RTCSs) perform complex control and require low response times. They typically use third-party software libraries and are deployed on generic hardware, which suffer from delay faults that can cause serious damage. To improve avail ...
2018

Reliability Mechanisms for Controllers in Real-Time Cyber-Physical Systems

Maaz Mashood Mohiuddin

Cyber-physical systems (CPSs) are real-world processes that are controlled by computer algorithms. We consider CPSs where a centralized, software-based controller maintains the process in a desired state by exchanging measurements and setpoints with proces ...
EPFL, 2018
