**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Publication# Improved Massively Parallel Computation Algorithms for MIS, Matching, and Vertex Cover

Résumé

We present O(log log n)-round algorithms in the Massively Parallel Computation (MPC) model, with a(n) memory per machine, that compute a maximal independent set, a 1 + epsilon approximation of maximum matching, and a 2 + epsilon approximation of minimum vertex cover, for any n-vertex graph and any constant epsilon > 0. These improve the state of the art as follows: Our MIS algorithm leads to a simple O(log log A)-round MIS algorithm in the CONGESTED-CLIQUE model of distributed computing, which improves on the (O) over tilde(root log Delta)-round algorithm of Ghaffari [PODC'17]. Our O(log log(2) n)-round (1 + epsilon)-approximate maximum matching algorithm simplifies or improves on the following prior work: O(log2 log n)-round (1 + epsilon)-approximation algorithm of Czumaj et al. [STOC'18] and O(log log n)-round (1 + epsilon) approximation algorithm of Assadi et al. [arXiv'17]. Our O(log log n)-round (2 + epsilon)-approximate minimum vertex cover algorithm improves on an O(log log n)-round O(1)-approximation of Assadi et al. [arXiv'17].

Official source

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Concepts associés

Chargement

Publications associées

Chargement

Publications associées (6)

Chargement

Chargement

Chargement

Concepts associés (11)

Computing

Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and developm

Algorithme d'approximation

En informatique théorique, un algorithme d'approximation est une méthode permettant de calculer une solution approchée à un problème algorithmique d'optimisation. Plus précisément, c'est une heuristiq

Approximation

Une approximation est une représentation imprécise ayant toutefois un lien étroit avec la quantité ou l’objet qu’elle reflète : approximation d’un nombre (de π par 3,14, de la vitesse instantanée d’un

Rachid Guerraoui, Eric Ruppert

The vast majority of papers on distributed computing assume that processes are assigned unique identifiers before computation begins. But is this assumption necessary? What if processes do not have unique identifiers or do not wish to divulge them for reasons of privacy? Here, we consider asynchronous shared-memory systems that are anonymous, which means processes do not have identifiers, and are programmed identically. The shared memory contains only the most common type of shared objects, read/write registers. We investigate what can be computed deterministically in this model when any number of processes can experience crash failures. Thus, an adversary controls the speeds of processes and the failures in the system. Typically algorithms designed for such systems guarantee (partial) correctness in all possible executions, and there are various types of termination (or progress) properties that have been studied: wait-freedom, lock-freedom and obstruction-freedom. We give anonymous algorithms for some of the most important problems in the theory of distributed computing: timestamping, atomic snapshots and consensus. Our solutions to the first two are wait-free and the third is obstruction-free. We also show that a shared data structure has an obstruction-free implementation in our model if and only if it satisfies a simple property called idempotence. To prove the sufficiency of this condition, we give a universal construction that implements any idempotent data structure.

2004According to the World Health Organization, lifestyle-related diseases, e.g., cardiovascular diseases are the major cause of mortality worldwide. An accurate and continuous medical supervision is highly required for diagnosis and treatment of such diseases. Our traditional healthcare delivery systems, however can’t cope with consequential increasing healthcare costs and medical management needs. Personal health monitoring systems are poised to offer large-scale and cost-effective solutions to this problem. The use of wearable, miniaturized and autonomous wireless sensor nodes, featuring continuous on-node analysis of biosignals, can indeed provide ambulatory long-term and real-time monitoring required by the patients, and enables faster coordination with medical personnel. In such autonomous nodes, due to very limited available energy resources and costly wireless transmission, an ultra-low-power (ULP) on-node processing platform for advanced biosignal analysis is crucial. In this thesis, I explore ULP processing architectures for on-node biosignal analysis applications; where commonly, moderately complex arithmetic manipulations on single- or multiple- input signals are carried out. To achieve energy efficiency while providing sufficient processing capability to apply advanced biosignal analysis, in this thesis near-threshold (near-Vt h ) computing is exploited. Hence, severe performance degradation and reliability issues, occurring at deeply scaled voltages, can be avoided. In Chapter 3, I introduce a near-Vth computing single-core architecture, consisting of a ULP core, an instruction memory (IM) and a data memory (DM). The ULP core features an instruction set architecture (ISA) customized for biosignal applications. I explore that an ISA with minimal instruction set achieves considerable energy savings compared to the state-of-the-art cores, when executing biosignal applications (i.e., up to 54% compared to an established ISA). The proposed single-core architecture accomplishes high energy efficiency for most of single-input biosignal analysis applications, since it fully exploits near-Vth computing. However, the single-core architecture achieves limited voltage scaling, hence reduced energy awareness, for most of multiple-input biosignal analysis applications, where computational workload requirements are such high that the single-core architecture can’t attain these throughputs in near-Vth regime. To alleviate the performance degradation issue that prevents the single-core architecture from exploiting near-Vt h computing typically for multiple-input biosignal analysis, I propose parallel processing of biosignals on multi-core architectures. To this end, In Chapter 4, a multiple instruction, multiple data (MIMD) multi-core architecture is introduced. The MIMD architecture comprises several ULP cores, individual IMs, and a multi-bank DM shared through a lightweight interconnect between the cores and the DM. I prove that parallel processing of multiple-input biosignals leads to better energy efficiency than the sequential processing (i.e., on a single-core) for moderate and high biosignal workloads. In particular, the MIMD architecture achieves up to 62% power savings with respect to the single-core architecture for high biosignal workloads (i.e., 167 MOps/s). On the other hand, parallel processing of multiple-input biosignals can be penalized at low workloads due to high leakage power dissipation in multi-core architectures. In particular, the MIMD architecture fails against the single-core architecture in terms of energy efficiency for workloads lighter than 1.7 MOps/s. One of the major burden of power dissipation in MIMD architectures is costly multiple instruction fetch. To mitigate this issue, I propose data-level parallelism through single instruction, multiple data (SIMD) paradigm. To this end, in Chapter 4 a novel hybrid multi-core architecture, that supports SIMD and MIMD operations, is introduced. The SIMD operations, coupled with data and instruction broadcasting, enable coordinated multiple accesses to memories, hence reduced instruction fetch power. Additionally, the hybrid multi-core architecture features partial power gating of memories to achieve leakage power savings, vital at low workloads (a few 100 kOps/s). I show that SIMD processing of multiple-input biosignals leads to better energy efficiency compared to the MIMD processing. In particular, when SIMD operations are exploited, the hybrid multi-core architecture achieves up to 45.7% power saving compared to the MIMD architecture for moderate biosignal workloads. I also ascertain that partial power gating of memories is an effective technique to alleviate leakage issue in multi-core architectures. More specifically, partial power gating of the IM in the hybrid multi-core architecture leads to 38.8% power saving at low workloads. Finally, to alleviate issues with applications involving such program parts that limit SIMD execution of applications (i.e., conditional program parts), I propose to resynchronize the cores for stable lockstep code execution in case of synchronization loss. Hence, SIMD operations are exploited even for applications with conditional program parts. To this end, in Chapter 4 a lightweight software-directed hardware synchronizer is introduced. I reveal that for applications with conditional program parts, lockstep SIMD execution accomplishes up to 64% power saving with respect to the elementary SIMD execution at moderate workloads (i.e.,89 MOps/s).

Mohsen Ghaffari, Slobodan Mitrovic

We present O(log log n)-round algorithms in the Massively Parallel Computation (MPC) model, with a(n) memory per machine, that compute a maximal independent set, a 1 + epsilon approximation of maximum matching, and a 2 + epsilon approximation of minimum vertex cover, for any n-vertex graph and any constant epsilon > 0. These improve the state of the art as follows: Our MIS algorithm leads to a simple O(log log A)-round MIS algorithm in the CONGESTED-CLIQUE model of distributed computing, which improves on the (O) over tilde(root log Delta)-round algorithm of Ghaffari [PODC'17]. Our O(log log(2) n)-round (1 + epsilon)-approximate maximum matching algorithm simplifies or improves on the following prior work: O(log2 log n)-round (1 + epsilon)-approximation algorithm of Czumaj et al. [STOC'18] and O(log log n)-round (1 + epsilon) approximation algorithm of Assadi et al. [arXiv'17]. Our O(log log n)-round (2 + epsilon)-approximate minimum vertex cover algorithm improves on an O(log log n)-round O(1)-approximation of Assadi et al. [arXiv'17].