Efficient Communication and Synchronization on Manycore Processors
Publications associées (121)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
RePlAce is a state-of-the-art prototype of a flat, analytic, and nonlinear global cell placement algorithm, which models a placement instance as an electrostatic system with positively charged objects. It can handle large-scale standard-cell and mixed-cell ...
IEEE2020
,
Modern asynchronous runtime systems allow the re-thinking of large-scale scientific applications. With the example of a simulator of morphologically detailed neural networks, we show how detaching from the commonly used bulk-synchronous parallel (BSP) exec ...
SPRINGER INTERNATIONAL PUBLISHING AG2019
,
Atomic lock-free multi-word compare-and-swap (MCAS) is a powerful tool for designing concurrent algorithms. Yet, its widespread usage has been limited because lock-free implementations of MCAS make heavy use of expensive compare-and-swap (CAS) instructions ...
Schloss Dagstuhl, Leibniz-Zentrum2020
, , ,
Investigative journalists collect large numbers of digital documents during their investigations. These documents can greatly benefit other journalists' work. However, many of these documents contain sensitive information. Hence, possessing such documents ...
USENIX ASSOC2020
Database systems access memory either sequentially or randomly. Contrary to sequential access and despite the extensive efforts of
computer architects, compiler writers, and system builders, random access to data larger than the processor cache has been s ...
EPFL2019
,
Non-Volatile Memory (NVM) technologies exhibit 4× the read access latency of conventional DRAM. When the working set does not fit in the processor cache, this latency gap between DRAM and NVM leads to more than 2× runtime increase for queries dominated by ...
We propose a novel, minimally intrusive approach to adding fault tolerance to existing complex scientific simulation codes, used for addressing a broad range of time-dependent problems on the next generation of supercomputers. Exascale systems have the pot ...
As hardware evolves, so do the needs of applications. To increase the performance of an application,
there exist two well-known approaches. These are scaling up an application, using a larger multi-core
platform, or scaling out, by distributing work to mul ...
We introduce a new distributed computing model called m&m that allows processes to both pass messages and share memory. Motivated by recent hardware trends, we find that this model improves the power of the pure message-passing and shared-memory models. As ...
Modern server hardware is increasingly heterogeneous as hardware accelerators, such as GPUs, are used together with multicore CPUs to meet the computational demands of modern data analytics workloads. Unfortunately, query parallelization techniques used by ...