GPGPU-Accelerated Parallel and Fast Simulation of Thousand-core Platforms
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Fine-grain data parallelism is increasingly common in mainstream processors in the form of long vectors and on-chip GPUs. This paper develops compiler and runtime support to exploit such data parallelism for non-numeric, non-graphic, irregular parallel tas ...
Chip-multiprocessors require a coherence directory to track data sharing and order accesses to the shared data. Scaling coherence directories to support a large number of cores is challenging due to excessive area requirements of the directories. The state ...
This paper presents a low-latency algorithm designed for parallel computer architectures to compute the scalar multiplication of elliptic curve points based on approaches from cryptographic side-channel analysis. A graphics processing unit implementation u ...
The parallel implementation of MUPHY, a concurrent multiscale code for large-scale hemodynamic simulations in anatomically realistic geometries, for multi-GPU platforms is presented. Performance tests show excellent results, with a nearly linear parallel s ...
Dynamic architectures in which interactions between components can evolve during execution, are essential for modern computing systems such as web-based systems, reconfigurable middleware, wireless sensor networks and fault-tolerant systems. Currently, we ...
New generations of multi-core processors and reconfigurable hardware platforms are expected to provide a dramatic increase of processing capabilities. However, one obstacle for exploiting all the promises of such new platforms is the legacy of current appl ...
Ieee Service Center, 445 Hoes Lane, Po Box 1331, Piscataway, Nj 08855-1331 Usa2011
New generations of multi-core processors and reconfigurable hardware platforms are expected to provide a dramatic increase of processing capabilities. However, one obstacle for exploiting all the promises of such new platforms is the legacy of current appl ...
Ieee Service Center, 445 Hoes Lane, Po Box 1331, Piscataway, Nj 08855-1331 Usa2011
Future architectures will feature hundreds to thousands of simple processors and on-chip memories connected through a network-on-chip. Architectural simulators will remain primary tools for design space exploration, performance (and power) evaluation of th ...
Contention for shared resources—caches, memory controllers, buses, NICs—is assumed to be a hurdle in optimizing and predicting the performance of multi-core software systems, especially packet-processing systems, which make extensive use of these resources ...
Aggressive memory-level-parallelism techniques have provided significant performance gain in Distributed Share Memory Designs. In this paper, we reevaluate speculative memory ordering in the context of Chip Multi-Processors (CMPs) and power-limited computa ...