High-Performance Communication Primitives and Data Structures on Message-Passing Manycores
Related publications (127)
Programming parallel shared- and distributed-memory architectures remains a difficult task. This contribution proposes a methodology for the hierarchical specification of pipelined parallel applications running on shared- as well as distributed-memory arch ...
A device interface for communicating between a processor system and a separate device employs cacheable control registers, both to indicate the receipt of a message and to receive messages to be transmitted. The data structure of the cacheable control regi ...
Supercomputers can greatly expand the possibilities for modelling subsurface solute transport. The theory relating to a specific application, viz., multicomponent solute transport, is presented. It is demonstrated that multidimensional real-world problems in ...
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance of hand-coded message passing by translating data-parallel programs into mes ...
We compare two systems for parallel programming on networks of workstations: Parallel Virtual Machine (PVM), a message-passing system, and TreadMarks, a software distributed shared memory (DSM) system. We present results for eight applications that were imp ...
The message passing programs are executed with the Parallel Virtual Machine (PVM) library and the shared memory programs are executed using TreadMarks. The programs are Water and Barnes-Hut from the SPLASH benchmark suite; 3-D FFT, Integer Sort (IS) and Em ...
We believe the paucity of massively parallel, shared-memory machines follows from the lack of a shared-memory programming performance model that can inform programmers of the cost of operations (so they can avoid expensive ones) and can tell hardware desig ...
In this paper, we present the first system that implements OpenMP on a network of shared-memory multiprocessors. This system enables the programmer to rely on a single, standard, shared-memory API for parallelization within a multiprocessor and between mul ...
The paper describes Tempest, a collection of mechanisms for communication and synchronization in parallel programs. With these mechanisms, authors of compilers, libraries, and application programs can exploit, across a wide range of hardware platforms, the b ...
Research on artificial neural networks (ANNs) has been carried out for more than five decades. A renewed interest appeared in the 1980s with the finding of powerful models like J. Hopfield's recurrent networks, T. Kohonen's self-organizing feature maps, and ...