On the Performance of Delegation over Cache-Coherent Shared Memory
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Transactional memory is an appealing paradigm for concurrent programming. Many software implementations of the paradigm were proposed in the last decades for both shared memory multi-core systems and clusters of distributed machines. However, chip manufact ...
Caches are designed to provide the best tradeoff between access speed and capacity for a set of target applications. Unfortunately, different applications, and even different phases within the same application, may require a different capacity-speed tradeo ...
Customizable processors augmented with application-specific Instruction Set Extensions (ISEs) have begun to gain traction in recent years. The most effective ISEs include Architecturally Visible Storage (AVS), compiler-controlled memories accessible exclus ...
Springer-Verlag New York, Ms Ingrid Cunningham, 175 Fifth Ave, New York, Ny 10010 Usa2010
, , , ,
Modern hardware is abundantly parallel and increasingly heterogeneous. The numerous processing cores have non-uniform access latencies to the main memory and to the processor caches, which causes variability in the communication costs. Unfortunately, datab ...
2012
, , ,
This paper advocates the placement of Architecturally Visible Communication (AVC) buffers between adjacent cores in MPSoCs to provide high-throughput communication for streaming applications. Producer/consumer relationships map poorly onto cache-based MPSo ...
Coherence misses in shared-memory multiprocessors account for a substantial fraction of execution time in many important scientific and commercial workloads. Memory streaming provides a promising solution to the coherence miss bottleneck because it improve ...
To protect processor logic from soft errors, multicore redundant architectures execute two copies of a program on separate cores of a chip multiprocessor (CMP). Maintaining identical instruction streams is challenging because redundant cores operate indepe ...
Coherent read misses in shared-memory multiprocessors account for a substantial fraction of execution time in many important scientific and commercial workloads. We propose Temporal Streaming, to eliminate coherent read misses by streaming data to a proces ...
Set-associative caches achieve low miss rates for typical applications but result in significant energy dissipation. Set-associative caches minimize access time by probing all the data ways in parallel with the tag lookup, although the output of only the m ...
Communication in cache-coherent distributed shared memory (DSM) often requires invalidating (or writing back) cached copies of a memory block, incurring high overheads. This paper proposes Last-Touch Predictors (LTPs) that learn and predict the “last touch ...