Publication

Miss Rate Prediction across All Program Inputs

Reactive NUCA: Near-Optimal Block Placement and Replication in Distributed Caches

Anastasia Ailamaki, Babak Falsafi, Michael Ferdman

Increases in on-chip communication delay and the large working sets of server and scientific workloads complicate the design of the on-chip last-level cache for multicore processors. The large working sets favor a shared cache design that maximizes the ag ...

2009

Way Stealing: Cache-assisted Automatic Instruction Set Extensions

Edoardo Charbon, Paolo Ienne, Ties Jan Henderikus Kluter, Philip Brisk

This paper introduces Way Stealing, a simple architectural modification to a cache-based processor to increase data bandwidth to and from application-specific Instruction Set Extensions (ISEs). Way Stealing provides more bandwidth to the ISE-logic than the ...

IEEE Service Center, 445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331, USA, 2009

Temporal instruction fetch streaming

Anastasia Ailamaki, Babak Falsafi, Michael Ferdman

L1 instruction-cache misses pose a critical performance bottleneck in commercial server workloads. Cache access latency constraints preclude L1 instruction caches large enough to capture the application, library, and OS instruction working sets of these wo ...

2008

Miss Rate Prediction across Program Inputs and Cache Configurations

Chen Ding

Improving cache performance requires understanding cache behavior. However, measuring cache performance for one or two data input sets provides little insight into how cache behavior varies across all data input sets and all cache configurations. This pape ...

2007

Improving instruction cache performance in OLTP

Anastasia Ailamaki

Instruction-cache misses account for up to 40% of execution time in online transaction processing (OLTP) database workloads. In contrast to data cache misses, instruction misses cannot be overlapped with out-of-order execution. Chip design limitations do ...

Association for Computing Machinery, 2006

ReCast: Boosting tag line buffer coverage in low-power high-level caches "for free"

Babak Falsafi

We revisit the idea of using small line buffers in front of caches. We propose ReCast, a tiny tag set cache that filters a significant number of tag probes to the L2 tag array thus reducing power. The key contribution in ReCast is S-Shift, a simple indexin ...

2005

Accurate and complexity-effective spatial pattern prediction

Babak Falsafi

Recent research suggests that there are large variations in a cache's spatial usage, both within and across programs. Unfortunately, conventional caches typically employ fixed cache line sizes to balance the exploitation of spatial and temporal locality, a ...

2004

Memory coherence activity prediction in commercial workloads

Anastasia Ailamaki, Babak Falsafi

Recent research indicates that prediction-based coherence optimizations offer substantial performance improvements for scientific applications in distributed shared memory multiprocessors. Important commercial applications also show sensitivity to coherenc ...

2004

Parallelization and scheduling of data intensive particle physics analysis jobs on clusters of PCs

Roger Hersch, Sébastien Ponce

Scheduling policies are proposed for parallelizing data intensive particle physics analysis applications on computer clusters. Particle physics analysis jobs require the analysis of tens of thousands of particle collision events, each event requiring typic ...

IEEE Computer Society, Los Alamitos; Massey University, Palmerston, CA 90720-1314, United States; New Zealand, 2004

Exploiting choice in resizable cache design to optimize deep-submicron processor energy-delay

Babak Falsafi

Cache memories account for a significant fraction of a chip's overall energy dissipation. Recent research advocates using "resizable" caches to exploit cache requirement variability in applications to reduce cache size and eliminate energy dissipation in t ...

2002