Publication

An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance I-caches

Related publications (84)

DRAM Based on Hysteresis in Impact Ionization Single-Transistor-Latch

Mihai Adrian Ionescu, Maher Kayal, Didier Bouvet, Kirsten Emilie Moselund, Cédric Meinen, Vincent Pott

This work reports on memory applications of punch-through impact ionization single-transistor latch (PIMOS), showing abrupt current switching (3-10mV/dec.) as well as hysteresis in both ID(VDS) and ID(VGS). A capacitor-less 1PIMOS - 1 MOSFET DRAM memory is ...

2008

Temporal instruction fetch streaming

Anastasia Ailamaki, Babak Falsafi, Michael Ferdman

L1 instruction-cache misses pose a critical performance bottleneck in commercial server workloads. Cache access latency constraints preclude L1 instruction caches large enough to capture the application, library, and OS instruction working sets of these wo ...

2008

Scheduling threads for constructive cache sharing on CMPs

Anastasia Ailamaki, Babak Falsafi

In chip multiprocessors (CMPs), limiting the number of offchip cache misses is crucial for good performance. Many multithreaded programs provide opportunities for constructive cache sharing, in which concurrently scheduled threads share a largely overlappi ...

2007

Last-touch correlated data streaming

Babak Falsafi, Michael Ferdman

Recent research advocates address-correlating predictors to identify cache block addresses for prefetch. Unfortunately, address-correlating predictors require correlation data storage proportional in size to a program's active memory footprint. As a result ...

2007

Improving instruction cache performance in OLTP

Anastasia Ailamaki

Instruction-cache misses account for up to 40% of execution time in online transaction processing (OLTP) database workloads. In contrast to data cache misses, instruction misses cannot be overlapped with out-of-order execution. Chip design limitations do ...

Association for Computing Machinery, 2006

ReCast: Boosting tag line buffer coverage in low-power high-level caches "for free"

Babak Falsafi

We revisit the idea of using small line buffers in front of caches. We propose ReCast, a tiny tag set cache that filters a significant number of tag probes to the L2 tag array, thus reducing power. The key contribution in ReCast is S-Shift, a simple indexin ...

2005

A case for asymmetric-cell cache memories

Babak Falsafi

In this paper, we make the case for building high-performance asymmetric-cell caches (ACCs) that employ recently-proposed asymmetric SRAMs to reduce leakage proportionally to the number of resident zero bits. Because ACCs target memory value content (indep ...

2005

Accurate and complexity-effective spatial pattern prediction

Babak Falsafi

Recent research suggests that there are large variations in a cache's spatial usage, both within and across programs. Unfortunately, conventional caches typically employ fixed cache line sizes to balance the exploitation of spatial and temporal locality, a ...

2004

Dynamically Trading Frequency for Complexity in a GALS Microprocessor

Microprocessors are traditionally designed to provide “best overall” performance across a wide range of applications and operating environments. Several groups have proposed hardware techniques that save energy by “downsizing” hardware resources that are u ...

2004

Hiding Synchronization Delays in a GALS Processor Microarchitecture

Sandhya Dwarkadas

We analyze an Alpha 21264-like Globally-Asynchronous, Locally-Synchronous (GALS) processor organized as a Multiple Clock Domain (MCD) microarchitecture and identify the architectural features of the processor that influence the limited performance degradat ...

2004