Concept

Directory-based cache coherence

In computer engineering, directory-based cache coherence is a type of cache coherence mechanism, where directories are used to manage caches in place of bus snooping. Bus snooping methods scale poorly due to the use of broadcasting. These methods can be used to target both performance and scalability of directory systems. In the full bit vector format, for each possible cache line in memory, a bit is used to track whether every individual processor has that line stored in its cache. The full bit vector format is the simplest structure to implement, but the least scalable. The SGI Origin 2000 uses a combination of full bit vector and coarse bit vector depending on the number of processors. Each directory entry must have 1 bit stored per processor per cache line, along with bits for tracking the state of the directory. This leads to the total size required being (number of processors)×number of cache lines, having a storage overhead ratio of (number of processors)/(cache block size×8). It can be observed that directory overhead scales linearly with the number of processors. While this may be fine for a small number of processors, when implemented in large systems the size requirements for the directory becomes excessive. For example, with a block size of 32 bytes and 1024 processors, the storage overhead ratio becomes 1024/(32×8) = 400%. The coarse bit vector format has a similar structure to the full bit vector format, though rather than tracking one bit per processor for every cache line, the directory groups several processors into nodes, storing whether a cache line is stored in a node rather than a line. This improves size requirements at the expense of bus traffic saving (processors per node)×(total lines) bits of space. Thus the ratio overhead is the same, just replacing number of processors with number of processor groups. When a bus request is made for a cache line that one processor in the group has, the directory broadcasts the signal into every processor in the node rather than just the caches that contain it, leading to unnecessary traffic to nodes that do not have the data cached.

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (4)
CS-200: Computer architecture
This course completes the overview of computer architecture started on the first year (CS-173 DSD).
CS-307: Introduction to multiprocessor architecture
Multiprocessors are a core component in all types of computing infrastructure, from phones to datacenters. This course will build on the prerequisites of processor design and concurrency to introduce
CS-471: Advanced multiprocessor architecture
Multiprocessors are basic building blocks for all computer systems. This course covers the architecture and organization of modern multiprocessors, prevalent accelerators (e.g., GPU, TPU), and datacen
Show more
Related lectures (20)
Multiprocessors: Overview and Challenges
Covers the evolution and challenges of multiprocessors, emphasizing energy efficiency, parallel programming, cache coherence, and the role of GPUs.
Cache Coherence
Covers cache coherence, sharing data in multiprocessor caches, directories, and coherence protocols.
Cache Coherence: Basics and Protocols
Explores cache coherence challenges, protocols, and directory-based solutions in multi-core systems.
Show more
Related publications (32)
Related concepts (1)
CPU cache
A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache levels (L1, L2, often L3, and rarely even L4), with different instruction-specific and data-specific caches at level 1.

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.