In computing, cache replacement policies (also frequently called cache replacement algorithms or cache algorithms) are optimizing instructions, or algorithms, that a computer program or a hardware-maintained structure can utilize in order to manage a cache of information stored on the computer. Caching improves performance by keeping recent or often-used data items in memory locations that are faster or computationally cheaper to access than normal memory stores. When the cache is full, the algorithm must choose which items to discard to make room for the new ones.
TOC
The average memory reference time is
where
= miss ratio = 1 - (hit ratio)
= time to make a main memory access when there is a miss (or, with multi-level cache, average memory reference time for the next-lower cache)
= the latency: the time to reference the cache (should be the same for hits and misses)
= various secondary effects, such as queuing effects in multiprocessor systems
There are two primary figures of merit of a cache:
the latency and the hit ratio. There are also a number of secondary factors affecting cache performance.
The "hit ratio" of a cache describes how often a searched-for item is actually found in the cache. More efficient replacement policies keep track of more usage information in order to improve the hit rate (for a given cache size).
The "latency" of a cache describes how long after requesting a desired item the cache can return that item (when there is a hit).
Faster replacement strategies typically keep track of less usage information—or, in the case of direct-mapped cache, no information—to reduce the amount of time required to update that information.
Each replacement strategy is a compromise between hit rate and latency.
Hit rate measurements are typically performed on benchmark applications. The actual hit ratio varies widely from one application to another. In particular, video and audio streaming applications often have a hit ratio close to zero, because each bit of data in the stream is read once for the first time (a compulsory miss), used, and then never read or written again.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Multiprocessors are a core component in all types of computing infrastructure, from phones to datacenters. This course will build on the prerequisites of processor design and concurrency to introduce
Multiprocessors are now the defacto building blocks for all computer systems. This course will build upon the basic concepts offered in Computer Architecture I to cover the architecture and organizati
We introduce formal verification as an approach for developing highly reliable systems. Formal verification finds proofs that computer systems work under all relevant scenarios. We will learn how to u
In computing, a page fault (sometimes called PF or hard fault) is an exception that the memory management unit (MMU) raises when a process accesses a memory page without proper preparations. Accessing the page requires a mapping to be added to the process's virtual address space. Besides, the actual page contents may need to be loaded from a backing store, such as a disk. The MMU detects the page fault, but the operating system's kernel handles the exception by making the required page accessible in the physical memory or denying an illegal memory access.
In computer science, program optimization, code optimization, or software optimization, is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources. In general, a computer program may be optimized so that it executes more rapidly, or to make it capable of operating with less memory storage or other resources, or draw less power. Although the word "optimization" shares the same root as "optimal", it is rare for the process of optimization to produce a truly optimal system.
In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference locality - temporal and spatial locality. Temporal locality refers to the reuse of specific data and/or resources within a relatively small time duration. Spatial locality (also termed data locality) refers to the use of data elements within relatively close storage locations.
Explores cache memory design, hits, misses, and eviction policies in computer systems, emphasizing spatial and temporal locality.
Focuses on using Stainless for program verification, demonstrating the process of verifying programs and ensuring correctness.
Covers online algorithms and competitive ratio in caching strategies to optimize memory usage.
The increasing demand for computing power and the emergence of heterogeneous computing architectures have driven the exploration of innovative techniques to address current limitations in both the compute and memory subsystems. One such solution is the use ...
2024
, ,
Analytical engines rely on in-memory data caching to avoid storage accesses and provide timely responses by keeping the most frequently accessed data in memory. Purely frequency- and time-based caching decisions, however, are a proxy of the expected query ...
Numerical simulations have become an indispensable tool in astrophysics and cosmology. The constant need for higher accuracy, higher resolutions, and models ofever-increasing sophistication and complexity drives the development of modern toolswhich target ...