Concept

Content-addressable storage

Summary
Content-addressable storage (CAS), also referred to as content-addressed storage or fixed-content storage, is a way to store information so it can be retrieved based on its content, not its name or location. It has been used for high-speed storage and retrieval of fixed content, such as documents stored for compliance with government regulations. Content-addressable storage is similar to content-addressable memory. CAS systems work by passing the content of the file through a cryptographic hash function to generate a unique key, the "content address". The 's directory stores these addresses and a pointer to the physical storage of the content. Because an attempt to store the same file will generate the same key, CAS systems ensure that the files within them are unique, and because changing the file will result in a new key, CAS systems provide assurance that the file is unchanged. CAS became a significant market during the 2000s, especially after the introduction of the 2002 Sarbanes–Oxley Act which required the storage of enormous numbers of documents for long periods and retrieved only rarely. Ever-increasing performance of traditional file systems and new software systems have eroded the value of legacy CAS systems, which have become increasingly rare after roughly 2018. However, the principles of content addressability continue to be of great interest to computer scientists, and form the core of numerous emerging technologies, such as , cryptocurrencies, and distributed computing. Traditional s generally track files based on their . On random-access media like a floppy disk, this is accomplished using a directory that consists of some sort of list of filenames and pointers to the data. The pointers refer to a physical location on the disk, normally using disk sectors. On more modern systems and larger formats like hard drives, the directory is itself split into many subdirectories, each tracking a subset of the overall collection of files. Subdirectories are themselves represented as files in a parent directory, producing a hierarchy or tree-like organization.
About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (1)
CS-438: Decentralized systems engineering
A decentralized system is one that works when no single party is in charge or fully trusted. This course teaches decentralized systems principles while guiding students through the engineering of thei
Related lectures (1)
Related concepts (1)
Merkle tree
In cryptography and computer science, a hash tree or Merkle tree is a tree in which every "leaf" (node) is labelled with the cryptographic hash of a data block, and every node that is not a leaf (called a branch, inner node, or inode) is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a large data structure. A hash tree is a generalization of a hash list and a hash chain.