**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Concept# Hash function

Summary

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable length output. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.
Hash functions and their associated hash tables are used in data storage and retrieval applications to access data in a small and nearly constant time per retrieval. They require an amount of storage space only fractionally greater than the total space required for the data or records themselves. Hashing is a computationally and storage space-efficient form of data access that avoids the non-constant access time of ordered and unordered lists and structured trees, and the often exponential storage requirements of direct access of state spaces

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related publications (100)

Loading

Loading

Loading

Related people (22)

Related units

Loading

Related concepts

Loading

Related courses (66)

CS-450: Advanced algorithms

A first graduate course in algorithms, this course assumes minimal background, but moves rapidly. The objective is to learn the main techniques of algorithm analysis and design, while building a repertory of basic algorithmic solutions to problems in many domains.

CS-322: Introduction to database systems

This course provides a deep understanding of the concepts behind data management systems. It covers fundamental data management topics such as system architecture, data models, query processing and optimization, database design, storage organization, and transaction management.

COM-401: Cryptography and security

This course introduces the basics of cryptography. We review several types of cryptographic primitives, when it is safe to use them and how to select the appropriate security parameters. We detail how they work and sketch how they can be implemented.

Related lectures

Loading

Related units (15)

Related concepts (41)

Hash table

In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses

Cryptographic hash function

A cryptographic hash function (CHF) is a hash algorithm (a map of an arbitrary binary string to a binary string with a fixed size of n bits) that has special properties desirable for a

Cryptography

Cryptography, or cryptology (from κρυπτός "hidden, secret"; and γράφειν graphein, "to write", or -λογία -logia, "study", respectively), is the practice and study of techniques for secure communicatio

We reconsider the provably collision resistant Very Smooth Hash and propose a small change in the design aiming to improve both performance and security. While the original proofs of security based on hardness of factoring or discrete logarithms are preserved, we can base the security on the k-sum problem studied by Wagner and more recently by Minder Sz. Sinclair. The new approach allows to output shorter digests and brings the speed of Fast VSH closer to the range of "classical" hash functions. The modified VSH is likely to remain secure even if factoring and discrete logarithms are easy, while this would have a devastating effect on the original versions. This observation leads us to propose a variant that operates modulo a power of two to increase the speed even more. A function that offers an equivalent of 128-bit collision resistance runs at 68.5 MB/s on a 2.4 CHz Intel Core 2 CPU, more than a third of the speed of SH A-256.

We provide a concrete security treatment of several "provably secure" hash functions. Interpreting arguments behind MQ-HASH, FSB, SWIFFTX and VSH we identify similar lines of reasoning. We aim to formulate the main security claims in a language closer to that of attacks. We evaluate designers' claims of provable security and quantify them more precisely, deriving "second order" bounds on bounds. While the authors of FSB, MQ-HASH and SWIFFT(X) prove existence of non-trivial lower bounds on security, we show that the quantification of the bounds limits the practical significance of the proofs.

We consider several "provably secure" hash functions that compute simple sums in a well chosen group (G,*). Security properties of such functions provably translate in a natural way to computational problems in G that are simple to define and possibly also hard to solve. Given k disjoint lists Li of group elements, the k-sum problem asks for gi ∊ Li such that g1 * g2 *...* gk = 1G. Hardness of the problem in the respective groups follows from some "standard" assumptions used in public-key cryptology such as hardness of integer factoring, discrete logarithms, lattice reduction and syndrome decoding. We point out evidence that the k-sum problem may even be harder than the above problems. Two hash functions based on the group k-sum problem, SWIFFTX and FSB, were submitted to NIST as candidates for the future SHA-3 standard. Both submissions were supported by some sort of a security proof. We show that the assessment of security levels provided in the proposals is not related to the proofs included. The main claims on security are supported exclusively by considerations about available attacks. By introducing "second-order" bounds on bounds on security, we expose the limits of such an approach to provable security. A problem with the way security is quantified does not necessarily mean a problem with security itself. Although FSB does have a history of failures, recent versions of the two above functions have resisted cryptanalytic efforts well. This evidence, as well as the several connections to more standard problems, suggests that the k-sum problem in some groups may be considered hard on its own, and possibly lead to provable bounds on security. Complexity of the non-trivial tree algorithm is becoming a standard tool for measuring the associated hardness. We propose modifications to the multiplicative Very Smooth Hash and derive security from multiplicative k-sums in contrast to the original reductions that related to factoring or discrete logarithms. Although the original reductions remain valid, we measure security in a new, more aggressive way. This allows us to relax the parameters and hash faster. We obtain a function that is only three times slower compared to SHA-256 and is estimated to offer at least equivalent collision resistance. The speed can be doubled by the use of a special modulus, such a modified function is supported exclusively by the hardness of multiplicative k-sums modulo a power of two. Our efforts culminate in a new multiplicative k-sum function in finite fields that further generalizes the design of Very Smooth Hash. In contrast to the previous variants, the memory requirements of the new function are negligible. The fastest instance of the function expected to offer 128-bit collision resistance runs at 24 cycles per byte on an Intel Core i7 processor and approaches the 17.4 figure of SHA-256. The new functions proposed in this thesis do not provably achieve a usual security property such as preimage or collision resistance from a well-established assumption. They do however enjoy unconditional provable separation of inputs that collide. Changes in input that are small with respect to a well defined measure never lead to identical output in the compression function.

Related lectures (201)