# Bucket sort

Summary

Bucket sort, or bin sort, is a sorting algorithm that works by distributing the elements of an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by recursively applying the bucket sorting algorithm. It is a distribution sort, a generalization of pigeonhole sort that allows multiple keys per bucket, and is a cousin of radix sort in the most-to-least significant digit flavor. Bucket sort can be implemented with comparisons and therefore can also be considered a comparison sort algorithm. The computational complexity depends on the algorithm used to sort each bucket, the number of buckets to use, and whether the input is uniformly distributed.
Bucket sort works as follows:

1. Set up an array of initially empty "buckets".
2. Scatter: Go over the original array, putting each object in its bucket.
3. Sort each non-empty bucket.
4. Gather: Visit the buckets in order and put all elements back into the original array.

function bucketSort(array, k) is
    buckets ← new array of k empty lists
    M ← 1 + the maximum key value in the array
    for i = 0 to length(array) − 1 do
        insert array[i] into buckets[floor(k × array[i] / M)]
    for i = 0 to k − 1 do
        nextSort(buckets[i])
    return the concatenation of buckets[0], ..., buckets[k − 1]

Here array denotes the array to be sorted and k denotes the number of buckets to use. The maximum key value can be computed in linear time by iterating over all the keys once. The floor function converts the floating-point quotient k × array[i] / M to an integer bucket index (a datatype cast may also be needed). The function nextSort sorts each bucket. Conventionally, insertion sort is used, but other algorithms, such as selection sort or merge sort, could be used as well. Using bucketSort itself as nextSort produces a relative of radix sort; in particular, the case k = 2 corresponds to quicksort (although potentially with poor pivot choices).
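The pseudocode can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: it assumes non-negative numeric keys, and the names `bucket_sort`, `insertion_sort`, and `next_sort` are chosen here for the example. The `min(k - 1, ...)` guard is an added safety measure against floating-point rounding for very large keys and is not part of the pseudocode.

```python
import math
from typing import Callable, List


def insertion_sort(items: List[float]) -> None:
    """Sort a list in place; a conventional choice for small buckets."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current


def bucket_sort(
    array: List[float],
    k: int,
    next_sort: Callable[[List[float]], None] = insertion_sort,
) -> List[float]:
    """Return a sorted copy of non-negative keys using k buckets."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not array:
        return []
    if min(array) < 0:
        raise ValueError("this sketch assumes non-negative keys")

    buckets: List[List[float]] = [[] for _ in range(k)]
    m = 1 + max(array)  # M in the pseudocode

    # Scatter: every key satisfies key < m, so the index is below k.
    for key in array:
        index = min(k - 1, math.floor(k * key / m))  # guard against rounding
        buckets[index].append(key)

    # Sort each bucket, then gather the buckets in order.
    result: List[float] = []
    for bucket in buckets:
        next_sort(bucket)
        result.extend(bucket)
    return result


print(bucket_sort([29, 25, 3, 49, 9, 37, 21, 43], k=5))
# prints [3, 9, 21, 25, 29, 37, 43, 49]
```

Passing a different in-place sorting function as `next_sort` swaps the per-bucket algorithm without touching the scatter and gather steps.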
When the input contains several keys that are close to each other (clustering), those elements are likely to be placed in the same bucket, which results in some buckets containing more elements than average.
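To make the effect of clustering concrete, the following sketch counts how many keys land in each bucket under the index rule floor(k × key / M) from the pseudocode. The helper name `bucket_sizes` and both sample inputs are made up for this illustration.

```python
import math
from typing import List


def bucket_sizes(keys: List[int], k: int) -> List[int]:
    """Count the keys that fall into each of k buckets."""
    m = 1 + max(keys)
    sizes = [0] * k
    for key in keys:
        sizes[min(k - 1, math.floor(k * key / m))] += 1
    return sizes


spread = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]   # evenly spread keys
clustered = [1, 2, 3, 4, 5, 6, 7, 8, 9, 90]        # nine keys close together

print(bucket_sizes(spread, 5))     # prints [2, 2, 2, 2, 2]
print(bucket_sizes(clustered, 5))  # prints [9, 0, 0, 0, 1]
```

With the clustered input, nearly all of the work falls on the per-bucket sort of a single bucket, so the running time approaches that of the per-bucket algorithm applied to the whole input.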

Related concepts

Comparison sort

A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation (often a "less than or equal to" operator or a three-way comparison) that determines which of two elements should occur first in the final sorted list. The only requirement is that the operator forms a total preorder over the data, with: (1) if a ≤ b and b ≤ c then a ≤ c (transitivity); (2) for all a and b, a ≤ b or b ≤ a (connexity). It is possible that both a ≤ b and b ≤ a; in this case either may come first in the sorted list.
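As a small illustration, Python's built-in `sorted` can be driven by a three-way comparison through `functools.cmp_to_key`. The comparator `compare_by_length` and the word list are invented for this example; ordering strings by length is a total preorder rather than a total order, because distinct words can tie.

```python
from functools import cmp_to_key


def compare_by_length(a: str, b: str) -> int:
    """Three-way comparison: negative, zero, or positive."""
    return (len(a) > len(b)) - (len(a) < len(b))


words = ["pear", "fig", "banana", "kiwi", "plum"]
ordered = sorted(words, key=cmp_to_key(compare_by_length))
print(ordered)
# prints ['fig', 'pear', 'kiwi', 'plum', 'banana']
```

The three four-letter words compare as equal, so any relative order among them would be a valid result; Python's sort happens to be stable and keeps their input order.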

Hybrid algorithm

A hybrid algorithm is an algorithm that combines two or more other algorithms that solve the same problem, either choosing one based on some characteristic of the data, or switching between them over the course of the algorithm. This is generally done to combine desired features of each, so that the overall algorithm is better than the individual components. "Hybrid algorithm" does not refer to simply combining multiple algorithms to solve a different problem – many algorithms can be considered as combinations of simpler pieces – but only to combining algorithms that solve the same problem, but differ in other characteristics, notably performance.
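One common pattern of this kind, sketched below under stated assumptions, is a merge sort that switches to insertion sort once a sublist becomes small. The cutoff `SMALL_INPUT_THRESHOLD = 16` is a hypothetical value picked for the example, not a tuned or recommended constant, and the function names are illustrative.

```python
from typing import List

SMALL_INPUT_THRESHOLD = 16  # hypothetical cutoff, chosen only for illustration


def insertion_sort(items: List[int]) -> List[int]:
    """Return a sorted copy; efficient for short lists."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def merge(left: List[int], right: List[int]) -> List[int]:
    """Merge two sorted lists into one sorted list."""
    merged: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def hybrid_sort(items: List[int]) -> List[int]:
    """Merge sort that hands small sublists to insertion sort."""
    if len(items) <= SMALL_INPUT_THRESHOLD:
        return insertion_sort(items)
    mid = len(items) // 2
    return merge(hybrid_sort(items[:mid]), hybrid_sort(items[mid:]))
```

Both components solve the same problem (sorting); the switch is driven by a characteristic of the data, here the sublist length.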

Radix sort

In computer science, radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been considered. For this reason, radix sort has also been called bucket sort and digital sort. Radix sort can be applied to data that can be sorted lexicographically, be they integers, words, punch cards, playing cards, or the mail.
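The digit-by-digit bucketing can be sketched as a least-significant-digit radix sort. This is a minimal illustration that assumes non-negative integers; the function name `radix_sort` and the `base` parameter are choices made for this example.

```python
from typing import List


def radix_sort(keys: List[int], base: int = 10) -> List[int]:
    """Return a sorted copy of non-negative integers (LSD radix sort)."""
    if base < 2:
        raise ValueError("base must be at least 2")
    if any(key < 0 for key in keys):
        raise ValueError("this sketch assumes non-negative integers")

    result = list(keys)
    largest = max(result, default=0)
    place = 1  # 1, base, base**2, ... selects the current digit

    while largest // place > 0:
        buckets: List[List[int]] = [[] for _ in range(base)]
        # Appending in the current order preserves the ordering
        # established by the previous, less significant digit.
        for key in result:
            buckets[(key // place) % base].append(key)
        result = [key for bucket in buckets for key in bucket]
        place *= base
    return result


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# prints [2, 24, 45, 66, 75, 90, 170, 802]
```

No two keys are ever compared with each other; the order emerges from the stable redistribution into buckets on each pass.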
