Concept

Half-precision floating-point format

In computing, half precision (sometimes called FP16 or float16) is a binary floating-point number format that occupies 16 bits (two bytes on modern computers) of memory. It is intended for storing floating-point values in applications where higher precision is not essential, in particular image processing and neural networks. Almost all modern uses follow the IEEE 754-2008 standard, in which the 16-bit base-2 format is called binary16 and the exponent uses 5 bits. This format can express values in the range ±65,504, and the smallest value above 1 is 1 + 1/1024. Depending on the computer, half precision can be over an order of magnitude faster than double precision, e.g. 550 PFLOPS for half precision versus 37 PFLOPS for double precision on one cloud provider.

Several earlier 16-bit floating-point formats existed, including that of Hitachi's HD61810 DSP of 1982 (a 4-bit exponent and a 12-bit mantissa), Thomas J. Scott's WIF of 1991 (5 exponent bits, 10 mantissa bits) and the 3dfx Voodoo Graphics processor of 1995 (same format as Hitachi's). ILM was searching for an image format that could handle a wide dynamic range without the hard drive and memory cost of single- or double-precision floating point. The hardware-accelerated programmable shading group led by John Airey at SGI (Silicon Graphics) invented the s10e5 data type in 1997 as part of the 'bali' design effort. It is described in a SIGGRAPH 2000 paper (see section 4.3) and further documented in US patent 7518615, and it was popularized by its use in the open-source OpenEXR image format. Nvidia and Microsoft defined the half datatype in the Cg language, released in early 2002, and implemented it in silicon in the GeForce FX, released in late 2002. Since then, support for 16-bit floating-point math in graphics cards has become very common. The F16C extension, introduced in 2012, allows x86 processors to convert half-precision floats to and from single-precision floats with a machine instruction.
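The two figures quoted for binary16 (a largest finite value of 65,504 and a smallest value above 1 of 1 + 1/1024) follow from the bit layout. The minimal Python sketch below decodes a 16-bit pattern by hand. It assumes the IEEE 754 binary16 layout of 1 sign bit, 5 exponent bits with a bias of 15, and 10 fraction bits. The function names are illustrative only, and the `struct` format code `"e"` serves as a cross-check.

```python
import struct


def decode_binary16(bits: int) -> float:
    """Decode a 16-bit pattern as IEEE 754 binary16 (illustrative helper)."""
    sign = -1.0 if (bits >> 15) & 0x1 else 1.0
    exponent = (bits >> 10) & 0x1F   # 5 exponent bits, bias 15
    fraction = bits & 0x3FF          # 10 fraction bits
    if exponent == 0x1F:             # all ones: infinity or NaN
        return sign * float("inf") if fraction == 0 else float("nan")
    if exponent == 0:                # zero or subnormal: (fraction / 1024) * 2**-14
        return sign * fraction * 2.0 ** -24
    # Normal number: implicit leading 1 in front of the fraction
    return sign * (1 + fraction / 1024) * 2.0 ** (exponent - 15)


def reference_decode(bits: int) -> float:
    """Decode the same pattern with the standard library's half-precision support."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


largest_finite = decode_binary16(0x7BFF)   # exponent 30, fraction 1023
next_after_one = decode_binary16(0x3C01)   # exponent 15, fraction 1

print(largest_finite, next_after_one)
```

With exponent field 30 and all fraction bits set, the value is (1 + 1023/1024) × 2^15 = 65,504. With exponent field 15 and only the lowest fraction bit set, it is 1 + 1/1024.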

Official source

https://en.wikipedia.org/wiki/Half-precision_floating-point_format

About this result

This page is automatically generated and may contain information that is not correct, complete, up to date, or relevant to your search. The same applies to every other page on this site. Be sure to verify the information against EPFL's official sources.

Related courses (12)

MATH-126: Geometry for architects II

This course covers the following three topics: perspective, descriptive geometry, and an introduction to projective geometry.

CS-119(l): Information, Computation, Communication

The objective of this course is to introduce students to algorithmic thinking, to familiarize them with the fundamentals of computer science, and to develop a first competence in programming (

CS-173: Fundamentals of digital systems

Welcome to the introductory course in digital design and computer architecture. In this course, we will embark on a journey into the world of digital systems, exploring the fundamental principles and

Related publications (26)

Towards General-Purpose Decentralized Computing with Permissionless Extensibility

Enis Ceyhun Alp

Smart contracts have emerged as the most promising foundations for applications of the blockchain technology. Even though smart contracts are expected to serve as the backbone of the next-generation web, they have several limitations that hinder their wide ...

EPFL, 2024

Functional-Basis Analysis of Non-Stationary Signals in Modern Power Grids: Theory and Implementation in Embedded Systems

Alexandra Cameron Karpilow

Situational awareness strategies are essential for the reliable and secure operation of the electric power grid which represents critical infrastructure in modern society. With the rise of converter-interfaced renewable generation and the consequent shift ...

EPFL, 2024

Low-Power Artificial Neural Network Perceptron Based on Monolayer MoS2

Aleksandra Radenovic, Andras Kis, Mukesh Kumar Tripathi, Guilherme Migliato Marega, Zhenyu Wang

Machine learning and signal processing on the edge are poised to influence our everyday lives with devices that will learn and infer from data generated by smart sensors and other devices for the Internet of Things. The next leap toward ubiquitous electron ...

2022

Related concepts (3)

Single-precision floating-point format

Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. A floating-point variable can represent a wider range of numbers than a fixed-point variable of the same bit width at the cost of precision. A signed 32-bit integer variable has a maximum value of 2³¹ − 1 = 2,147,483,647, whereas an IEEE 754 32-bit base-2 floating-point variable has a maximum value of (2 − 2⁻²³) × 2¹²⁷ ≈ 3.

IEEE 754

In computing, IEEE 754 is a standard for floating-point arithmetic developed by the Institute of Electrical and Electronics Engineers. It is currently the most widely used standard for floating-point computation on CPUs and FPUs. The standard defines the representation formats for floating-point numbers (sign, mantissa, exponent, denormalized numbers) and special values (infinities and NaN), along with a set of operations on floating-point numbers.

Double-precision floating-point format

Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. Floating point is used to represent fractional values, or when a wider range is needed than is provided by fixed point (of the same bit width), even if at the cost of precision. Double precision may be chosen when the range or precision of single precision would be insufficient.
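To compare the ranges of the formats listed above, the following Python sketch reads back the largest finite value of each one. It assumes a platform where Python's `float` is an IEEE 754 double, which is the case for standard CPython builds. The variable names are illustrative, and the bit patterns are the standard all-ones-but-last exponent with a full fraction.

```python
import struct
import sys

# Largest finite binary16 value: exponent field 0x1E, fraction all ones
half_max = struct.unpack("<e", struct.pack("<H", 0x7BFF))[0]

# Largest finite binary32 value: exponent field 0xFE, fraction all ones
single_max = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]

# Largest finite binary64 value as reported by the interpreter
double_max = sys.float_info.max

# Largest signed 32-bit integer, for comparison with binary32
int32_max = 2**31 - 1

for name, value in [
    ("half", half_max),
    ("single", single_max),
    ("double", double_max),
    ("int32", int32_max),
]:
    print(f"{name:>6}: {value!r}")
```

The single-precision maximum read back this way equals (2 − 2⁻²³) × 2¹²⁷, far above the signed 32-bit integer maximum even though both occupy 32 bits.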

Related lectures (27)

Computer arithmetic: Floating-point operations

Covers the basics of computer arithmetic, focusing on floating-point numbers and their operations.

Computer arithmetic: Floating-point numbers

Explores computer arithmetic, with an emphasis on fixed-point and floating-point numbers, the IEEE 754 standard, dynamic range, and floating-point operations in the MIPS architecture.

Number systems: Fixed-point and floating-point representations

Discusses fixed-point and floating-point representations in number systems, covering key concepts such as precision, accuracy, and the IEEE 754 standard.
