Concept

Double-precision floating-point format

Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. Floating point is used to represent fractional values, or when a wider range is needed than fixed point of the same bit width provides, even at the cost of precision. Double precision may be chosen when the range or precision of single precision would be insufficient.

In the IEEE 754-2008 standard, the 64-bit base-2 format is officially referred to as binary64; it was called double in IEEE 754-1985. IEEE 754 specifies additional floating-point formats, including 32-bit base-2 single precision and, more recently, base-10 representations.

One of the first programming languages to provide single- and double-precision floating-point data types was Fortran. Before the widespread adoption of IEEE 754-1985, the representation and properties of floating-point data types depended on the computer manufacturer and computer model, and on decisions made by programming-language implementers. For example, GW-BASIC's double-precision data type was the 64-bit MBF floating-point format.

Double-precision binary floating-point is a commonly used format on PCs because of its wider range compared with single-precision floating point, in spite of its performance and bandwidth cost. It is commonly known simply as double. The IEEE 754 standard specifies a binary64 as having:

Sign bit: 1 bit
Exponent: 11 bits
Significand precision: 53 bits (52 explicitly stored)

The sign bit determines the sign of the number (including when this number is zero, which is signed). The exponent field is an 11-bit unsigned integer from 0 to 2047, in biased form: a stored value of 1023 represents an actual exponent of zero. Actual exponents range from −1022 to +1023 because exponents of −1023 (all 0s) and +1024 (all 1s) are reserved for special numbers.
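The field layout described above can be inspected directly. The following is a minimal sketch, assuming that Python's float is stored as IEEE 754 binary64 (true on mainstream platforms, but an assumption here); the helper name decode_binary64 is hypothetical and chosen only for illustration.

```python
import struct

EXPONENT_BIAS = 1023  # a stored exponent of 1023 means an actual exponent of 0


def decode_binary64(x):
    """Split a float into its (sign, biased exponent, fraction) bit fields."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                        # 1 bit
    biased_exponent = (bits >> 52) & 0x7FF   # 11 bits, range 0..2047
    fraction = bits & ((1 << 52) - 1)        # 52 explicitly stored bits
    return sign, biased_exponent, fraction


for value in (1.0, -2.0, 0.5, -0.0, float("inf")):
    sign, biased, fraction = decode_binary64(value)
    # Print the raw fields plus the exponent after removing the bias.
    print(value, sign, biased, biased - EXPONENT_BIAS, hex(fraction))
```

For -0.0 the exponent field is all 0s and for infinity it is all 1s, which are the two reserved patterns mentioned in the text; the sign bit of -0.0 is set, showing that zero is signed.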
The 53-bit significand gives from 15 to 17 significant decimal digits of precision (2^−53 ≈ 1.
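The 15-to-17-digit range can be checked experimentally. This is an illustrative sketch, again assuming Python floats are binary64; the sample values 0.1 and 0.1 + 0.2 are arbitrary choices, not taken from the source.

```python
import sys

# Number of significand bits and the decimal digits guaranteed to survive
# a decimal -> double -> decimal round trip.
print(sys.float_info.mant_dig, sys.float_info.dig)

# At 2**53 the spacing between doubles becomes 2, so adding 1 is lost.
big = 2.0 ** 53
print(big + 1.0 == big)

# 17 significant digits are enough to recover a double exactly.
x = 0.1
print(float(format(x, ".17g")) == x)

# 15 digits are not always enough in the double -> decimal -> double direction.
y = 0.1 + 0.2
print(float(format(y, ".15g")) == y)
```

The last two checks show the two ends of the range: 15 digits is what a decimal string can rely on keeping, while 17 digits is what it takes to write any double down without loss.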

Official source

https://en.wikipedia.org/wiki/Double-precision_floating-point_format

About this result

This page is automatically generated and may contain information that is not correct, complete, up to date, or relevant to your search. The same applies to every other page on this site. Be sure to verify the information with official EPFL sources.

Related courses (16)

ME-213: Programmation pour ingénieur (Programming for engineers)

Put into practice the programming fundamentals covered in the previous semester. Develop structured software. Software debugging methods. Introduction to scientific programming. Introductio

CS-119(l): Information, Computation, Communication

The aim of this course is to introduce students to algorithmic thinking, familiarize them with the fundamentals of computer science, and develop initial programming skills (

CS-173: Fundamentals of digital systems

Welcome to the introductory course in digital design and computer architecture. In this course, we will embark on a journey into the world of digital systems, exploring the fundamental principles and

Related lectures (27)

Computer arithmetic: floating-point numbers

Explores computer arithmetic, focusing on fixed-point and floating-point numbers, the IEEE 754 standard, dynamic range, and floating-point operations in the MIPS architecture.

Computer arithmetic: floating-point operations

Covers the basics of computer arithmetic, focusing on floating-point numbers and their operations.

Number systems: fixed-point and floating-point representations

Discusses fixed-point and floating-point number representations in digital systems, highlighting their structures, advantages, and applications in computing.

Related publications (30)

Towards General-Purpose Decentralized Computing with Permissionless Extensibility

Enis Ceyhun Alp

Smart contracts have emerged as the most promising foundations for applications of the blockchain technology. Even though smart contracts are expected to serve as the backbone of the next-generation web, they have several limitations that hinder their wide ...

EPFL, 2024

A 16-bit Floating-Point Near-SRAM Architecture for Low-power Sparse Matrix-Vector Multiplication

David Atienza Alonso, Giovanni Ansaloni, Grégoire Axel Eggermann, Marco Antonio Rios

State-of-the-art Artificial Intelligence (AI) algorithms, such as graph neural networks and recommendation systems, require floating-point computation of very large matrix multiplications over sparse data. Their execution in resource-constrained scenarios, ...

New York, 2023

Non-contact robotic manipulation of floating objects: exploiting emergent limit cycles

Josephine Anna Eleanor Hughes, Nana Obayashi

The study of non-contact manipulation in water, and the ability to robotically control floating objects has gained recent attention due to wide-ranging potential applications, including the analysis of plastic pollution in the oceans and the optimization o ...

Frontiers Media SA, 2023

Ontological proximity

Mathematics

Analysis (mathematics): Numerical analysis

Related concepts (21)

Rounding (mathematics)

Rounding a number means replacing it with another number considered simpler or more relevant. The process is called rounding, and the number obtained is the rounded value. The result is less precise but easier to use. There are several ways to round. In general, a number is rounded by giving an approximate value obtained from its decimal expansion by reducing the number of significant digits. Rounding can be to the nearest value, upward, or downward.
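The three directions of rounding can be illustrated with Python's standard library. This is only a sketch: the sample values 2.5 and 2.665 are arbitrary, and the tie-breaking rules shown (ties to even, ties away from zero) are options offered by Python, not something the text above prescribes.

```python
import math
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

value = 2.5
print(math.floor(value))  # rounding downward
print(math.ceil(value))   # rounding upward
print(round(value))       # rounding to nearest; the built-in sends ties to even

# Reducing the number of significant digits of a decimal expansion.
# Decimal is used so that 2.665 is held exactly rather than as a binary float.
d = Decimal("2.665")
half_up = d.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
half_even = d.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
print(half_up, half_even)
```

The two quantize calls differ only in how an exact tie is resolved, which is why "to the nearest" still needs a tie-breaking rule.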

Single-precision floating-point format

Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. A floating-point variable can represent a wider range of numbers than a fixed-point variable of the same bit width at the cost of precision. A signed 32-bit integer variable has a maximum value of 2^31 − 1 = 2,147,483,647, whereas an IEEE 754 32-bit base-2 floating-point variable has a maximum value of (2 − 2^−23) × 2^127 ≈ 3.
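The range-versus-precision trade-off can be sketched by pushing values through 4-byte storage with the struct module. This assumes the platform's C float is IEEE 754 binary32 and that Python floats are binary64; the helper name to_float32 is hypothetical.

```python
import struct

INT32_MAX = 2**31 - 1
FLOAT32_MAX = (2 - 2**-23) * 2.0**127  # exact when computed in double precision


def to_float32(x):
    """Round a Python float to single precision and return it as a float."""
    (stored,) = struct.unpack("<f", struct.pack("<f", x))
    return stored


print(INT32_MAX)
print(FLOAT32_MAX)

# The maximum survives 4-byte storage unchanged, so it is representable.
print(to_float32(FLOAT32_MAX) == FLOAT32_MAX)

# The cost of the wider range: with a 24-bit significand, 2**24 + 1 cannot
# be stored and is rounded to a neighbouring value.
print(to_float32(16777217.0))
```

A 32-bit integer can hold 16,777,217 exactly but cannot reach the magnitude of FLOAT32_MAX; the 32-bit float shows the opposite behaviour.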

Integer overflow

An integer overflow is, in computing, a condition that occurs when a mathematical operation produces a numeric value larger than can be represented in the available storage space. For example, adding one to the largest representable number causes an integer overflow. Integer overflow is listed as CWE-190 in the Common Weakness Enumeration. Ariane 5 flight 501 was destroyed in 1996 because of an integer overflow.
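Python integers have arbitrary precision and do not overflow, so the sketch below simulates a signed 32-bit storage space. The width of 32 bits and the helper names wrap_int32 and checked_add_int32 are assumptions made for illustration.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def wrap_int32(n):
    """Reduce n to a 32-bit two's-complement value, as fixed-width storage would."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n


def checked_add_int32(a, b):
    """Add two integers, raising OverflowError if the sum does not fit in 32 bits."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a signed 32-bit integer")
    return result


# Adding one to the largest representable value wraps around to the smallest.
print(wrap_int32(INT32_MAX + 1))

# A checked addition reports the condition instead of silently wrapping.
try:
    checked_add_int32(INT32_MAX, 1)
except OverflowError as exc:
    print("overflow detected:", exc)
```

Silent wrapping and an explicit error are two common ways a system can respond to the same condition; which one applies depends on the language and hardware.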
