Floating-point arithmetic
In computing, floating-point arithmetic (FP) is arithmetic that represents subsets of real numbers using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. Numbers of this form are called floating-point numbers. For example, 12.345 is a floating-point number in base ten with five digits of precision: 12.345 = 12345 × 10⁻³. However, unlike 12.345, 12.3456 is not a floating-point number in base ten with five digits of precision; it needs six digits of precision, and the nearest floating-point number with only five digits is 12.346.
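A minimal Python sketch of this significand-and-exponent view, using the text's own example (the decimal module and the variable names here are illustrative choices, not part of any particular standard):

    from decimal import Decimal

    # 12.345 decomposed into an integer significand scaled by a power of ten:
    # 12345 * 10**-3
    sign, digits, exponent = Decimal("12.345").as_tuple()
    significand = int("".join(map(str, digits)))
    print(significand, exponent)                   # 12345 -3
    print(significand * Decimal(10) ** exponent)   # 12.345

    # 12.3456 needs six digits; rounded to five digits it becomes 12.346
    print(round(Decimal("12.3456"), 3))            # 12.346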
Fixed-point arithmetic
In computing, fixed-point is a method of representing fractional (non-integer) numbers by storing a fixed number of digits of their fractional part. Dollar amounts, for example, are often stored with exactly two fractional digits, representing the cents (1/100 of dollar). More generally, the term may refer to representing fractional values as integer multiples of some fixed small unit, e.g. a fractional amount of hours as an integer multiple of ten-minute intervals.
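A minimal sketch of the integer-multiples-of-a-fixed-unit idea, using cents as in the example above; the helper name and the lack of sign handling are simplifications:

    # Dollar amounts held as integer cents (fixed point with two fractional digits).
    def to_cents(amount: str) -> int:
        dollars, _, cents = amount.partition(".")
        return int(dollars) * 100 + int((cents + "00")[:2])

    a = to_cents("19.99")   # 1999
    b = to_cents("0.05")    # 5
    total = a + b           # exact integer arithmetic: 2004 cents
    print(f"${total // 100}.{total % 100:02d}")   # $20.04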
Quadruple-precision floating-point format
In computing, quadruple precision (or quad precision) is a binary floating point–based computer number format that occupies 16 bytes (128 bits) with precision at least twice the 53-bit double precision. This 128-bit quadruple precision is designed not only for applications requiring results in higher than double precision, but also, as a primary function, to allow the computation of double precision results more reliably and accurately by minimising overflow and round-off errors in intermediate calculations and scratch variables.
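Python has no built-in binary128 type, so the sketch below only derives the headline parameters of the IEEE 754 binary128 layout (1 sign bit, 15 exponent bits, 112 stored fraction bits); the variable names are illustrative:

    import math

    # IEEE 754 binary128 field widths
    SIGN_BITS, EXPONENT_BITS, FRACTION_BITS = 1, 15, 112
    SIGNIFICAND_BITS = FRACTION_BITS + 1      # implicit leading 1 gives 113 bits
    EMAX = 2**(EXPONENT_BITS - 1) - 1         # 16383

    print("total bits:", SIGN_BITS + EXPONENT_BITS + FRACTION_BITS)           # 128
    print("decimal digits of precision ~", SIGNIFICAND_BITS * math.log10(2))  # ~34
    print("largest finite value ~ 10 **", round((EMAX + 1) * math.log10(2)))  # ~10**4932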
Arbitrary-precision arithmetic
In computer science, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that calculations are performed on numbers whose digits of precision are limited only by the available memory of the host system. This contrasts with the faster fixed-precision arithmetic found in most arithmetic logic unit (ALU) hardware, which typically offers between 8 and 64 bits of precision.
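Python's built-in integers are a convenient illustration of bignum arithmetic: their precision is bounded only by available memory rather than by a fixed ALU word size.

    import math

    # 2**200 does not fit in a 64-bit register, but Python integers grow as needed.
    n = 2**200
    print(n.bit_length())        # 201 bits

    # 100! is computed exactly, with no overflow or rounding.
    f = math.factorial(100)
    print(len(str(f)))           # 158 decimal digits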
Single-precision floating-point format
Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. A floating-point variable can represent a wider range of numbers than a fixed-point variable of the same bit width at the cost of precision. A signed 32-bit integer variable has a maximum value of 2³¹ − 1 = 2,147,483,647, whereas an IEEE 754 32-bit base-2 floating-point variable has a maximum value of (2 − 2⁻²³) × 2¹²⁷ ≈ 3.4028235 × 10³⁸.
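A small sketch contrasting the two limits quoted above and round-tripping a value through the 32-bit encoding via the standard struct module ('f' is IEEE 754 binary32):

    import struct

    int32_max = 2**31 - 1                    # 2,147,483,647
    float32_max = (2 - 2**-23) * 2.0**127    # ~3.4028235e38
    print(int32_max, float32_max)

    # Round-trip a double through the 32-bit encoding to observe the precision loss.
    x = 0.1
    (x32,) = struct.unpack("<f", struct.pack("<f", x))
    print(x32)                               # 0.10000000149011612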
Floating-point unit
A floating-point unit (FPU, colloquially a math coprocessor) is a part of a computer system specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtraction, multiplication, division, and square root. Some FPUs can also perform various transcendental functions such as exponential or trigonometric calculations, but the accuracy can be very low, so that some systems prefer to compute these functions in software.
Half-precision floating-point format
In computing, half precision (sometimes called FP16 or float16) is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory. It is intended for storage of floating-point values in applications where higher precision is not essential, in particular image processing and neural networks. Almost all modern uses follow the IEEE 754-2008 standard, where the 16-bit base-2 format is referred to as binary16, and the exponent uses 5 bits.
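The struct module's 'e' format code packs IEEE 754 binary16, so the limited precision and range of the 16-bit format are easy to observe; a minimal sketch:

    import struct

    # 'e' is IEEE 754 binary16: 1 sign bit, 5 exponent bits, 10 fraction bits.
    def to_half(x: float) -> float:
        return struct.unpack("<e", struct.pack("<e", x))[0]

    print(to_half(3.14159265))   # 3.140625  -- roughly 3-4 significant decimal digits
    print(to_half(65504.0))      # 65504.0   -- the largest finite binary16 value
    print(to_half(1e-8))         # 0.0       -- too small, underflows to zero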
Decimal floating point
Decimal floating-point (DFP) arithmetic refers to both a representation and operations on decimal floating-point numbers. Working directly with decimal (base-10) fractions can avoid the rounding errors that otherwise typically occur when converting between decimal fractions (common in human-entered data, such as measurements or financial information) and binary (base-2) fractions. The advantage of decimal floating-point representation over decimal fixed-point and integer representation is that it supports a much wider range of values.
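Python's decimal module illustrates the point about human-entered decimal fractions: 0.1 has no exact binary representation, but it is exact in base ten.

    from decimal import Decimal

    # Binary floating point: 0.1 and 0.2 are rounded on input, so the sum is off.
    print(0.1 + 0.2)                         # 0.30000000000000004
    print(0.1 + 0.2 == 0.3)                  # False

    # Decimal floating point: the entered fractions are represented exactly.
    print(Decimal("0.1") + Decimal("0.2"))                    # 0.3
    print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True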
Dynamical system
In mathematics, a dynamical system is a system in which a function describes the time dependence of a point in an ambient space, such as in a parametric curve. Examples include the mathematical models that describe the swinging of a clock pendulum, the flow of water in a pipe, the random motion of particles in the air, and the number of fish each springtime in a lake. The most general definition unifies several concepts in mathematics such as ordinary differential equations and ergodic theory by allowing different choices of the space and how time is measured.
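A minimal discrete-time example of a dynamical system: the state is a single number, and a fixed function (here the logistic map, a standard textbook choice not mentioned in the text above) gives the point at the next time step.

    # Discrete-time dynamical system: x_{t+1} = f(x_t), here the logistic map.
    def f(x: float, r: float = 3.5) -> float:
        return r * x * (1 - x)

    x = 0.2                      # initial point in the state space [0, 1]
    trajectory = [x]
    for _ in range(10):          # the orbit of x under repeated application of f
        x = f(x)
        trajectory.append(x)

    print(trajectory)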
Dynamical systems theory
Dynamical systems theory is an area of mathematics used to describe the behavior of complex dynamical systems, usually by employing differential equations or difference equations. When differential equations are employed, the theory is called continuous dynamical systems. From a physical point of view, continuous dynamical systems is a generalization of classical mechanics, a generalization where the equations of motion are postulated directly and are not constrained to be Euler–Lagrange equations of a least action principle.
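As a continuous-time counterpart, a sketch of a pendulum's equation of motion θ'' = −(g/L)·sin θ advanced with explicit Euler steps; the parameters and step size are arbitrary illustrative choices:

    import math

    # Continuous dynamical system: theta'' = -(g/L) * sin(theta),
    # rewritten as a first-order system and advanced with explicit Euler steps.
    g, L, dt = 9.81, 1.0, 0.001
    theta, omega = 0.5, 0.0          # initial angle (rad) and angular velocity

    for _ in range(5000):            # simulate 5 seconds of motion
        theta, omega = theta + dt * omega, omega + dt * (-(g / L) * math.sin(theta))

    print(round(theta, 4), round(omega, 4))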