Floating Point Error Ieee 754
Contents |
by David Goldberg, published in the March, 1991 issue of Computing Surveys. Copyright 1991, Association for Computing Machinery, Inc., reprinted by permission. Abstract Floating-point arithmetic is considered
Floating Point Python
an esoteric subject by many people. This is rather surprising because floating-point floating point number example is ubiquitous in computer systems. Almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point
Floating Point Arithmetic Examples
accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow. This paper floating point calculator presents a tutorial on those aspects of floating-point that have a direct impact on designers of computer systems. It begins with background on floating-point representation and rounding error, continues with a discussion of the IEEE floating-point standard, and concludes with numerous examples of how computer builders can better support floating-point. Categories and Subject Descriptors: (Primary) C.0 [Computer Systems Organization]: General double floating point -- instruction set design; D.3.4 [Programming Languages]: Processors -- compilers, optimization; G.1.0 [Numerical Analysis]: General -- computer arithmetic, error analysis, numerical algorithms (Secondary) D.2.1 [Software Engineering]: Requirements/Specifications -- languages; D.3.4 Programming Languages]: Formal Definitions and Theory -- semantics; D.4.1 Operating Systems]: Process Management -- synchronization. General Terms: Algorithms, Design, Languages Additional Key Words and Phrases: Denormalized number, exception, floating-point, floating-point standard, gradual underflow, guard digit, NaN, overflow, relative error, rounding error, rounding mode, ulp, underflow. Introduction Builders of computer systems often need information about floating-point arithmetic. There are, however, remarkably few sources of detailed information about it. One of the few books on the subject, Floating-Point Computation by Pat Sterbenz, is long out of print. This paper is a tutorial on those aspects of floating-point arithmetic (floating-point hereafter) that have a direct connection to systems building. It consists of three loosely connected parts. The first section, Rounding Error, discusses the implications of using different rounding strategies for the basic operations of addition, subtraction, multiplication and division. It also contains background information on the two methods of measuring rounding error,
here for a quick overview of the site Help Center Detailed answers to any questions you might have Meta Discuss the workings and policies of
Floating Point Numbers Explained
this site About Us Learn more about Stack Overflow the company Business floating point rounding error Learn more about hiring developers or posting ads with us Stack Overflow Questions Jobs Documentation Tags Users Badges Ask
Floating Point Representation
Question x Dismiss Join the Stack Overflow Community Stack Overflow is a community of 4.7 million programmers, just like you, helping each other. Join them; it only takes a minute: Sign up https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html IEEE-754 floating-point precision: How much error is allowed? up vote 3 down vote favorite 1 I'm working on porting the sqrt function (for 64-bit doubles) from fdlibm to a model-checker tool I'm using at the moment (cbmc). As part of my doings, I read a lot about the ieee-754 standard, but I think I didn't understand the guarantees of precision for the basic http://stackoverflow.com/questions/4317988/ieee-754-floating-point-precision-how-much-error-is-allowed operations (incl. sqrt). Testing my port of fdlibm's sqrt, I got the following calculation with sqrt on a 64-bit double: sqrt(1977061516825203605555216616167125005658976571589721139027150498657494589171970335387417823661417383745964289845929120708819092392090053015474001800648403714048.0) = 44464159913633855548904943164666890000299422761159637702558734139742800916250624.0 (this case broke a simple post-condition in my test regarding precision; I'm not sure anymore if this post-condition is possible with IEEE-754) For a comparison, several multi-precision tools calculated something like: sqrt(1977061516825203605555216616167125005658976571589721139027150498657494589171970335387417823661417383745964289845929120708819092392090053015474001800648403714048.0) =44464159913633852501611468455197640079591886932526256694498106717014555047373210.truncated One can see, that the 17-th number from the left is different, meaning an error like: 3047293474709469249920707535828633381008060627422728245868877413.0 Question 1: Is this huge amount of error allowed? The standard is saying that every basic operation (+,-,*,/,sqrt) should be within 0.5 ulps, meaning that it should be equal to a mathematically exact result rounded to the nearest fp-representation (wiki is saying that some libraries only guarantees 1 ulp, but that isn't that important at the moment). Question 2: Does that mean, that every basic operation should have an error < 2.220446e-16 with 64-bit doubles (machine-epsilon)? I did calculate the same with a x86-32 linux system (glibc / eglibc) and got the same result like that obtained with fdlibm, which let me think that: a: I did something wrong (but how: printf would b
base 2 (binary) fractions. For example, the decimal fraction 0.125 has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction 0.001 has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only real difference being that the https://docs.python.org/2/tutorial/floatingpoint.html first is written in base 10 fractional notation, and the second in base 2. Unfortunately, most decimal fractions cannot be represented exactly as binary fractions. A consequence is that, in general, the decimal floating-point numbers you enter are only approximated by the binary floating-point numbers actually stored in the machine. The problem is easier to understand at first in base 10. Consider the fraction 1/3. You can approximate that as a base 10 fraction: 0.3 or, floating point better, 0.33 or, better, 0.333 and so on. No matter how many digits you're willing to write down, the result will never be exactly 1/3, but will be an increasingly better approximation of 1/3. In the same way, no matter how many base 2 digits you're willing to use, the decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base 2, 1/10 is the infinitely repeating fraction 0.0001100110011001100110011001100110011001100110011... Stop at any finite number floating point number of bits, and you get an approximation. On a typical machine running Python, there are 53 bits of precision available for a Python float, so the value stored internally when you enter the decimal number 0.1 is the binary fraction 0.00011001100110011001100110011001100110011001100110011010 which is close to, but not exactly equal to, 1/10. It's easy to forget that the stored value is an approximation to the original decimal fraction, because of the way that floats are displayed at the interpreter prompt. Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine. If Python were to print the true decimal value of the binary approximation stored for 0.1, it would have to display >>> 0.1 0.1000000000000000055511151231257827021181583404541015625 That is more digits than most people find useful, so Python keeps the number of digits manageable by displaying a rounded value instead >>> 0.1 0.1 It's important to realize that this is, in a real sense, an illusion: the value in the machine is not exactly 1/10, you're simply rounding the display of the true machine value. This fact becomes apparent as soon as you try to do arithmetic with these values >>> 0.1 + 0.2 0.30000000000000004 Note that this is in the very nature of binary floating-point: this is not a bug in Python, and it is not a bug in your code ei