Why Floating Point Numbers May Lose Precision

Last reviewed: February 28, 1997
Article ID: Q145889
1.00 1.50 | 1.00 2.00 4.00
WINDOWS   | WINDOWS NT
kbusage kbprg

The information in this article applies to:

  • Microsoft Visual C++ for Windows, versions 1.0 and 1.5x
  • Microsoft Visual C++, 32-bit Edition, versions 1.0, 2.x, and 4.0

The information in this article is included in the documentation starting with Visual C++ 5.0. Look there for future revisions.

SUMMARY

Floating point decimal values generally do not have an exact binary representation. This is a side effect of how the CPU represents floating point data. For this reason, you may experience some loss of precision, and some floating point operations may produce unexpected results.

This behavior is the end result of one of the following:

  • The binary representation of the decimal number may not be exact.

    -or-

  • There is a type mismatch between the numbers used (for example, mixing float and double).

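To resolve the behavior, most programmers either test that the value falls within a small range around the value they need (greater than a lower bound and less than an upper bound) instead of testing for exact equality, or they obtain and use a Binary Coded Decimal (BCD) library that maintains the precision.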

MORE INFORMATION

Microsoft uses the IEEE floating-point format for floating-point number representation. For information about the actual binary representation of floating-point values in a CPU and how precision and accuracy are affected in a floating-point calculation, please see the following articles in the Microsoft Knowledge Base:

   ARTICLE-ID: Q36068
   TITLE     : IEEE Floating-Point Representation and MS Languages

   ARTICLE-ID: Q125056
   TITLE     : Precision and Accuracy in Floating-Point Calculations

Sample Code

/* Compile options needed: none
   The value of c is printed with a precision of 10 decimal places and
   with the printf default of 6 (which rounds the value) to show the
   difference.
*/

#include <stdio.h>
#define EPSILON 0.0001   // Define your own tolerance
#define FLOAT_EQ(x,v) ((((v) - EPSILON) < (x)) && ((x) < ((v) + EPSILON)))

int main(void)
{
 float a,b,c;
 a=1.345f;
 b=1.123f;
 c=a+b;                          // c is a float; 2.468 below is a double

//if (FLOAT_EQ(c, 2.468))        // Remove comment for correct result

 if (c == 2.468)                 // Comment out this line for correct result
  printf("They are equal\n");
 else
  printf("They are not equal. The value of c is %.10f or %f.\n",c,c);
 return 0;
}

The Output Result

They are not equal. The value of c is 2.4679999352 or 2.468000.

For EPSILON, you may use the constants FLT_EPSILON, defined for float as 1.192092896e-07F, or DBL_EPSILON, defined for double as 2.2204460492503131e-016. You need to include float.h for these constants. These constants are defined as the smallest positive number x, such that x+1.0 is not equal to 1.0. Because this is a very small number, and the spacing between adjacent representable values grows with their magnitude, it is advisable that you employ a user-defined tolerance for calculations involving very large numbers. Please see the "C Floating-Point Constants" article in the Microsoft Development Library for other predefined constants.





© 1998 Microsoft Corporation. All rights reserved.